Re: [PATCH] Refactor *_abbrev_convert() functions

Поиск
Список
Период
Сортировка
От Aleksander Alekseev
Тема Re: [PATCH] Refactor *_abbrev_convert() functions
Дата
Msg-id CAJ7c6TPZVKk1Dn7LWRatBi=UFYfEPmfGGFYT8wQzDj7y8oJ_CQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Refactor *_abbrev_convert() functions  (John Naylor <johncnaylorls@gmail.com>)
Список pgsql-hackers
Hi John,

Many thanks for your feedback!

> There's more we can do here. Above the stanzas changed in the patch
> there is this, at least for varlena/bytea:
>
> hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
>                       Min(len, PG_CACHE_LINE_SIZE)));
>
> This makes no sense to me: hash_any() calls hash_bytes() and turns the
> result into a Datum, and then we just get it right back out of the
> Datum again.

I see similar patterns in files other than bytea.c and varlena.c.
Implemented as a separate patch.

> if (len > PG_CACHE_LINE_SIZE)
>   hash ^= DatumGetUInt32(hash_uint32((uint32) len));
>
> Similar here, but instead of hash_bytes_uint32(), we may as well use
> mumurhash32().

Ditto.

It's worth noting that timetz_hash() uses a similar pattern but I
choose to keep it as is. Changing it will break backward
compatibility. Also it breaks our tests.

Using hash_bytes_uint32() / hash_bytes_uint32_extended() directly in
timetz_hash() / timetz_hash_extended() is safe though. Proposed as a
separate patch.

> addHyperLogLog says "typically generated using
> hash_any()", but that function takes a uint32, not a Datum, so that
> comment should probably be changed. hash_bytes() is global, so we can
> use it directly.

Makes sense. Implemented as an independent patch.

-- 
Best regards,
Aleksander Alekseev

Вложения

В списке pgsql-hackers по дате отправления: