Re: [HACKERS] [POC] hash partitioning

Поиск
Список
Период
Сортировка
От amul sul
Тема Re: [HACKERS] [POC] hash partitioning
Дата
Msg-id CAAJ_b96n4Q_tNLqVG78pm=CG9JiDNxmUiK=BWODzoegPJkW8+w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [POC] hash partitioning  (Andres Freund <andres@anarazel.de>)
Ответы Re: [HACKERS] [POC] hash partitioning  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Fri, Oct 13, 2017 at 3:00 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-10-12 17:27:52 -0400, Robert Haas wrote:
>> On Thu, Oct 12, 2017 at 4:20 PM, Andres Freund <andres@anarazel.de> wrote:
>> >> In other words, it's not utterly fixed in stone --- we invented
>> >> --load-via-partition-root primarily to cope with circumstances that
>> >> could change hash values --- but we sure don't want to be changing it
>> >> with any regularity, or for a less-than-excellent reason.
>> >
>> > Yea, that's what I expected. It'd probably good for somebody to run
>> > smhasher or such on the output of the combine function (or even better,
>> > on both the 32 and 64 bit variants) in that case.
>>
>> Not sure how that test suite works exactly, but presumably the
>> characteristics in practice will depend the behavior of the hash
>> functions used as input the combine function - so the behavior could
>> be good for an (int, int) key but bad for a (text, date) key, or
>> whatever.
>
> I don't think that's true, unless you have really bad hash functions on
> the the component hashes. A hash combine function can't really do
> anything about badly hashed input, what you want is that it doesn't
> *reduce* the quality of the hash by combining.
>

I tried to get suggested SMHasher[1] test result for the hash_combine
for 32-bit and 64-bit version.

SMHasher works on hash keys of the form {0}, {0,1}, {0,1,2}... up to
N=255, using 256-N as the seed, for the hash_combine testing we
needed two hash value to be combined, for that, I've generated 64
and 128-bit hash using cityhash functions[2] for the given smhasher
key then split in two part to test 32-bit and 64-bit hash_combine
function respectively.   Attached patch for SMHasher code changes &
output of 32-bit and 64-bit hash_combine testing. Note that I have
skipped speed test this test which is irrelevant here.

By referring other hash function results [3], we can see that hash_combine
test results are not bad either.

Do let me know if current testing is not good enough or if you want me to do
more testing, thanks.

1] https://github.com/aappleby/smhasher
2] https://github.com/aappleby/smhasher/blob/master/src/CityTest.cpp
3] https://github.com/rurban/smhasher/tree/master/doc

Regards,
Amul

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: [HACKERS] Remove secondary checkpoint
Следующее
От: Ivan Kartyshov
Дата:
Сообщение: Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept