Re: General purpose hashing func in pgbench
From | Fabien COELHO |
---|---|
Subject | Re: General purpose hashing func in pgbench |
Date | |
Msg-id | alpine.DEB.2.20.1712240846440.22976@lancre |
In response to | Re: General purpose hashing func in pgbench (Ildar Musin <i.musin@postgrespro.ru>) |
Responses |
Re: General purpose hashing func in pgbench
|
List | pgsql-hackers |
Hello Ildar,

> Actually the "bad" one appears in YCSB.

Fine. Then it must be kept, whatever its quality.

> But if we should choose the only one I would stick to murmur too given
> it provides better results while having similar computational
> complexity.

No. Keep both as there is a justification for the bad one. Just make "hash()" default to a good one.

>> One implementation put constants in defines, the other one uses "const
>> int". [...]
>
> [...] it looked ugly and hard to read (IMHO), like:
>
>   k *= MURMUR2_M;
>   k ^= k >> MURMUR2_R;
>   k *= MURMUR2_M;
>   result ^= k;
>   result *= MURMUR2_M;

Yep. The ugliness is significantly linked to the choice of name. With MM2_MUL and MM2_ROT ISTM that it is more readable:

 > k *= MM2_MUL;
 > k ^= k >> MM2_ROT;
 > k *= MM2_MUL;
 > result ^= k;
 > result *= MM2_MUL;

> [...] So I'd better leave it the way it is. Actually I was thinking to
> do the same to fnv1a too : )

I think that the implementation style should be homogeneous, so I'd suggest at least to stick to one style.

I noticed from the source of all human knowledge (aka Wikipedia:-) that there seems to be a murmur3 successor. Have you considered it? One good reason to skip it would be that the implementation is long and complex. I'm not sure about an 8-byte input simplified version.

Just a question: Have you looked at SipHash24?

 https://en.wikipedia.org/wiki/SipHash

The interesting point is that it can use a key and seems somehow cryptographically secure, for a similar cost. However, how to decide on or control the key is unclear.

--
Fabien.
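
For illustration only, here is a minimal sketch of what a homogeneous style could look like, with both hashes taking their constants from defines and operating on a single 8-byte input. This is not the code from the patch under review. The function names sketch_fnv1a and sketch_murmur2 are hypothetical, the constants come from the public FNV-1a (64-bit) and MurmurHash64A definitions, and folding a seed into the FNV offset basis is an assumption of this sketch rather than part of standard FNV-1a.

```c
#include <stdint.h>

/* Constants from the public FNV-1a (64-bit) and MurmurHash64A definitions. */
#define FNV_OFFSET_BASIS UINT64_C(0xcbf29ce484222325)
#define FNV_PRIME        UINT64_C(0x100000001b3)
#define MM2_MUL          UINT64_C(0xc6a4a7935bd1e995)
#define MM2_ROT          47

/*
 * FNV-1a over the 8 bytes of val, least significant byte first.
 * Assumption: the seed is simply XORed into the offset basis.
 */
uint64_t
sketch_fnv1a(uint64_t val, uint64_t seed)
{
    uint64_t result = FNV_OFFSET_BASIS ^ seed;
    int      i;

    for (i = 0; i < 8; i++)
    {
        result ^= val & 0xff;   /* mix in one byte */
        result *= FNV_PRIME;
        val >>= 8;
    }
    return result;
}

/*
 * MurmurHash64A specialized to exactly one 8-byte block:
 * there is no tail to handle, only the block mix and the final avalanche.
 */
uint64_t
sketch_murmur2(uint64_t val, uint64_t seed)
{
    uint64_t result = seed ^ (8 * MM2_MUL);    /* 8 is the input length in bytes */
    uint64_t k = val;

    /* mix the single block */
    k *= MM2_MUL;
    k ^= k >> MM2_ROT;
    k *= MM2_MUL;
    result ^= k;
    result *= MM2_MUL;

    /* final avalanche */
    result ^= result >> MM2_ROT;
    result *= MM2_MUL;
    result ^= result >> MM2_ROT;

    return result;
}
```

Both functions share one signature, so a default "hash()" could simply dispatch to the murmur variant while the FNV-1a one stays available for YCSB-style workloads.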