Re: Combining hash values

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Combining hash values
Дата
Msg-id 21850.1469982852@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Combining hash values  (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы Re: Combining hash values  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-hackers
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> 2.  I suspect that this algorithm for combining hashes is weak, and
> could amplify weaknesses in the hash functions feeding it.

Very possibly, but ...

> Compare
> Boost's hash_combine, which does some more work before XORing:

>     seed ^= hash_value(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);

I can't help being reminded of Knuth's story about he tried to invent
the world's best random number generator, and was disappointed when
it almost immediately converged to a short repeating sequence.  If
there's any actual theoretical basis to the above, I'd be interested
to see it.  But as-is, the use of addition rather than XOR looks fishy,
and the way the old seed is shifted around looks more likely to result
in cancellation than anything else.

> That constant approximates the golden ratio (as a fraction of the 32
> bit hash space), and it also appears in our hash_any and hash_uint32
> functions.

I think it's just somebody's idea of a magic random number.  Your link
> [3] http://cstheory.stackexchange.com/questions/9639/how-did-knuth-derive-a
provides some reasons to think it might be a good choice for a very
specific application, but this is not that application --- in particular,
it doesn't involve multiplying by that constant.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Hash indexes and effective_cache_size in CREATE INDEX documentation
Следующее
От: Shay Rojansky
Дата:
Сообщение: Re: Slowness of extended protocol