Re: Hash support for arrays

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Hash support for arrays
Дата
Msg-id 10122.1288492081@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Hash support for arrays  (Greg Stark <gsstark@mit.edu>)
Список pgsql-hackers
Greg Stark <gsstark@mit.edu> writes:
> I suppose you could use crc where our interface does allow incremental use.

Hm, that's a thought.  I'm not sure though whether CRC really has the
behavior you want from a hash function, in particular that all the bits
are independent (a/k/a taking N low-order bits is a reasonable way to
cut the hash down to a suitable bucket index).  Anybody know?

> I'm not really familiar enough with the math to know whether CRC is
> any better than your XOR mixing. I suspect they're pretty similar. I
> suppose if you have arrays of various lengths containing repetitions
> of a single value than the non-CRC would produce a hash of 0 for all
> arrays with a length that's a multiple of 32. I'm not sure what CRC
> would do.

Some quick experimenting suggests that you get -1 from an array of 32 of
the same value, then 0 from 64 of the same, etc.  This doesn't bother me
particularly though.  There are always going to be some situations that
produce degenerate hash values.

> Oh, one question that occurred to me you didn't mention including the
> bounds of the array or the null bitmap in the hash function. I suppose
> it's correct with or without them (or in the case of the null bitmap
> another way to put it is the choice of the hash value to represent
> null is arbitrary).

Yeah.  I have it coded to treat a null element as if it had a hash value
of zero.  I don't see a great need to consider the array bounds per se.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Stark
Дата:
Сообщение: Re: Hash support for arrays
Следующее
От: Tom Lane
Дата:
Сообщение: Maximum function call nesting depth for regression tests