On Thu, Aug 3, 2017 at 6:08 PM, Andres Freund <andres@anarazel.de> wrote:
>> That's another way to go, but it requires inventing a way to thread
>> the IV through the hash opclass interface.
>
> Only if we really want to do it really well :P. Using a hash_combine()
> like
>
> /*
> * Combine two hash values, resulting in another hash value, with decent bit
> * mixing.
> *
> * Similar to boost's hash_combine().
> */
> static inline uint32
> hash_combine(uint32 a, uint32 b)
> {
> a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
> return a;
> }
>
> between hash(IV) and the hashfunction should do the trick (the IV needs
> to hashed once, otherwise the bit mix is bad).
That seems pretty lame, although it's sufficient to solve the
immediate problem, and I have to admit to a certain predilection for
things that solve the immediate problem without creating lots of
additional work.
>> We could:
>>
>> - Invent a new hash_partition AM that doesn't really make indexes but
>> supplies hash functions for hash partitioning.
>> - Add a new, optional support function 2 to the hash AM that takes a
>> value of the type *and* an IV as an argument.
>> - Something else.
>
> Not arguing for it, but one option could also have pg_type.hash*
> function(s).
True. That is a bit less configurable because you can't then have
multiple functions for the same type. Going through the opclass
interface means you can have hash_portable_ops and hash_fast_ops and
let people choose. But this would be easy to implement and enough for
most people in practice.
> One thing that I think might be advisable to think about is that we're
> atm stuck with a relatively bad hash function for hash indexes (and hash
> joins/aggs), and we should probably evolve it at some point. At the same
> time there's currently people out there relying on the current hash
> functions remaining stable.
That to me is a bit of a separate problem.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company