On Tue, Mar 14, 2017 at 2:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> It's become pretty clear to me that there are a bunch of other things
>> about hash indexes which are not exactly great, the worst of which is
>> the way they grow by DOUBLING IN SIZE.
>
> Uh, what? Growth should happen one bucket-split at a time.

Technically, the buckets are created one at a time, but because of the
way hashm_spares works, the primary bucket pages for all buckets from
2^N to 2^{N+1}-1 must be physically consecutive. See
_hash_alloc_buckets.
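
To make the effect concrete, here is a rough Python model of that
allocation pattern. It is only a sketch of the behavior described above,
not the actual C code: the function names are made up for illustration,
and it assumes the index starts with a power-of-two number of buckets
and ignores overflow and bitmap pages entirely.

```python
def pages_allocated_on_split(new_bucket: int) -> int:
    """Primary bucket pages reserved when bucket number `new_bucket` is created.

    Simplified model (an assumption, not PostgreSQL's real logic): when
    new_bucket is a power of two, 2^N, it opens the group 2^N .. 2^(N+1)-1,
    and all 2^N primary pages for that group are reserved at once so that
    they are physically consecutive. Otherwise the page already exists.
    """
    if new_bucket < 1:
        raise ValueError("new_bucket must be >= 1")
    # A positive integer is a power of two iff it has exactly one bit set.
    if new_bucket & (new_bucket - 1) == 0:
        return new_bucket
    return 0


def simulate(final_buckets: int, initial_buckets: int = 2):
    """Return (buckets in use, primary pages allocated) after each split.

    `initial_buckets` is assumed to be a power of two.
    """
    allocated = initial_buckets
    history = []
    for new_bucket in range(initial_buckets, final_buckets):
        allocated += pages_allocated_on_split(new_bucket)
        history.append((new_bucket + 1, allocated))
    return history


if __name__ == "__main__":
    for in_use, allocated in simulate(9):
        print(f"buckets in use: {in_use:2d}  primary pages allocated: {allocated:2d}")
```

In this model, the split that creates bucket 8 takes the allocated
primary pages from 8 to 16 even though only 9 buckets are in use, which
is the sense in which the index grows by doubling.
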
>> Other things that are not so great:
>
>> - no multi-column support
>> - no amcanunique support
>> - every insert dirties the metapage
>> - splitting is generally too aggressive; very few overflow pages are
>> ever created unless you have piles of duplicates
>
> Yeah. It's a bit hard to see how to add multi-column support unless you
> give up the property of allowing queries on a subset of the index columns.
> Lack of amcanunique seems like mostly a round-tuit shortage. The other
> two are implementation deficiencies that maybe we can remedy someday.
>
> Another thing I'd like to see is support for 64-bit hash values.
>
> But all of these were mainly blocked by people not wanting to sink effort
> into hash indexes as long as they were unusable for production due to lack
> of WAL support. So this is a huge step forward.

Agreed, on all points.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company