All,
(dropping pgsql-advocacy off the cc list)
On 04/30/2014 10:11 AM, Robert Haas wrote:
> On Wed, Apr 30, 2014 at 12:54 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Wed, Apr 30, 2014 at 5:55 AM, ktm@rice.edu <ktm@rice.edu> wrote:
>>> I do not think that CPU costs matter as much as the O(1) probe to
>>> get a result value specifically for very large indexes/tables where
>>> even caching the upper levels of a B-tree index would kill your
>>> working set in memory. I know, I know, everyone has so much memory
>>> and can just buy more... but this does matter.
>>
>> Have you actually investigated how little memory it takes to store the
>> inner pages? It's typically less than 1% of the entire index. AFAIK,
>> hash indexes are not used much in any other system. I think MySQL has
>> them, and SQL Server 2014 has special in-memory hash table indexes for
>> in memory tables, but that's all I can find on Google.
Hash indexes are more important for MySQL because its tables are
index-organized.
> I thought the theoretical advantage of hash indexes wasn't that they
> were smaller but that you avoided a central contention point (the
> btree root).
Yes. And being smaller isn't insignificant; think of billion-row tables
with fairly random access over the whole table. Also, *theoretically*,
a hash index could avoid the rebalancing issues which cause our btree
indexes to become bloated and need a REINDEX with certain update patterns.
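
For anyone who wants to sanity-check the "less than 1%" figure quoted above, here is a rough back-of-envelope sketch. It assumes an idealized, fully packed B-tree and a fanout of about 200 entries per page; both are illustrative assumptions of mine, not measurements from a real PostgreSQL index, and the function names are made up for the example.

```python
import math

def btree_page_counts(num_rows, fanout):
    """Return (leaf_pages, inner_pages) for an idealized, fully packed B-tree.

    fanout is the assumed number of entries per page (hypothetical value;
    real indexes vary with key width, fillfactor and bloat).
    """
    leaf_pages = math.ceil(num_rows / fanout)
    inner_pages = 0
    level = leaf_pages
    # Each inner level needs one entry per page of the level below it,
    # so keep dividing by the fanout until only the root remains.
    while level > 1:
        level = math.ceil(level / fanout)
        inner_pages += level
    return leaf_pages, inner_pages

def inner_fraction(num_rows, fanout):
    """Fraction of all index pages that are inner (non-leaf) pages."""
    leaf, inner = btree_page_counts(num_rows, fanout)
    return inner / (leaf + inner)

if __name__ == "__main__":
    leaf, inner = btree_page_counts(1_000_000_000, 200)
    print(leaf, inner, inner_fraction(1_000_000_000, 200))
```

Under those assumptions the inner pages come to roughly 1/fanout of the index, which is about half a percent for a billion-row table. A smaller fanout (wide keys) or a bloated index would shift the numbers.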
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com