Обсуждение: Hash index use presently(?) discouraged since 2005: revive or bury it?

Поиск
Список
Период
Сортировка

Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Stefan Keller
Дата:
The doc at http://www.postgresql.org/docs/current/interactive/indexes-types.html
says: "Caution: Hash index operations are not presently WAL-logged, so
hash indexes might need to be rebuilt with REINDEX after a database
crash. They are also not replicated over streaming or file-based
replication. For these reasons, hash index use is presently
discouraged."

I found a thread here
http://archives.postgresql.org/pgsql-general/2005-05/msg00370.php
about <<"Hash index" vs. "b-tree index" (PostgreSQL 8.0)>> mentioning
some issues, like they
* are not faster than B-trees even for = comparisons
* aren't WAL safe
* have poor concurrency (require coarser locks),
* are significantly slower than creating a b+-tree index.

In fact these statements seem to rely on the docs back in version 7.2
(see http://www.postgresql.org/docs/7.2/static/indexes-types.html )

Has this been verified on a recent release? I can't believe that hash
performs so bad over all these points. Theory tells me otherwise and
http://en.wikipedia.org/wiki/Hash_table seems to be a success.

Are there any plans to give hash index another chance (or to bury it
with a reason)?

Stefan

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Peter Geoghegan
Дата:
On 14 September 2011 00:04, Stefan Keller <sfkeller@gmail.com> wrote:
> Has this been verified on a recent release? I can't believe that hash
> performs so bad over all these points. Theory tells me otherwise and
> http://en.wikipedia.org/wiki/Hash_table seems to be a success.

Hash indexes have been improved since 2005 - their performance was
improved quite a bit in 9.0. Here's a more recent analysis:

http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Tom Lane
Дата:
Peter Geoghegan <peter@2ndquadrant.com> writes:
> On 14 September 2011 00:04, Stefan Keller <sfkeller@gmail.com> wrote:
>> Has this been verified on a recent release? I can't believe that hash
>> performs so bad over all these points. Theory tells me otherwise and
>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.

> Hash indexes have been improved since 2005 - their performance was
> improved quite a bit in 9.0. Here's a more recent analysis:

> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/

Yeah, looking into the git logs shows several separate major changes
committed during 2008, including storing only the hash code not the
whole indexed value (big win on wide values, and lets you index values
larger than one index page, which doesn't work in btree).  I think that
the current state of affairs is still what depesz said, namely that
there might be cases where they'd be a win to use, except the lack of
WAL support is a killer.  I imagine somebody will step up and do that
eventually.

The big picture though is that we're not going to remove hash indexes,
even if they're nearly useless in themselves, because hash index
opclasses provide the foundation for the system's knowledge of how to
do the datatype-specific hashing needed for hash joins and hash
aggregation.  And those things *are* big wins, even if hash indexes
themselves never become so.

            regards, tom lane

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Stefan Keller
Дата:
2011/9/14 Tom Lane <tgl@sss.pgh.pa.us>:
> (...) I think that
> the current state of affairs is still what depesz said, namely that
> there might be cases where they'd be a win to use, except the lack of
> WAL support is a killer.  I imagine somebody will step up and do that
> eventually.

Should I open a ticket?

Stefan

2011/9/14 Tom Lane <tgl@sss.pgh.pa.us>:
> Peter Geoghegan <peter@2ndquadrant.com> writes:
>> On 14 September 2011 00:04, Stefan Keller <sfkeller@gmail.com> wrote:
>>> Has this been verified on a recent release? I can't believe that hash
>>> performs so bad over all these points. Theory tells me otherwise and
>>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.
>
>> Hash indexes have been improved since 2005 - their performance was
>> improved quite a bit in 9.0. Here's a more recent analysis:
>
>> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/
>
> Yeah, looking into the git logs shows several separate major changes
> committed during 2008, including storing only the hash code not the
> whole indexed value (big win on wide values, and lets you index values
> larger than one index page, which doesn't work in btree).  I think that
> the current state of affairs is still what depesz said, namely that
> there might be cases where they'd be a win to use, except the lack of
> WAL support is a killer.  I imagine somebody will step up and do that
> eventually.
>
> The big picture though is that we're not going to remove hash indexes,
> even if they're nearly useless in themselves, because hash index
> opclasses provide the foundation for the system's knowledge of how to
> do the datatype-specific hashing needed for hash joins and hash
> aggregation.  And those things *are* big wins, even if hash indexes
> themselves never become so.
>
>                        regards, tom lane
>

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Heikki Linnakangas
Дата:
On 14.09.2011 09:39, Stefan Keller wrote:
> Should I open a ticket?

What ticket? With whom?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Heikki Linnakangas
Дата:
On 14.09.2011 03:24, Tom Lane wrote:
> The big picture though is that we're not going to remove hash indexes,
> even if they're nearly useless in themselves, because hash index
> opclasses provide the foundation for the system's knowledge of how to
> do the datatype-specific hashing needed for hash joins and hash
> aggregation.  And those things *are* big wins, even if hash indexes
> themselves never become so.

We could drop the hash indexam code but keep the opclasses etc. I'm not
sure that would gain us, though.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Leonardo Francalanci
Дата:
>>  Hash indexes have been improved since 2005 - their performance was

>>  improved quite a bit in 9.0. Here's a more recent analysis:
>
>>  http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/
>
> The big picture though is that we're not going to remove hash indexes,
> even if they're nearly useless in themselves

Well, if they provide 3x the performance of btree indexes on index creation,
I wouldn't call them "useless" just because they're not logged or they can't
be unique. In fact, I think the docs should specify that in index creation
they're actually better than btree (if, in fact, they are and the "depesz" test
is not a corner case).

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Stefan Keller
Дата:
2011/9/14 Tom Lane <tgl@sss.pgh.pa.us> writes:
> (...) I think that
> the current state of affairs is still what depesz said, namely that
> there might be cases where they'd be a win to use, except the lack of
> WAL support is a killer.  I imagine somebody will step up and do that
> eventually.

How much of work (in man days) do you estimate would this mean for
someone who can program but has to learn PG internals first?

Stefan

2011/9/14 Tom Lane <tgl@sss.pgh.pa.us>:
> Peter Geoghegan <peter@2ndquadrant.com> writes:
>> On 14 September 2011 00:04, Stefan Keller <sfkeller@gmail.com> wrote:
>>> Has this been verified on a recent release? I can't believe that hash
>>> performs so bad over all these points. Theory tells me otherwise and
>>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.
>
>> Hash indexes have been improved since 2005 - their performance was
>> improved quite a bit in 9.0. Here's a more recent analysis:
>
>> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/
>
> Yeah, looking into the git logs shows several separate major changes
> committed during 2008, including storing only the hash code not the
> whole indexed value (big win on wide values, and lets you index values
> larger than one index page, which doesn't work in btree).  I think that
> the current state of affairs is still what depesz said, namely that
> there might be cases where they'd be a win to use, except the lack of
> WAL support is a killer.  I imagine somebody will step up and do that
> eventually.
>
> The big picture though is that we're not going to remove hash indexes,
> even if they're nearly useless in themselves, because hash index
> opclasses provide the foundation for the system's knowledge of how to
> do the datatype-specific hashing needed for hash joins and hash
> aggregation.  And those things *are* big wins, even if hash indexes
> themselves never become so.
>
>                        regards, tom lane
>

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Tom Lane
Дата:
Stefan Keller <sfkeller@gmail.com> writes:
> 2011/9/14 Tom Lane <tgl@sss.pgh.pa.us> writes:
>> (...) I think that
>> the current state of affairs is still what depesz said, namely that
>> there might be cases where they'd be a win to use, except the lack of
>> WAL support is a killer.  I imagine somebody will step up and do that
>> eventually.

> How much of work (in man days) do you estimate would this mean for
> someone who can program but has to learn PG internals first?

No idea ... I'm probably not the best person to estimate how long it
would take someone to get up to speed on the relevant internals,
but I'm sure that would take longer than actually doing the work.
While it's not a trivial task, I think it fits the definition of
"a small matter of programming": a piece of code whose anticipated
length is significantly greater than its complexity.

            regards, tom lane

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Wed, Sep 14, 2011 at 4:03 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 14.09.2011 03:24, Tom Lane wrote:
>>
>> The big picture though is that we're not going to remove hash indexes,
>> even if they're nearly useless in themselves, because hash index
>> opclasses provide the foundation for the system's knowledge of how to
>> do the datatype-specific hashing needed for hash joins and hash
>> aggregation.  And those things *are* big wins, even if hash indexes
>> themselves never become so.
>
> We could drop the hash indexam code but keep the opclasses etc. I'm not sure
> that would gain us, though.

HM, what if you junked the current hash indexam, and just implemented
a wrapper over btree so that the 'hash index' was just short hand for
hashing the value into a standard index?

merlin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Claudio Freire
Дата:
On Thu, Sep 15, 2011 at 5:00 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
> HM, what if you junked the current hash indexam, and just implemented
> a wrapper over btree so that the 'hash index' was just short hand for
> hashing the value into a standard index?

I'm doing this (only by hand, indexing on hash(blah)) on an
application, and it works wonders.
But... it's kinda not a hash table. It's still O(log N).

However, it would be a *very* useful feature if it can be made
transparent for applications.
And I would prefer it over a true hashtable, in the end. Hashes are,
in fact, O(N) worst case.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Thu, Sep 15, 2011 at 3:28 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, Sep 15, 2011 at 5:00 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>> HM, what if you junked the current hash indexam, and just implemented
>> a wrapper over btree so that the 'hash index' was just short hand for
>> hashing the value into a standard index?
>
> I'm doing this (only by hand, indexing on hash(blah)) on an
> application, and it works wonders.
> But... it's kinda not a hash table. It's still O(log N).
>
> However, it would be a *very* useful feature if it can be made
> transparent for applications.
> And I would prefer it over a true hashtable, in the end. Hashes are,
> in fact, O(N) worst case.

yeah -- in my (limited) testing, with int4 or int8, btree handily
meets or beats hash on creation, access time, and index size.  this
suggests to me that a separate index implementation for hash isn't
buying us much -- the integer btree code is highly optimized.

merlin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Tom Lane
Дата:
Merlin Moncure <mmoncure@gmail.com> writes:
> HM, what if you junked the current hash indexam, and just implemented
> a wrapper over btree so that the 'hash index' was just short hand for
> hashing the value into a standard index?

Surely creating such a wrapper would be *more* work than adding WAL
support to the hash AM.

I'm not entirely following this eagerness to junk that AM, anyway.
We've put a lot of sweat into it over the years, in the hopes that
it would eventually be good for something.  It's on the edge of
being good for something now, and there's doubtless room for more
improvements, so why are the knives out?

            regards, tom lane

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Claudio Freire
Дата:
On Fri, Sep 16, 2011 at 12:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not entirely following this eagerness to junk that AM, anyway.
> We've put a lot of sweat into it over the years, in the hopes that
> it would eventually be good for something.  It's on the edge of
> being good for something now, and there's doubtless room for more
> improvements, so why are the knives out?

There are lots of issues with hash tables. I'm not going to enumerate
them, you probably know them better than I.

But the reality of it is that btree on hash values is a very useful
index kind. It has stable performance, is very compact, and supports
any type, even user defined, if the hashing function can be
customized. They're better than hashes in all my tests for my use case
(which is indexing over a column with long strings), and the only
drawback is that they have to be supported by application code.

If PG could have a native implementation, I'm sure lots of people
would find it useful.

Maybe scrapping the hash index is too much, but support for indexing
with btree with hashing would be very neat.

I read recently hash removed the need to store the value in the index,
so I don't expect such a wrapper to be difficult to write.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Thu, Sep 15, 2011 at 5:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> HM, what if you junked the current hash indexam, and just implemented
>> a wrapper over btree so that the 'hash index' was just short hand for
>> hashing the value into a standard index?
>
> Surely creating such a wrapper would be *more* work than adding WAL
> support to the hash AM.
>
> I'm not entirely following this eagerness to junk that AM, anyway.
> We've put a lot of sweat into it over the years, in the hopes that
> it would eventually be good for something.  It's on the edge of
> being good for something now, and there's doubtless room for more
> improvements, so why are the knives out?

Just making an observation.  Some quick tests follow the sig.  I think
the point here is that something has to be done -- now that the
replication train has left the station, not having WAL has gone from
quirky annoyance to major functionality failure.  The recent hash work
has brought down index build times to a reasonable level, but they are
still getting beat by btree.  Of course, it's not quite apples to
apples (I figure the timings will even out to an extent once you add
in the hashing wrapper), but I can't help but wonder if the btree code
is a better driver and consolidating code is a good thing.

merlin

postgres=# create table v as select generate_series(1,10000000) as x;
SELECT 10000000
Time: 16750.961 ms
postgres=# create index on v(x);
CREATE INDEX
Time: 15158.637 ms
postgres=# create index on v using hash(x);
CREATE INDEX
Time: 22505.468 ms

postgres=# \d v
       Table "public.v"
 Column |  Type   | Modifiers
--------+---------+-----------
 x      | integer |
Indexes:
    "v_x_idx" btree (x)
    "v_x_idx1" hash (x)

postgres=# select relname, relfilenode from pg_class where relname like 'v_x%';
 relname  | relfilenode
----------+-------------
 v_x_idx  |       16525
 v_x_idx1 |       16526
(2 rows)

c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16525
09/15/2011  07:46 PM       224,641,024 16525

c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16526
09/15/2011  07:49 PM       268,451,840 16526

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Claudio Freire
Дата:
On Fri, Sep 16, 2011 at 3:00 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
> c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16525
> 09/15/2011  07:46 PM       224,641,024 16525
>
> c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16526
> 09/15/2011  07:49 PM       268,451,840 16526

That's not surprising at all.
Hashes need to be bigger to avoid collisions.

What's more interesting than index creation, is index maintainance and
access costs.
In my experience, btree beats hash.
I haven't tried with 9.1, though.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Thu, Sep 15, 2011 at 8:00 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Thu, Sep 15, 2011 at 5:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Merlin Moncure <mmoncure@gmail.com> writes:
>>> HM, what if you junked the current hash indexam, and just implemented
>>> a wrapper over btree so that the 'hash index' was just short hand for
>>> hashing the value into a standard index?
>>
>> Surely creating such a wrapper would be *more* work than adding WAL
>> support to the hash AM.
>>
>> I'm not entirely following this eagerness to junk that AM, anyway.
>> We've put a lot of sweat into it over the years, in the hopes that
>> it would eventually be good for something.  It's on the edge of
>> being good for something now, and there's doubtless room for more
>> improvements, so why are the knives out?
>
> Just making an observation.  Some quick tests follow the sig.  I think
> the point here is that something has to be done -- now that the
> replication train has left the station, not having WAL has gone from
> quirky annoyance to major functionality failure.  The recent hash work
> has brought down index build times to a reasonable level, but they are
> still getting beat by btree.  Of course, it's not quite apples to
> apples (I figure the timings will even out to an extent once you add
> in the hashing wrapper), but I can't help but wonder if the btree code
> is a better driver and consolidating code is a good thing.

odd: I was pondering Claudio's point about maintenance of hash indexes
vs btree and decided to do some more tests.  Something very strange is
happening:  I decided to compare 'update v set x=x+1', historically
one of postgres's weaker points, on the 10M table indexed hash vs
btree.  The btree typically muddled through in about 5 minutes:

postgres=# update v set x=x+1;
UPDATE 10000000
Time: 302341.466 ms

recreating the table and hash index, I ran it again. 47 minutes into
the query, I started to get curious and noticed that cpu time disk
usage are hovering near zero but nothing is blocked. disk space on the
index is *slowly* increasing, now at:
09/15/2011  11:08 PM       541,024,256 16531

this is obviously, uh, windows, and I don't have good tools set up.
I'll repeat the test when i get into the office this morning.  thought
I'd point it out.  hm, cancelled the query, dropped the index, and
re-ran the update without any indexes, everything is normal -- the
update whistled through the table in about 35 seconds.

hm, recreated the hash index and looking more carefully now.  still
seeing the lousy behavior.  this definitely bears more investigation
(this is 9.0.4)...

merlin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Stefan Keller
Дата:
2011/9/16 Tom Lane <tgl@sss.pgh.pa.us>:
> I'm not entirely following this eagerness to junk that AM, anyway.
> We've put a lot of sweat into it over the years, in the hopes that
> it would eventually be good for something.  It's on the edge of
> being good for something now, and there's doubtless room for more
> improvements, so why are the knives out?

No knives from my side. Sorry for the exaggerated subject title.
I'm also in favor for an enhanced hash index for cases where only "="
tests are processed and where only few inserts/deletes will occur.

Stefan

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Tomas Vondra
Дата:
Dne 15.9.2011 01:40, Tom Lane napsal(a):
> Stefan Keller <sfkeller@gmail.com> writes:
>> 2011/9/14 Tom Lane <tgl@sss.pgh.pa.us> writes:
>>> (...) I think that
>>> the current state of affairs is still what depesz said, namely that
>>> there might be cases where they'd be a win to use, except the lack of
>>> WAL support is a killer.  I imagine somebody will step up and do that
>>> eventually.
>
>> How much of work (in man days) do you estimate would this mean for
>> someone who can program but has to learn PG internals first?
>
> No idea ... I'm probably not the best person to estimate how long it
> would take someone to get up to speed on the relevant internals,
> but I'm sure that would take longer than actually doing the work.
> While it's not a trivial task, I think it fits the definition of
> "a small matter of programming": a piece of code whose anticipated
> length is significantly greater than its complexity.

We've been asked by a local university for PostgreSQL-related topics of
theses and seminary works, so I'm wondering if adding WAL support to
hash indexes would be a good fit ...

Can anyone estimate if a student with reasonable C-knowledge a can
implement this in about 4 months? It seems like a reasonable amount of
research and work to me.

Tomas

PS: All interesting thesis ideas are welcome.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Jeff Janes
Дата:
On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> On 14 September 2011 00:04, Stefan Keller <sfkeller@gmail.com> wrote:
>> Has this been verified on a recent release? I can't believe that hash
>> performs so bad over all these points. Theory tells me otherwise and
>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.

My understanding is that a huge amount of work has gone into making
btree what it is in
PG, and not nearly as much work has gone into making hash indexes what
they could be.


> Hash indexes have been improved since 2005 - their performance was
> improved quite a bit in 9.0. Here's a more recent analysis:
>
> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/

They are 3 time faster to build.  But if you rip the WAL logging out
of btree, how much faster would those get?

Also, that link doesn't address concurrency of selects at all, only of inserts.

Cheers,

Jeff

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Jeff Janes
Дата:
On Wed, Sep 14, 2011 at 4:03 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> 2011/9/14 Tom Lane <tgl@sss.pgh.pa.us> writes:
>> (...) I think that
>> the current state of affairs is still what depesz said, namely that
>> there might be cases where they'd be a win to use, except the lack of
>> WAL support is a killer.  I imagine somebody will step up and do that
>> eventually.


I think that adding WAL to hash indexes without first
addressing the heavy-weight locking issue would be a mistake.
Even if the WAL was fixed, the bad performance under
concurrent selects would still make it at best a narrow
niche thing.  And fixing the locking *after* WAL is in place would
probably be very much harder than the other order.

> How much of work (in man days) do you estimate would this mean for
> someone who can program but has to learn PG internals first?

Are these 8 hour days? :)

I think it could be several months at least and a high likelihood of not
getting done at all.  (depending on how good the person is, of course).

They would first have to become familiar with the WAL log and replay system.
This is quite hairy.

Also, I think that adding WAL to hash indexes would be even harder than for
other indexes, because of bucket-splits, which can touch an arbitrarily high
number of pages.  At least, that is what lead me to give up on this last time
I looked into it seriously.

I think that if it were not for those bucket-splits, it would be
relatively easy
to get rid of both the heavy-weight locks, and to add WAL logging.  I had
considered proposing making hash indexes have a fixed number of buckets
specified at creation time.  That would be an unfortunate limitation, but I
think it would be a net win over non-WAL, non-highly-concurrent hash indexes
that currently exist.  Especially if the number of buckets could be enlarged
by concurrently making a new, larger, index and then dropping the old one.
I've only thought about proposing it, because currently I don't have time
to do anything on it if the proposal was well received.


Cheers,

Jeff

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Sat, Sep 17, 2011 at 4:48 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
>> On 14 September 2011 00:04, Stefan Keller <sfkeller@gmail.com> wrote:
>>> Has this been verified on a recent release? I can't believe that hash
>>> performs so bad over all these points. Theory tells me otherwise and
>>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.
>
> My understanding is that a huge amount of work has gone into making
> btree what it is in
> PG, and not nearly as much work has gone into making hash indexes what
> they could be.
>
>
>> Hash indexes have been improved since 2005 - their performance was
>> improved quite a bit in 9.0. Here's a more recent analysis:
>>
>> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/
>
> They are 3 time faster to build.  But if you rip the WAL logging out
> of btree, how much faster would those get?
>
> Also, that link doesn't address concurrency of selects at all, only of inserts.

Of course hash indexes are faster to build than varlen string indexes
:-).  I use natural keys 50-80% of the time and hash indexing would
remove some of the pain in cases where I don't need ordering and range
operations. In fact, if they are made to properly support wal logging
and uniqueness, I imagine they should supplant btree in a broad range
of cases, so much so that it would be awful nice to be able to have
syntax to choose hash for primary keys and unique constraints.

@ Jeff:
>I think that adding WAL to hash indexes without first
addressing the heavy-weight locking issue would be a mistake.
Even if the WAL was fixed, the bad performance under
concurrent selects would still make it at best a narrow
niche thing.  And fixing the locking *after* WAL is in place would
probably be very much harder than the other order.

Here again, I think that any proposed improvement in the current hash
index code should be measured against wrapping a btree index.   You
get wal logging and high concurrency for free if you decide to do
that.

merlin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Jeff Janes
Дата:
On Thu, Sep 15, 2011 at 9:20 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
> odd: I was pondering Claudio's point about maintenance of hash indexes
> vs btree and decided to do some more tests.  Something very strange is
> happening:  I decided to compare 'update v set x=x+1', historically
> one of postgres's weaker points, on the 10M table indexed hash vs
> btree.  The btree typically muddled through in about 5 minutes:
>
> postgres=# update v set x=x+1;
> UPDATE 10000000
> Time: 302341.466 ms
>
> recreating the table and hash index, I ran it again. 47 minutes into
> the query, I started to get curious and noticed that cpu time disk
> usage are hovering near zero but nothing is blocked. disk space on the
> index is *slowly* increasing, now at:
> 09/15/2011  11:08 PM       541,024,256 16531

The way you created the table, I think the rows are basically going to be
in order in the table, which means the btree index accesses are going to
visit the same block over and over again before going to the next block.

With hash indexes, it will jump all over the place.

Cheers,

Jeff

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Stefan Keller
Дата:
Merlin and Jeff,

General remark again:It's hard for me to imagine that btree is
superior for all the issues mentioned before. I still believe in hash
index for primary keys and certain unique constraints where you need
equality search and don't need ordering or range search.

2011/9/17 Jeff Janes <jeff.janes@gmail.com>:
(...)
> Also, that link doesn't address concurrency of selects at all, only of inserts.

How would (or did) you test and benchmark concurrency of inserts and selects?
Use pgbench with own config for a blackbox test?

2011/9/18 Merlin Moncure <mmoncure@gmail.com>:
> Here again, I think that any proposed improvement in the current hash
> index code should be measured against wrapping a btree index.   You
> get wal logging and high concurrency for free if you decide to do
> that.

As I understand, this would be an enhancement of btree. That's ok for
btree but not really exploiting all advantages of a separate hash
index, would'nt it?

Stefan

2011/9/18 Merlin Moncure <mmoncure@gmail.com>:
> On Sat, Sep 17, 2011 at 4:48 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
>>> On 14 September 2011 00:04, Stefan Keller <sfkeller@gmail.com> wrote:
>>>> Has this been verified on a recent release? I can't believe that hash
>>>> performs so bad over all these points. Theory tells me otherwise and
>>>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.
>>
>> My understanding is that a huge amount of work has gone into making
>> btree what it is in
>> PG, and not nearly as much work has gone into making hash indexes what
>> they could be.
>>
>>
>>> Hash indexes have been improved since 2005 - their performance was
>>> improved quite a bit in 9.0. Here's a more recent analysis:
>>>
>>> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/
>>
>> They are 3 time faster to build.  But if you rip the WAL logging out
>> of btree, how much faster would those get?
>>
>> Also, that link doesn't address concurrency of selects at all, only of inserts.
>
> Of course hash indexes are faster to build than varlen string indexes
> :-).  I use natural keys 50-80% of the time and hash indexing would
> remove some of the pain in cases where I don't need ordering and range
> operations. In fact, if they are made to properly support wal logging
> and uniqueness, I imagine they should supplant btree in a broad range
> of cases, so much so that it would be awful nice to be able to have
> syntax to choose hash for primary keys and unique constraints.
>
> @ Jeff:
>>I think that adding WAL to hash indexes without first
> addressing the heavy-weight locking issue would be a mistake.
> Even if the WAL was fixed, the bad performance under
> concurrent selects would still make it at best a narrow
> niche thing.  And fixing the locking *after* WAL is in place would
> probably be very much harder than the other order.
>
> Here again, I think that any proposed improvement in the current hash
> index code should be measured against wrapping a btree index.   You
> get wal logging and high concurrency for free if you decide to do
> that.
>
> merlin
>

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Jeff Janes
Дата:
On Sun, Sep 18, 2011 at 7:59 AM, Stefan Keller <sfkeller@gmail.com> wrote:
> Merlin and Jeff,
>
> General remark again:It's hard for me to imagine that btree is
> superior for all the issues mentioned before. I still believe in hash
> index for primary keys and certain unique constraints where you need
> equality search and don't need ordering or range search.

I certainly agree that hash indexes as implemented in PG
could be improved on.

>
> 2011/9/17 Jeff Janes <jeff.janes@gmail.com>:
> (...)
>> Also, that link doesn't address concurrency of selects at all, only of inserts.
>
> How would (or did) you test and benchmark concurrency of inserts and selects?
> Use pgbench with own config for a blackbox test?

I used pgbench -S -M prepared with a scale that fits in
shared_buffers, at various concurrencies.  drop the pgbench_accounts
primary key and build alternatingly a regular index and a hash index
between runs.  (If the scale doesn't fit in memory, that should
advantage the hash, but I haven't seen a large one--I've never tested
a size at which the branch blocks don't fit in memory)

It is hard to see real differences here because the index is not the
main bottleneck, regardless of which index is in use (at least on only
8 CPUs, with enough CPUs you might be able to drive the hash index
over the edge)

I also used a custom pgbench option -P, (a patch adding which feature
I was supposed to submit to this commitfest, but missed).  Cuts down
on a lot of the network chatter, locking, and other overhead and so
simulates an index look up occurring on the inside of a nested loop.

The performance at -c 1 was roughly equal, but at -c 8 the hash was
three times slower.

I don't recall concurrent testing inserts (not for speed, anyway).

Cheers,

Jeff

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Robert Klemme
Дата:
On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> I'm simply referring to literature (like the intro Ramakrishnan & Gehrke).
> I just know that Oracle an Mysql actually do have them too and use it
> without those current implementation specific restrictions in
> Postgres.

Where exactly do you take that from that Oracle has hash indexes?  I
can't seem to find them:
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#sthref293

Are you mixing this up with hash partitioning?
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#sthref443

Or am I missing something?

> IMHO by design Hash Index (e.g. linear hashing) work best when:
> 1. only equal (=) tests are used (on whole values)
> 2. columns (key values) have very-high cardinality
>
> And ideally but not necessarily when index values do not change and
> number of rows are known ahead of time (avoiding O(N) worst case - but
> there are approaches to chaining with dynamic resizing).
>
> I just collected this to encourage ourselves that enhancing hash
> indexes could be worthwhile.

There's still the locking issue Jeff mentioned.  At least every time a
table resize occurs the whole index must be locked.  Or is there a
more fine granular locking strategy which I am overlooking?

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Thomas Kellerer
Дата:
Robert Klemme, 19.09.2011 13:13:
> On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller<sfkeller@gmail.com>  wrote:
>> I'm simply referring to literature (like the intro Ramakrishnan&  Gehrke).
>> I just know that Oracle an Mysql actually do have them too and use it
>> without those current implementation specific restrictions in
>> Postgres.
>
> Where exactly do you take that from that Oracle has hash indexes?  I
> can't seem to find them:
> http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#sthref293
>
> Are you mixing this up with hash partitioning?
> http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#sthref443
>
> Or am I missing something?

Maybe he was referring to a hash cluster:
http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/statements_5001.htm

This is a storage option where you can store related rows (e.g. in a parent/child relationship) in the same phyiscal
databaseblock based on a hash value. That enables the databse to read parent and child rows with just a single IO. 

In the background Oracle probably has something like a hash index to support that.

Thomas

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller <sfkeller@gmail.com> wrote:
> Merlin and Jeff,
>
> General remark again:It's hard for me to imagine that btree is
> superior for all the issues mentioned before. I still believe in hash
> index for primary keys and certain unique constraints where you need
> equality search and don't need ordering or range search.

It is -- but please understand I'm talking about int32 tree vs hash.
Hashing as a technique of course is absolutely going to cream btree
for all kinds of data because of the advantages of working with
decomposed data -- we are all taught that in comp-sci 101 :-).  The
debate here is not about the advantages of hashing, but the specific
implementation of the hash index used.

Postgres's hash index implementation used to be pretty horrible -- it
stored the pre-hashed datum in the index which, while making it easier
to do certain things,  made it horribly slow, and, for all intents and
purposes, useless.  Somewhat recently,a lot of work was put in to fix
that -- the index now packs the hash code only which made it
competitive with btree and superior for larger keys.  However, certain
technical limitations like lack of WAL logging and uniqueness hold
hash indexing back from being used like it really should be.  In cases
where I really *do* need hash indexing, I do it in userland.

create table foo
(
  a_long_field text;
);
create index on foo(hash(a_long_field));

select * from foo where hash(a_long_field) = hash(some_value) and
a_long_field = some_value;

This technique works fine -- the main disadvantage is that enforcing
uniqueness is a PITA but since the standard index doesn't support it
either it's no great loss.  I also have the option of getting
'uniqueness' and being able to skip the equality operation if I
sacrifice some performance and choose a strong digest.  Until the hash
index issues are worked out, I submit that this remains the go-to
method to do this.

Now, my point here is that I've noticed that even with the latest
optimizations btree seems to still be superior to the hash indexing by
most metrics, so that:
create table foo
(
  an_int_field int;
  a_long_field text;
);

create index on foo(an_int_field);
create index on foo using hash(a_long_field);

On performance grounds alone, the btree index seems to be (from my
very limited testing) a better bet.  So I'm conjecturing that the
current hash implementation should be replaced with a formalization of
the userland technique shown above -- when you request a hash index,
the database will silently hash your field and weave it into a btree.
It's a hybrid: a hashed btree.  To truly demonstrate if the technique
was effective though, it would have to be coded up -- it's only fair
to compare if the btree based hash is also double checking the value
in the heap which the standard hash index must do.

The other way to go of course is to try and fix up the existing hash
index code -- add wal logging, etc. In theory, a customized hash
structure should be able to beat btree all day long which argues to
continue in this direction.

@ jeff:
>The way you created the table, I think the rows are basically going to be
in order in the table, which means the btree index accesses are going to
visit the same block over and over again before going to the next block.

This does not explain the behavior.  Yeah -- it may take longer but
your computer should not be sitting idle during create index
operations :-).  Unfortunately, I was not able to reproduce it on
linux.  I have to bite the bullet and get the mingw up if I want to
try and diagnose -- perhaps it is stalling in the semop calls.

merlin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Robert Klemme
Дата:
On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller <sfkeller@gmail.com> wrote:
>> Merlin and Jeff,
>>
>> General remark again:It's hard for me to imagine that btree is
>> superior for all the issues mentioned before. I still believe in hash
>> index for primary keys and certain unique constraints where you need
>> equality search and don't need ordering or range search.
>
> It is -- but please understand I'm talking about int32 tree vs hash.
> Hashing as a technique of course is absolutely going to cream btree
> for all kinds of data because of the advantages of working with
> decomposed data -- we are all taught that in comp-sci 101 :-).  The
> debate here is not about the advantages of hashing, but the specific
> implementation of the hash index used.
>
> Postgres's hash index implementation used to be pretty horrible -- it
> stored the pre-hashed datum in the index which, while making it easier
> to do certain things,  made it horribly slow, and, for all intents and
> purposes, useless.  Somewhat recently,a lot of work was put in to fix
> that -- the index now packs the hash code only which made it
> competitive with btree and superior for larger keys.  However, certain
> technical limitations like lack of WAL logging and uniqueness hold
> hash indexing back from being used like it really should be.  In cases
> where I really *do* need hash indexing, I do it in userland.
>
> create table foo
> (
>  a_long_field text;
> );
> create index on foo(hash(a_long_field));
>
> select * from foo where hash(a_long_field) = hash(some_value) and
> a_long_field = some_value;
>
> This technique works fine -- the main disadvantage is that enforcing
> uniqueness is a PITA but since the standard index doesn't support it
> either it's no great loss.  I also have the option of getting
> 'uniqueness' and being able to skip the equality operation if I
> sacrifice some performance and choose a strong digest.  Until the hash
> index issues are worked out, I submit that this remains the go-to
> method to do this.

Is this approach (storing the hash code in a btree) really faster than
a regular btree index on "a_long_field"?  And if so, for which kind of
data and load?

> Now, my point here is that I've noticed that even with the latest
> optimizations btree seems to still be superior to the hash indexing by
> most metrics, so that:
> create table foo
> (
>  an_int_field int;
>  a_long_field text;
> );
>
> create index on foo(an_int_field);
> create index on foo using hash(a_long_field);
>
> On performance grounds alone, the btree index seems to be (from my
> very limited testing) a better bet.  So I'm conjecturing that the
> current hash implementation should be replaced with a formalization of
> the userland technique shown above -- when you request a hash index,
> the database will silently hash your field and weave it into a btree.
> It's a hybrid: a hashed btree.

I'd rather call it a "btreefied hash" because you are storing a hash
but in a btree structure. :-) But that's a detail.  What I find
worrying is that then there is a certain level of obscuring the real
nature since "create index ... using hash" is not exactly creating a
hash table.

> To truly demonstrate if the technique
> was effective though, it would have to be coded up -- it's only fair
> to compare if the btree based hash is also double checking the value
> in the heap which the standard hash index must do.

Right.

> The other way to go of course is to try and fix up the existing hash
> index code -- add wal logging, etc. In theory, a customized hash
> structure should be able to beat btree all day long which argues to
> continue in this direction.

I still haven't seen a solution to locking when a hash table needs
resizing.  All hashing algorithms I can think of at the moment would
require a lock on the whole beast during the resize which makes this
type of index impractical for certain loads (heavy updating).

One solution would be to apply partitioning to the hash table itself
(e.g. have four partitions for the two least significant bits or 16
for the four lest significant bits) and treat them independently.  How
that would interact with WAL I have no idea though.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Vitalii Tymchyshyn
Дата:
19.09.11 18:19, Robert Klemme написав(ла):
> On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure<mmoncure@gmail.com>  wrote:
>>
>> Postgres's hash index implementation used to be pretty horrible -- it
>> stored the pre-hashed datum in the index which, while making it easier
>> to do certain things,  made it horribly slow, and, for all intents and
>> purposes, useless.  Somewhat recently,a lot of work was put in to fix
>> that -- the index now packs the hash code only which made it
>> competitive with btree and superior for larger keys.  However, certain
>> technical limitations like lack of WAL logging and uniqueness hold
>> hash indexing back from being used like it really should be.  In cases
>> where I really *do* need hash indexing, I do it in userland.
>>
>> create table foo
>> (
>>   a_long_field text;
>> );
>> create index on foo(hash(a_long_field));
>>
>> select * from foo where hash(a_long_field) = hash(some_value) and
>> a_long_field = some_value;
>>
>> This technique works fine -- the main disadvantage is that enforcing
>> uniqueness is a PITA but since the standard index doesn't support it
>> either it's no great loss.  I also have the option of getting
>> 'uniqueness' and being able to skip the equality operation if I
>> sacrifice some performance and choose a strong digest.  Until the hash
>> index issues are worked out, I submit that this remains the go-to
>> method to do this.
> Is this approach (storing the hash code in a btree) really faster than
> a regular btree index on "a_long_field"?  And if so, for which kind of
> data and load?

Actually sometimes the field in [potentially] so long, you can't use
regular b-tree because it won't fit in the page. Say, it is "text" type.
If you will create regular index, you will actually limit column value
size to few KB. I am using md5(text) indexes in this case coupled with
rather ugly queries (see above). Native support would be nice.

Best regards, Vitalii Tymchyshyn.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Tom Lane
Дата:
Robert Klemme <shortcutter@googlemail.com> writes:
> I still haven't seen a solution to locking when a hash table needs
> resizing.  All hashing algorithms I can think of at the moment would
> require a lock on the whole beast during the resize which makes this
> type of index impractical for certain loads (heavy updating).

That seems rather drastically overstated.  The existing hash index code
only needs to hold an index-scope lock for a short interval while it
updates the bucket mapping information after a bucket split.  All other
locks are per-bucket or per-page.  The conflicting share-lockers of the
index-wide lock also only need to hold it for a short time, not for
their whole indexscans.  So that doesn't seem to me to be materially
worse than the locking situation for a btree, where we also sometimes
need exclusive lock on the btree root page, thus blocking incoming
indexscans for a short time.

            regards, tom lane

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Vitalii Tymchyshyn
Дата:
19.09.11 18:19, Robert Klemme написав(ла):
>
> I still haven't seen a solution to locking when a hash table needs
> resizing.  All hashing algorithms I can think of at the moment would
> require a lock on the whole beast during the resize which makes this
> type of index impractical for certain loads (heavy updating).
Sorry for the second reply, I should have not start writing until I've
read all your post. Anyway.
Do you need read lock? I'd say readers could use "old" copy of hash
table up until the moment new bigger copy is ready. This will simply
look like the update is not started yet, which AFAIK is OK for MVCC.
Yep, all the writers will wait.

Another option could be to start background build of larger hash - for
some time your performance will be degraded since you are writing to two
indexes instead of one plus second one is rebuilding, but I'd say low
latency solution is possible here.

One more: I don't see actually why can't you have a "rolling" expand of
hash table. I will try to describe it correct me if I am wrong:
1) The algorithm I am talking about will take "n" bits from hash code to
for hash table. So, during expansion it will double number of baskets.
2) Say, we are going from 2^n = n1 to 2^(n+1) = n2 = n1 * 2 baskets.
Each new pair of baskets will take data from single source basket
depending on the value of new hash bit used. E.g. if n were 2, we've had
4 baskets and new table will have 8 baskets. Everything from old basket
#1 will go into new baskets #2 and #3 depending on hash value.
3) So, we can have a counter on number of baskets processed. Any
operation on any lower numbered basket will go to "new set". Any
operation on any higher numbered basket will go to "old set". Any
operation on currently converting basket will block until conversion is
done.

P.S. Sorry for a lot of possibly dumb thoughts, I don't know why I've
got such a though stream on this topic :)

Best regards, Vitalii Tymchyshyn.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Claudio Freire
Дата:
On Mon, Sep 19, 2011 at 12:54 PM, Vitalii Tymchyshyn <tivv00@gmail.com> wrote:
> 19.09.11 18:19, Robert Klemme написав(ла):
>>
>> I still haven't seen a solution to locking when a hash table needs
>> resizing.  All hashing algorithms I can think of at the moment would
>> require a lock on the whole beast during the resize which makes this
>> type of index impractical for certain loads (heavy updating).
>
> Sorry for the second reply, I should have not start writing until I've read
> all your post. Anyway.
> Do you need read lock? I'd say readers could use "old" copy of hash table up
> until the moment new bigger copy is ready. This will simply look like the
> update is not started yet, which AFAIK is OK for MVCC.
> Yep, all the writers will wait.

All this would get solved if there's no automatic hash index resizing.

DBAs would have to recreate (possibly concurrently) the hash to make it bigger.

Still, hash has lots of issues. I'm not sure how the hash is
implemented in PG, but usually, for low collision rates pseudorandom
walks are used to traverse collision chains. But pseudorandom
collision chains mean random I/O which is awful for a DB. Those
techniques have not been designed to work with secondary memory.

So, they would have to be adapted to working with secondary memory,
and that means a lot of R&D. It's not impossible, it's just a lot of
work.

I subscribe to the idea that, *in the meanwhile*, without scrapping
the hash index and in parallel to improving it, an option for
transparently-hashed btrees would be valuable.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Jeff Janes
Дата:
On Mon, Sep 19, 2011 at 8:19 AM, Robert Klemme
<shortcutter@googlemail.com> wrote:
> On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
>> The other way to go of course is to try and fix up the existing hash
>> index code -- add wal logging, etc. In theory, a customized hash
>> structure should be able to beat btree all day long which argues to
>> continue in this direction.
>
> I still haven't seen a solution to locking when a hash table needs
> resizing.  All hashing algorithms I can think of at the moment would
> require a lock on the whole beast during the resize which makes this
> type of index impractical for certain loads (heavy updating).

The current implementation doesn't EX lock the whole beast during
resizing, except a brief one at the beginning (and maybe at the end?)
of the split.  It does EX lock the entire bucket being split for the
duration of the split, though.  The main problem that I see is that
due to the potential for deadlocks, the locks involved have to be
heavy-weight.  Which means the shared locks which non-splitting
processes have to use to block against those EX locks have to be
heavy-weight too, and getting even shared heavy-weight locks means
exclusive light-weight locks, which means contention.

One way out would be to have a special process (probably vacuum) do
all the resizing/splitting, rather than having regular backends doing
it.  It should be possible to make this dead-lock free.

Another would be to make hash indexes only support bit-map scans.
Then the scanning hash-code would never have a need to pass control
back to the executor while still holding a bucket lock.

Another would be to make the current position of the hash index scan
be "refindable" if a split should occur during the scan, so that you
don't need to inhibit splits during a scan of the same bucket.  This
would probably be easy if there were no overflow pages.   But the
overflow pages get shuffled in with each other and with the main
bucket page during a split.  It would take quite some gymnastics to
get around that.



Cheers,

Jeff

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Mon, Sep 19, 2011 at 10:19 AM, Robert Klemme
<shortcutter@googlemail.com> wrote:
> On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller <sfkeller@gmail.com> wrote:
>>> Merlin and Jeff,
>>>
>>> General remark again:It's hard for me to imagine that btree is
>>> superior for all the issues mentioned before. I still believe in hash
>>> index for primary keys and certain unique constraints where you need
>>> equality search and don't need ordering or range search.
>>
>> It is -- but please understand I'm talking about int32 tree vs hash.
>> Hashing as a technique of course is absolutely going to cream btree
>> for all kinds of data because of the advantages of working with
>> decomposed data -- we are all taught that in comp-sci 101 :-).  The
>> debate here is not about the advantages of hashing, but the specific
>> implementation of the hash index used.
>>
>> Postgres's hash index implementation used to be pretty horrible -- it
>> stored the pre-hashed datum in the index which, while making it easier
>> to do certain things,  made it horribly slow, and, for all intents and
>> purposes, useless.  Somewhat recently,a lot of work was put in to fix
>> that -- the index now packs the hash code only which made it
>> competitive with btree and superior for larger keys.  However, certain
>> technical limitations like lack of WAL logging and uniqueness hold
>> hash indexing back from being used like it really should be.  In cases
>> where I really *do* need hash indexing, I do it in userland.
>>
>> create table foo
>> (
>>  a_long_field text;
>> );
>> create index on foo(hash(a_long_field));
>>
>> select * from foo where hash(a_long_field) = hash(some_value) and
>> a_long_field = some_value;
>>
>> This technique works fine -- the main disadvantage is that enforcing
>> uniqueness is a PITA but since the standard index doesn't support it
>> either it's no great loss.  I also have the option of getting
>> 'uniqueness' and being able to skip the equality operation if I
>> sacrifice some performance and choose a strong digest.  Until the hash
>> index issues are worked out, I submit that this remains the go-to
>> method to do this.
>
> Is this approach (storing the hash code in a btree) really faster than
> a regular btree index on "a_long_field"?  And if so, for which kind of
> data and load?
>
>> Now, my point here is that I've noticed that even with the latest
>> optimizations btree seems to still be superior to the hash indexing by
>> most metrics, so that:
>> create table foo
>> (
>>  an_int_field int;
>>  a_long_field text;
>> );
>>
>> create index on foo(an_int_field);
>> create index on foo using hash(a_long_field);
>>
>> On performance grounds alone, the btree index seems to be (from my
>> very limited testing) a better bet.  So I'm conjecturing that the
>> current hash implementation should be replaced with a formalization of
>> the userland technique shown above -- when you request a hash index,
>> the database will silently hash your field and weave it into a btree.
>> It's a hybrid: a hashed btree.
>
> I'd rather call it a "btreefied hash" because you are storing a hash
> but in a btree structure. :-) But that's a detail.  What I find
> worrying is that then there is a certain level of obscuring the real
> nature since "create index ... using hash" is not exactly creating a
> hash table.
>
>> To truly demonstrate if the technique
>> was effective though, it would have to be coded up -- it's only fair
>> to compare if the btree based hash is also double checking the value
>> in the heap which the standard hash index must do.
>
> Right.

so, i was curious, and decided to do some more performance testing. I
created a table like this:

create table foo as select r, r::text || 'acbdefghijklmnop' as b from
generate_series(1,10000000) r;
create index on foo(r);
create index on foo using hash(b);

to simulate the btree/hash hybrid, I cut a pgbench file like so (btree.sql):
\setrandom x 1 100000
select * from foo where r = :x  and b=:x::text || 'acbdefghijklmnop'

and this: for the standard hash (hash.sql):
\setrandom x 1 100000
select * from foo where b=:x::text || 'acbdefghijklmnop'

pgbench -n -c2 -T 10 -f hash.sql etc

On my test machine, hybrid hash eeks out a slight win on in-cache tests:
merlin@mmoncure-ubuntu:~$ pgbench -n -c2 -T 100 -f btree.sql -p 5490
tps = 3250.793656 (excluding connections establishing)

vs
merlin@mmoncure-ubuntu:~$ pgbench -n -c2 -T 100 -f hash.sql -p 5490
tps = 3081.730400 (excluding connections establishing)


To make the test into i/o bound, I change the setrandom from 100000 to
10000000; this produced some unexpected results. The hash index is
pulling about double the tps (~80 vs ~ 40) over the hybrid version.
Well, unless my methodology is wrong, it's unfair to claim btree is
beating hash in 'all cases'. hm.

merlin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Claudio Freire
Дата:
On Mon, Sep 19, 2011 at 3:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> To make the test into i/o bound, I change the setrandom from 100000 to
> 10000000; this produced some unexpected results. The hash index is
> pulling about double the tps (~80 vs ~ 40) over the hybrid version.
> Well, unless my methodology is wrong, it's unfair to claim btree is
> beating hash in 'all cases'. hm.

Is this only selects?
Hash performs badly with updates, IIRC.
I haven't tried in a long while, though.

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Merlin Moncure
Дата:
On Mon, Sep 19, 2011 at 1:53 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Mon, Sep 19, 2011 at 3:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> To make the test into i/o bound, I change the setrandom from 100000 to
>> 10000000; this produced some unexpected results. The hash index is
>> pulling about double the tps (~80 vs ~ 40) over the hybrid version.
>> Well, unless my methodology is wrong, it's unfair to claim btree is
>> beating hash in 'all cases'. hm.
>
> Is this only selects?
> Hash performs badly with updates, IIRC.
> I haven't tried in a long while, though.

just selects.  update test is also very interesting -- the only test I
did  for for updates is 'update foo set x=x+1' which was a win for
btree (20-30% faster typically).  perhaps this isn't algorithm induced
though -- lack of wal logging could actually hurt time to commit
because it deserializes i/o.

merlin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

От
Tom Lane
Дата:
Merlin Moncure <mmoncure@gmail.com> writes:
> just selects.  update test is also very interesting -- the only test I
> did  for for updates is 'update foo set x=x+1' which was a win for
> btree (20-30% faster typically).  perhaps this isn't algorithm induced
> though -- lack of wal logging could actually hurt time to commit
> because it deserializes i/o.

In 9.1+, you could remove WAL from the comparison by doing the tests on
an unlogged table.

            regards, tom lane