Обсуждение: HASH: Out of overflow pages. Out of luck
Hi Everybody! I'm sorry I dropped out for so long -- was switching jobs and was on the verge of deportation for a while. Still not entirely back to normal, but can raise my head and look around. The first thing I discovered in the current version (7.2.1) -- as well as in 7.1.3 -- seems to be an old problem with the hash am. It's clustering too much. The data I'm trying to index is of the type text, so it uses hashvarlena(). The values are something like this: REC00014 REC00015 .... REC02463 RBS00001 RBS00002 .... RBS11021 .... It's like several very long sequences with multiple gaps. With the existing hashvarlena(), I can index about 2M rows, but not much more -- it comes back with the message about the overflow pages. hashvarlena() responds to this input in a piecewise linear fashion. That is, the succession 'REC00010' .. 'REC00019' results in a linear order 1346868191 .. 1346868200. The next ten hash values will also be sequential, but in a different range. The quality of the hash function can be a factor here, but probably not the major one. I was able to jack my limit up to over 3.7M rows by reversing the order of bytes in hashvarlena() -- I made the pointer go down instead of up. That spread the hash values more sparsely, but it failed with the same message when I fed it with more than 4M rows. I saw Tom answer a similar question a year ago, by saying that the hash access method is poorly supported and that there is no advantage to using it. I am not sure about the former, but the latter is not entirely true: we saw at least 20% gain in performance when we switched from btree to hash, and my boss considers 20% a big enough improvement. Besides, he knows the database theory and he is a long-time BerkelyDB user, and in his world, hash is greatly superior to btree, so he is wondering why are the postgres implementations so close. Besides, it's a tough challenge to explain it to a Libertarian that he'd better not do something. I guess we can make such people happy by either fixing hash, or by making btree very much worse -- whichever is easier :) Seriously, though, if anybody wants to look at the problem, I saved the set of keys that caused it as http://home.xnet.com/~selkovjr/tmp/prot.gz Also, I heard that other database systems have special access methods for sequences, that address this issue, and I heard as well that without an option to use an arbitrary hash function instead of a single built-in, such problems are bound to happen with particular data sets. How true is that? If a better access method exists, what is it? Thank you, Gene
> I saw Tom answer a similar question a year ago, by saying that the > hash access method is poorly supported and that there is no advantage > to using it. I am not sure about the former, but the latter is not > entirely true: we saw at least 20% gain in performance when we > switched from btree to hash, and my boss considers 20% a big enough > improvement. Besides, he knows the database theory and he is a > long-time BerkelyDB user, and in his world, hash is greatly superior > to btree, so he is wondering why are the postgres implementations so > close. Besides, it's a tough challenge to explain it to a Libertarian > that he'd better not do something. > > I guess we can make such people happy by either fixing hash, or by > making btree very much worse -- whichever is easier :) Cool. I'm sure that making btree much worse is definitely within my ability - I'll submit a patch shortly with new pg_bench results. Chris
On Mon, 2002-08-05 at 07:26, Gene Selkov, Jr. wrote: > Hi Everybody! > > I'm sorry I dropped out for so long -- was switching jobs and was on > the verge of deportation for a while. Still not entirely back to > normal, but can raise my head and look around. > > The first thing I discovered in the current version (7.2.1) -- as well > as in 7.1.3 -- seems to be an old problem with the hash am. It's > clustering too much. ... > The quality of the hash function can be a factor here, but probably > not the major one. I was able to jack my limit up to over 3.7M rows by > reversing the order of bytes in hashvarlena() -- I made the pointer go > down instead of up. That spread the hash values more sparsely, but it > failed with the same message when I fed it with more than 4M rows. > > I saw Tom answer a similar question a year ago, by saying that the > hash access method is poorly supported and that there is no advantage > to using it. I am not sure about the former, but the latter is not > entirely true: we saw at least 20% gain in performance when we > switched from btree to hash, and my boss considers 20% a big enough > improvement. Besides, he knows the database theory and he is a > long-time BerkelyDB user, As BerkelyDB came into being by splitting index methods out of an early version of Postgres, it should still have some similar structure left, so one possibility is to check what they are doing to not be that bad. Have you tried to index your dataset into a BerkelyDB database ? > and in his world, hash is greatly superior > to btree, so he is wondering why are the postgres implementations so > close. Besides, it's a tough challenge to explain it to a Libertarian > that he'd better not do something. ------------- Hannu
"Gene Selkov, Jr." <selkovjr@xnet.com> writes:
> I saw Tom answer a similar question a year ago, by saying that the
> hash access method is poorly supported and that there is no advantage
> to using it. I am not sure about the former, but the latter is not
> entirely true: we saw at least 20% gain in performance when we
> switched from btree to hash, and my boss considers 20% a big enough
> improvement. Besides, he knows the database theory and he is a
> long-time BerkelyDB user, and in his world, hash is greatly superior
> to btree, so he is wondering why are the postgres implementations so
> close. Besides, it's a tough challenge to explain it to a Libertarian
> that he'd better not do something.
Hey, if he wants to fix the hash code, more power to him ;-).  Patches
will be gladly accepted.
The real problem with the PG hash index code is that approximately zero
effort has been put into it since the code left Berkeley, while quite
a lot of work has been put into the btree code.  Thus, for most purposes
the btree index type leaves hash in the dust, no matter what theoretical
concerns may say.
If you or he would like to expend the effort to bring hash indexing up
to speed, I'll surely not stand in your way.  But be advised that
there's a lot of work to be done there (concurrency issues and WAL
support being at the top of my list) ... are you sure you believe that
hash is worth the effort?
        regards, tom lane
			
		> From: Hannu Krosing <hannu@tm.ee> > > As BerkelyDB came into being by splitting index methods out of an early > version of Postgres, it should still have some similar structure left, > so one possibility is to check what they are doing to not be that bad. > > Have you tried to index your dataset into a BerkelyDB database ? Yes, it works fine with BerkelyDB. I looked at both codes and I was stupefied with their complexity. Even if there is a similar structure, it must be very well disguised. Some of the data structures resemble each other's counterparts; the only piece that is exactly the same as one of the five BerkelyDB's hash functions. The only useful experiment that I feel I am capable of making is trying their __ham_hash5() function, with they claim is generally better than the other four, for most purposes. But they warn in their comments that there is no such thing as "a hash function" -- there must be one for each purpose. So another experiment I might try is writing an adapter for a user-supplied hash -- that might help in figuring out the role of the hash function in bin overflows. That should be easy enough to do, but fixing or re-writing the access method itself -- I'm sorry: the level of complexity scares me. Appears like a couple man-months (those Mythical Man-Months :). --Gene
On Wed, Aug 07, 2002 at 12:41:04AM -0500, selkovjr@xnet.com wrote: > Some of the data structures resemble each other's counterparts; > the only piece that is exactly the same as one of the five > BerkelyDB's hash functions. FYI, the development version of PostgreSQL uses a completely different (and higher quality) hash function. > The only useful experiment that I feel I am capable of making is > trying their __ham_hash5() function, with they claim is generally > better than the other four, for most purposes. I'm skeptical that changing the hash function would make a significant difference to the usability of hash indexes. At the very least, the reproducible deadlocks under concurrent access need to be fixed, as well as a host of other issues. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC