Re: Hash index todo list item

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Hash index todo list item
Дата
Msg-id 46E13C39.8060005@enterprisedb.com
обсуждение исходный текст
Ответ на Re: Hash index todo list item  (Neil Conway <neilc@samurai.com>)
Ответы Re: Hash index todo list item  (Kenneth Marshall <ktm@rice.edu>)
Список pgsql-hackers
Neil Conway wrote:
> You might find this patch useful:
> 
>     http://archives.postgresql.org/pgsql-patches/2005-05/msg00164.php

Oh, I had forgot about that.

> It implements the "just store the hash in the index" idea; it also sorts
> the entries in a bucket by the hash value, which allows binary search to
> be used to locate candidate matches.
> 
> I was surprised that this didn't result in a performance improvement for
> the benchmarks that I ran, but I never got around to investigating
> further (either running more benchmarks or checking whether there was a
> bug in the implementation).

You did get a smaller index, which has cache benefits with larger
tables. You didn't compare the size comparison against b-tree, but a
smaller index is a good thing.

I think you left some money on the table on that front. Since all the
HashItems are fixed size, you can get rid of the line pointers
altogether. Given that a sizeof(HashItemData) is 8 bytes, and a line
pointer is 4 bytes, that should give a further 1/3 reduction in index
size. If you're willing to give up on the alignment of HashItemData, you
can get it down to 6 bytes.

Even from a CPU point of view, as the table becomes bigger and the
b-tree becomes deeper, the difference between O(1) and O(log n) lookup
starts to become more significant.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Martijn van Oosterhout
Дата:
Сообщение: Re: [FEATURE REQUEST] Streaming Onlinebackup (Maybe OFFTOPIC)
Следующее
От: Hannu Krosing
Дата:
Сообщение: Re: [FEATURE REQUEST] Streaming Onlinebackup (Maybe OFFTOPIC)