Re: Hash Indexes

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Hash Indexes
Дата
Msg-id CAA4eK1+JzHUi5o0vKBg_=DpmFMuy_1wDtZM2EfLoAKAxZ8gMcQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Hash Indexes  (Andres Freund <andres@anarazel.de>)
Ответы Re: Hash Indexes  (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>)
Список pgsql-hackers
On Thu, Sep 15, 2016 at 7:53 PM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2016-05-10 17:39:22 +0530, Amit Kapila wrote:
>> For making hash indexes usable in production systems, we need to improve
>> its concurrency and make them crash-safe by WAL logging them.
>
> One earlier question about this is whether that is actually a worthwhile
> goal.  Are the speed and space benefits big enough in the general case?
>

I think there will surely by speed benefits w.r.t reads for larger
indexes.  All our measurements till now have shown that there is a
benefit varying from 30~60% (for reads) with hash index as compare to
btree, and I think it could be even more if we further increase the
size of index.  On space front, I have not done any detailed study, so
it is not right to conclude anything, but it appears to me that if the
index is on char/varchar column where size of key is 10 or 20 bytes,
hash indexes should be beneficial as they store just hash-key.

> Could those benefits not be achieved in a more maintainable manner by
> adding a layer that uses a btree over hash(columns), and adds
> appropriate rechecks after heap scans?
>

I don't think it can be faster for reads than using real hash index,
but surely one can have that as a workaround.

> Note that I'm not saying that hash indexes are not worthwhile, I'm just
> doubtful that question has been explored sufficiently.
>

I think theoretically hash indexes are faster than btree considering
logarithmic complexity (O(1) vs. O(logn)), also the results after
recent optimizations indicate that hash indexes are faster than btree
for equal to searches.  I am not saying after the recent set of
patches proposed for hash indexes they will be better in all kind of
cases.  It could be beneficial for cases where indexed columns are not
updated heavily.

I think one can definitely argue that we can some optimizations in
btree and make them equivalent or better than hash indexes, but I am
not sure if it is possible for all-kind of use-cases.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: Tackling JsonPath support
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Write Ahead Logging for Hash Indexes