Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

Поиск

Список

Период

Сортировка

От	Vitalii Tymchyshyn
Тема	Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Дата	19 сентября 2011 г. 12:29:09
Msg-id	4E775FB9.6050704@gmail.com обсуждение исходный текст
Ответ на	Re: Hash index use presently(?) discouraged since 2005: revive or bury it? (Robert Klemme <shortcutter@googlemail.com>)
Список	pgsql-performance

Дерево обсуждения

19.09.11 18:19, Robert Klemme написав(ла):
> On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure<mmoncure@gmail.com>  wrote:
>>
>> Postgres's hash index implementation used to be pretty horrible -- it
>> stored the pre-hashed datum in the index which, while making it easier
>> to do certain things,  made it horribly slow, and, for all intents and
>> purposes, useless.  Somewhat recently,a lot of work was put in to fix
>> that -- the index now packs the hash code only which made it
>> competitive with btree and superior for larger keys.  However, certain
>> technical limitations like lack of WAL logging and uniqueness hold
>> hash indexing back from being used like it really should be.  In cases
>> where I really *do* need hash indexing, I do it in userland.
>>
>> create table foo
>> (
>>   a_long_field text;
>> );
>> create index on foo(hash(a_long_field));
>>
>> select * from foo where hash(a_long_field) = hash(some_value) and
>> a_long_field = some_value;
>>
>> This technique works fine -- the main disadvantage is that enforcing
>> uniqueness is a PITA but since the standard index doesn't support it
>> either it's no great loss.  I also have the option of getting
>> 'uniqueness' and being able to skip the equality operation if I
>> sacrifice some performance and choose a strong digest.  Until the hash
>> index issues are worked out, I submit that this remains the go-to
>> method to do this.
> Is this approach (storing the hash code in a btree) really faster than
> a regular btree index on "a_long_field"?  And if so, for which kind of
> data and load?

Actually sometimes the field in [potentially] so long, you can't use
regular b-tree because it won't fit in the page. Say, it is "text" type.
If you will create regular index, you will actually limit column value
size to few KB. I am using md5(text) indexes in this case coupled with
rather ugly queries (see above). Native support would be nice.

Best regards, Vitalii Tymchyshyn.

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?