Re: Questions about btree_gin vs btree_gist for low cardinality columns

From: Morris de Oryx
Subject: Re: Questions about btree_gin vs btree_gist for low cardinality columns
Date:
Msg-id CAKqncci+mtt-_5fdcOiNaxvtJF1ij5_dOTfda1t41mN0yVA=fw@mail.gmail.com
In reply to: RE: Questions about btree_gin vs btree_gist for low cardinalitycolumns  (Steven Winfield <Steven.Winfield@cantabcapital.com>)
Responses: RE: Questions about btree_gin vs btree_gist for low cardinalitycolumns  (Steven Winfield <Steven.Winfield@cantabcapital.com>)
List: pgsql-general
I didn't notice Bloom filters in the conversation so far, and I have been waiting for years for a good excuse to use one. I ran into them years back in Splunk, which is a distributed log store. There's an obvious benefit to a probabilistic tool like a Bloom filter there, since a remote lookup (and/or retrieval from cold storage) is quite expensive relative to a local, hashed lookup. I haven't tried them in Postgres.
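
For anyone who hasn't met the idea, here is a minimal sketch of a Bloom filter guarding an expensive lookup. It is only an illustration, not how Splunk or the Postgres bloom extension implements it: the class, the `lookup` helper, the sizes, and the `remote` dictionary standing in for a remote store are all made up for the example. The point is that a "no" from the filter is definite, while a "yes" only means "possibly present", so the expensive fetch is skipped only when the key is certainly absent.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit array plus k hash positions per item."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are arbitrary illustrative defaults, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by salting one hash with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def lookup(key, bloom, expensive_fetch):
    """Skip the expensive fetch when the filter rules the key out."""
    if not bloom.might_contain(key):
        return None
    return expensive_fetch(key)


# Hypothetical stand-in for a remote or cold-storage lookup.
remote = {"host-a": 1, "host-b": 2}

bloom = BloomFilter()
for key in remote:
    bloom.add(key)

found = lookup("host-a", bloom, remote.get)
missing = lookup("host-z", bloom, remote.get)
```

A false positive only costs one wasted fetch, which still returns nothing, so the result stays correct either way; the filter just trims how often the slow path runs.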

In the case of a single column with a small set of distinct values over a large set of rows, how would a Bloom filter be preferable to, say, a GIN index on an integer value? 

I have to say, this is actually a good reminder in my case. We've got a lot of small-distinct-values-big-rows columns, for example "server_id", "company_id", "facility_id", and so on: only a handful of parent keys, each with many millions of related rows. Perhaps a Bloom index could be used for quick lookups on combinations of such values within the same table. I haven't tried Bloom indexes in Postgres, but this might be worth some experimenting.

Is there any thought in the Postgres world of adding something like Oracle's bitmap indexes?
