Re: GIST optimization to limit calls to operator on sub nodes

From: Tom Lane
Subject: Re: GIST optimization to limit calls to operator on sub nodes
Msg-id: 3970.1404137042@sss.pgh.pa.us
In response to: Re: GIST optimization to limit calls to operator on sub nodes  (Pujol Mathieu <mathieu.pujol@realfusio.com>)
Responses: Re: GIST optimization to limit calls to operator on sub nodes  (Pujol Mathieu <mathieu.pujol@realfusio.com>)
List: pgsql-performance
Pujol Mathieu <mathieu.pujol@realfusio.com> writes:
> On 29/06/2014 22:30, Tom Lane wrote:
>> I don't actually understand what's being requested here that the
>> NotConsistent case doesn't already cover.

> The NotConsistent case is correctly covered: the sub nodes are not
> tested because I know that no child could pass the consistent_test.
> The MaybeConsistent case is also correctly covered: all sub nodes are
> tested because I don't know which sub nodes will pass the consistent_test.
> My problem is with the FullyConsistent case: when I test a node, I can
> know that all its child nodes and leaves will pass the test, so I want
> to notify the GIST framework that it can skip the consistent test on
> those nodes, just as we can notify it when testing a leaf that it can
> skip the consistent test on the row. Maybe I am missing something in
> the API to do that. In my tests, the "recheck_flag" works only for leaves.
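
To make the request concrete, here is a toy sketch of the three outcomes
described above. It is not PostgreSQL's GiST API: the names Match, Node,
consistent, search and build_tree are hypothetical, keys are plain integers,
and each node simply stores the [lo, hi] range of the keys beneath it. The
point is only to count how many consistent-test calls a range query makes
with and without a FullyConsistent shortcut.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Match(Enum):
    NOT_CONSISTENT = 0    # nothing below this entry can match
    MAYBE_CONSISTENT = 1  # some descendants may match, keep testing
    FULLY_CONSISTENT = 2  # every descendant matches (the requested case)


@dataclass
class Node:
    lo: int  # smallest key stored below this node
    hi: int  # largest key stored below this node
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def consistent(node: Node, q_lo: int, q_hi: int) -> Match:
    """Three-state test of a node's key range against the query range."""
    if node.hi < q_lo or node.lo > q_hi:
        return Match.NOT_CONSISTENT
    if q_lo <= node.lo and node.hi <= q_hi:
        return Match.FULLY_CONSISTENT
    return Match.MAYBE_CONSISTENT


def search(root: Node, q_lo: int, q_hi: int,
           use_fully_consistent: bool) -> Tuple[List[int], int]:
    """Return (matching leaf keys, number of consistent() calls)."""
    hits: List[int] = []
    calls = 0

    def collect(node: Node) -> None:
        # Gather every leaf below node without testing anything.
        if not node.children:
            hits.append(node.lo)
        for child in node.children:
            collect(child)

    def visit(node: Node) -> None:
        nonlocal calls
        calls += 1
        verdict = consistent(node, q_lo, q_hi)
        if verdict is Match.NOT_CONSISTENT:
            return  # prune the whole subtree
        if use_fully_consistent and verdict is Match.FULLY_CONSISTENT:
            collect(node)  # shortcut: no further tests below this node
            return
        if not node.children:
            hits.append(node.lo)  # a single-key leaf that was not rejected
            return
        for child in node.children:
            visit(child)

    visit(root)
    return hits, calls


def build_tree() -> Node:
    """Keys 0..15 as leaves, four per inner node, one root on top."""
    inner = []
    for start in range(0, 16, 4):
        leaves = [Node(k, k) for k in range(start, start + 4)]
        inner.append(Node(start, start + 3, leaves))
    return Node(0, 15, inner)


root = build_tree()
print(search(root, 4, 11, use_fully_consistent=False)[1])  # 13 calls
print(search(root, 4, 11, use_fully_consistent=True)[1])   # 5 calls
```

In this toy tree the query [4, 11] covers two inner nodes completely, so the
shortcut saves the eight leaf-level calls beneath them. The saving only
appears when a query fully contains whole subtrees.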

Hm ... that doesn't seem like a case that'd come up often enough to be
worth complicating the APIs for, unless maybe you are expecting a lot
of exact-duplicate index entries.  If you are, you might find that GIN
is a better fit for your problem than GIST --- it's designed to be
efficient for lots-of-duplicates.

Another view of this is that if you can make exact satisfaction checks
at upper-page entries, you're probably storing too much information in
the index entries (and thereby bloating the index).  The typical tradeoff
in GIST indexes is something like storing bounding boxes for geometric
objects --- which is necessarily lossy, but it results in small indexes
that are fast to search.  It's particularly important for upper-page
entries to be small, so that fanout is high and you have a better chance
of keeping all the upper pages in cache.
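
As a rough back-of-the-envelope illustration of the fanout point, the sketch
below compares two entry sizes. The 8192-byte page is PostgreSQL's default
block size; everything else is an assumption made for the arithmetic: page
headers and per-tuple overhead are ignored, the 32-byte entry stands in for
a bounding box of four 8-byte floats, the 512-byte entry stands in for an
index entry that carries much more information, and the 10 million row count
is arbitrary. The function names are hypothetical.

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size, in bytes


def fanout(entry_bytes: int) -> int:
    """Entries per page, ignoring page header and per-tuple overhead."""
    return PAGE_SIZE // entry_bytes


def levels_needed(n_entries: int, entry_bytes: int) -> int:
    """Smallest number of levels such that fanout ** levels >= n_entries."""
    f = fanout(entry_bytes)
    levels = 1
    capacity = f
    while capacity < n_entries:
        capacity *= f
        levels += 1
    return levels


N_ROWS = 10_000_000  # arbitrary table size for the comparison

for size in (32, 512):
    print(size, fanout(size), levels_needed(N_ROWS, size))
# 32-byte entries:  fanout 256, 3 levels
# 512-byte entries: fanout 16,  6 levels
```

Under these assumptions the larger entries double the depth of the tree, and
there are far more upper pages competing for cache.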

If you've got a compelling example where this actually makes sense,
I'd be curious to hear the details.

            regards, tom lane

