Re: Index Skip Scan

Поиск
Список
Период
Сортировка
От Alexander Kuzmenkov
Тема Re: Index Skip Scan
Дата
Msg-id 522092c7-3cb7-a1ec-410b-dfa97ae722f3@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Index Skip Scan  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Ответы Re: Index Skip Scan  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Список pgsql-hackers

Hi Jesper,

While testing this patch I noticed that current implementation doesn't perform well when we have lots of small groups of equal values. Here is the execution time of index skip scan vs unique over index scan, in ms, depending on the size of group. The benchmark script is attached.

group size    skip        unique
1             2,293.85    132.55
5             464.40      106.59
10            239.61      102.02
50            56.59       98.74
100           32.56       103.04
500           6.08        97.09

So, the current implementation can lead to performance regression, and the choice of the plan depends on the notoriously unreliable ndistinct statistics. The regression is probably because skip scan always does _bt_search to find the next unique tuple. I think we can improve this, and the skip scan can be strictly faster than index scan regardless of the data. As a first approximation, imagine that we somehow skipped equal tuples inside _bt_next instead of sending them to the parent Unique node. This would already be marginally faster than Unique + Index scan. A more practical implementation would be to remember our position in tree (that is, BTStack returned by _bt_search) and use it to skip pages in bulk. This looks straightforward to implement for a tree that does not change, but I'm not sure how to make it work with concurrent modifications. Still, this looks a worthwhile direction to me, because if we have a strictly faster skip scan, we can just use it always and not worry about our unreliable statistics. What do you think?

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: Protect syscache from bloating with negative cache entries
Следующее
От: Laurenz Albe
Дата:
Сообщение: Re: Loaded footgun open_datasync on Windows