Re: Index Skip Scan

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: Index Skip Scan
Дата
Msg-id 20190131.153153.83078698.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Index Skip Scan  (Dmitry Dolgov <9erthalion6@gmail.com>)
Ответы Re: Index Skip Scan  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Re: Index Skip Scan  (James Coleman <jtc331@gmail.com>)
Re: Index Skip Scan  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-hackers
Hello.

At Wed, 30 Jan 2019 18:19:05 +0100, Dmitry Dolgov <9erthalion6@gmail.com> wrote in
<CA+q6zcVP18wYiO=aa+fz3GuncuTF52q1sufB7ise37TJPSDK1w@mail.gmail.com>
> A bit of adjustment after nodes/relation -> nodes/pathnodes.

I had a look on this.

The name "index skip scan" is a different feature from the
feature with the name on other prodcuts, which means "index scan
with postfix key (of mainly of multi column key) that scans
ignoring the prefixing part" As Thomas suggested I'd suggest that
we call it "index hop scan". (I can accept Hopscotch, either:p)

Also as mentioned upthread by Peter Geoghegan, this could easly
give worse plan by underestimation. So I also suggest that this
has dynamic fallback function.  In such perspective it is not
suitable for AM API level feature.

If all leaf pages are on the buffer and the average hopping
distance is less than (expectedly) 4 pages (the average height of
the tree), the skip scan will lose. If almost all leaf pages are
staying on disk, we could win only by 2-pages step (skipping over
one page).

=====
As I'm writing the above, I came to think that it's better
implement this as an pure executor optimization.

Specifically, let _bt_steppage count the ratio of skipped pages
so far then if the ratio exceeds some threshold (maybe around
3/4) it gets into hopscotching mode, where it uses index scan to
find the next page (rather than traversing). As mentioned above,
I think using skip scan to go beyond the next page is a good
bet. If the success ration of skip scans goes below some
threshold (I'm not sure for now), we should fall back to
traversing.

Any opinions?

====

Some comments on the patch below.

+     skip scan approach, which is based on the idea of
+     <ulink url="https://wiki.postgresql.org/wiki/Free_Space_Map_Problems">
+     Loose index scan</ulink>. Rather than scanning all equal values of a key,
+     as soon as a new value is found, it will search for a larger value on the

I'm not sure it is a good thing to put a pointer to rather
unstable source in the documentation.


This adds a new AM method but it seems avaiable only for ordered
indexes, specifically btree. And it seems that the feature can be
implemented in btgettuple since btskip apparently does the same
thing. (I agree to Robert in the point in [1]).

[1] https://www.postgresql.org/message-id/CA%2BTgmobb3uN0xDqTRu7f7WdjGRAXpSFxeAQnvNr%3DOK5_kC_SSg%40mail.gmail.com


Related to the above, it seems better that the path generation of
skip scan is a part of index scan. Whether skip scan or not is a
matter of index scan itself.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Arseny Sher
Дата:
Сообщение: Too rigorous assert in reorderbuffer.c
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Unused parameters & co in code