Re: Index Skip Scan

From Dmitry Dolgov
Subject Re: Index Skip Scan
Date
Msg-id 20200122160441.ograupmzyytde2mi@localhost
In reply to RE: Index Skip Scan  (Floris Van Nee <florisvannee@Optiver.com>)
Responses Re: Index Skip Scan
List pgsql-hackers
> On Wed, Jan 22, 2020 at 07:50:30AM +0000, Floris Van Nee wrote:
>
> Anyone please correct me if I'm wrong, but I think one case where the
> current patch relies on some data from the page it has locked before it
> in checking this hi/lo key. I think it's possible for the following
> sequence to happen. Suppose we have a very simple one leaf-page btree
> containing four elements: leaf page 1 = [2,4,6,8]
>
> We do a backwards index skip scan on this and have just returned our
> first tuple (8). The buffer is left pinned but unlocked. Now, someone
> else comes in and inserts a tuple (value 5) into this page, but suppose
> the page happens to be full. So a page split occurs. As far as I know,
> a page split could happen at any random element in the page. One of the
> situations we could be left with is:
>
> Leaf page 1 = [2,4]
> Leaf page 2 = [5,6,8]
> However, our scan is still pointing to leaf page 1.

If we have just returned a tuple, the next action is either to check the
next page for another key or to search down the tree. Maybe I'm missing
something in your scenario, but the latter will land us on the required
page (we do not point to any leaf here), and before the former there is
a check for the high/low key. Is there anything else missing?
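
To make the scenario concrete, here is a toy Python model of it. This is
only a sketch and not nbtree code: LeafPage, covers, search_down,
prev_on_page and prev_key_checked are hypothetical names, covers() is a
crude stand-in for the high/low key check, and search_down() stands in
for a fresh descent from the root. It also ignores the case where the
scan has to step to the previous page.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LeafPage:
    keys: List[int]  # sorted keys currently stored on this page

    def covers(self, key: int) -> bool:
        # Stand-in for the high/low key check: is `key` within this page's range?
        return bool(self.keys) and self.keys[0] <= key <= self.keys[-1]


def search_down(pages: List[LeafPage], key: int) -> Optional[LeafPage]:
    # Stand-in for a fresh descent from the root to the page covering `key`.
    for page in pages:
        if page.covers(key):
            return page
    return None


def prev_on_page(page: LeafPage, last_returned: int) -> Optional[int]:
    # Largest key on this page strictly below `last_returned`, if any.
    i = bisect_left(page.keys, last_returned)
    return page.keys[i - 1] if i > 0 else None


def prev_key_checked(pages: List[LeafPage], pinned: LeafPage,
                     last_returned: int) -> Optional[int]:
    # Trust the pinned page only if it still covers the last returned key;
    # otherwise search down again.
    page = pinned if pinned.covers(last_returned) else search_down(pages, last_returned)
    return prev_on_page(page, last_returned) if page is not None else None


# State after the split in the quoted scenario; the scan still holds page 1.
page1, page2 = LeafPage([2, 4]), LeafPage([5, 6, 8])
pages = [page1, page2]
```

In this model, prev_on_page(page1, 8) on the stale page yields 4 and
silently skips 5 and 6, while prev_key_checked(pages, page1, 8) notices
that page 1 no longer covers 8, searches down again and yields 6.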

> Now that I look at the patch again, I fear there currently may also be
> such a dependency in the "Advance forward but read backward"-case. It
> saves the offset number of a tuple in a variable, then does a
> _bt_search (releasing the lock and pin on the page). At this point,
> anything can happen to the tuples on this page - the page may be
> compacted by vacuum such that the offset number you have in your
> variable does not match the actual offset number of the tuple on the
> page anymore. Then, at the check for (nextOffset == startOffset) later,
> there's a possibility the offsets are different even though they relate
> to the same tuple.

Interesting point. The original idea here was to check that we have not
returned to the same position after jumping, so maybe instead of
offsets we can compare the tuple we found.
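
As a rough illustration of the difference, here is a toy Python sketch.
It is not what the patch does: IndexTuple, compact and offset_of are
hypothetical names, the page is just a list with 1-based offsets, and
tid is a stand-in for a heap TID.

```python
from typing import List, NamedTuple, Optional


class IndexTuple(NamedTuple):
    key: int
    tid: int            # stand-in for the heap TID the index tuple points to
    dead: bool = False  # marked for removal by vacuum


def compact(page: List[IndexTuple]) -> List[IndexTuple]:
    # Toy vacuum: drop dead items, which shifts the offsets of later items.
    return [t for t in page if not t.dead]


def offset_of(page: List[IndexTuple], key: int) -> Optional[int]:
    # 1-based offset of the first item with this key, or None if absent.
    for off, t in enumerate(page, start=1):
        if t.key == key:
            return off
    return None


page = [IndexTuple(2, 100), IndexTuple(4, 101, dead=True),
        IndexTuple(6, 102), IndexTuple(8, 103)]

# Remember the start position both ways before the lock and pin are released.
start_offset = offset_of(page, 6)
start_tuple = page[start_offset - 1]

# While the page is unpinned, vacuum compacts it.
page = compact(page)

# After the jump we land on the same tuple again.
next_offset = offset_of(page, 6)
next_tuple = page[next_offset - 1]

same_by_offset = next_offset == start_offset  # compares positions
same_by_tuple = next_tuple == start_tuple     # compares key and tid
```

Here the saved offset is 3 and the offset after compaction is 2, so the
offset comparison reports a different position, while comparing the
tuples themselves still recognises the same item.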


