Re: ReadRecentBuffer() doesn't scale well

From Peter Geoghegan
Subject Re: ReadRecentBuffer() doesn't scale well
Date
Msg-id CAH2-WznwevAK-mf1BTO9QBPMee_ghSzxheBYLW6Wc5sseAF30A@mail.gmail.com
In reply to Re: ReadRecentBuffer() doesn't scale well  (Thomas Munro <thomas.munro@gmail.com>)
Replies Re: ReadRecentBuffer() doesn't scale well  (Thomas Munro <thomas.munro@gmail.com>)
Re: ReadRecentBuffer() doesn't scale well  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Jun 26, 2023 at 9:40 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> If the goal is to get rid of both pins and content locks, LSN isn't
> enough.  A page might be evicted and replaced by another page that has
> the same LSN because they were modified by the same record.  Maybe
> that's vanishingly rare, but the correct thing would be a counter that
> goes up on modification AND eviction.

It should be safe to allow searchers to see a version of the root page
that is out of date. The Lehman & Yao design is very permissive about
these things. There aren't any special cases where the general rules
are weakened in some way that might complicate this approach.
Searchers need to check the high key to determine if they need to move
right -- same as always.

More concretely: A root page can be concurrently split while an
in-flight index scan is about to land on it (the page the scan lands on
is then the left half of the split). It doesn't matter if it's a searcher that is
"between" the meta page and the root page. It doesn't matter if a
level was added. This is true even though nothing that you'd usually
think of as an interlock is held "between levels". The root page isn't
really special, except in the obvious way. We can even have two roots
at the same time (the true root, and the fast root).
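
A minimal sketch of that scenario in C, under loud assumptions: Page,
move_right and split_page are hypothetical names, keys are plain ints,
and there is no locking or real page layout, so this is not nbtree code.
It only illustrates that a searcher holding a stale pointer to the old
root still reaches the right page by checking the high key.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical in-memory page; NOT PostgreSQL's nbtree page format. */
typedef struct Page
{
    int          high_key;  /* upper bound for keys here; INT_MAX if rightmost */
    struct Page *right;     /* right sibling link; NULL on the rightmost page */
} Page;

/* Move right while the search key lies beyond this page's high key. */
static Page *
move_right(Page *page, int key)
{
    assert(page != NULL);
    while (key > page->high_key && page->right != NULL)
        page = page->right;
    return page;
}

/*
 * Split: the original page stays in place as the left half and keeps keys
 * <= split_key; new_right inherits the old high key and right link.
 */
static void
split_page(Page *page, Page *new_right, int split_key)
{
    assert(page != NULL && new_right != NULL);
    new_right->high_key = page->high_key;
    new_right->right = page->right;
    page->high_key = split_key;
    page->right = new_right;
}
```

A searcher that remembered the old root before split_page() ran can call
move_right() on it afterwards: a key at or below the split point stays on
the left half, and a larger key follows the right link to the new sibling.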

--
Peter Geoghegan


