Re: _bt_split(), and the risk of OOM before its critical section

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: _bt_split(), and the risk of OOM before its critical section
Дата
Msg-id CAH2-Wz=D8SHWrkzeVW2u_=XnRA=XenMu=N3VkirxAKY1N0vWzg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: _bt_split(), and the risk of OOM before its critical section  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: _bt_split(), and the risk of OOM before its critical section  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Mon, May 6, 2019 at 4:11 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I am tempted to move the call to _bt_truncate() out of _bt_split()
> entirely on HEAD, possibly relocating it to
> nbtsplitloc.c/_bt_findsplitloc(). That way, there is a clearer
> separation between how split points are chosen, suffix truncation, and
> the mechanical process of executing a legal page split.

I decided against that -- better to make it clear how truncation deals
with space overhead within _bt_split(). Besides, the resource
management around sharing a maybe-palloc()'d high key across module
boundaries seems complicated, and best avoided.

Attached draft patch for HEAD fixes the bug by organizing _bt_split()
into clear phases. _bt_split() now works as follows, which is a little
different:

*  An initial phase that is entirely concerned with the left page temp
buffer itself -- initializes its special area.

* Suffix truncation to get left page's new high key, and then add it
to left page.

* A phase that is mostly concerned with initializing the right page
special area, but also finishes off one or two details about the left
page that needed to be delayed. This is also where the "shadow
critical section" begins. Note also that this is where
_bt_vacuum_cycleid() is called, because its contract actually
*requires* that caller has a buffer lock on both pages at once. This
should not be changed on the grounds that _bt_vacuum_cycleid() might
fail (nor for any other reason).

* Add new high key to right page if needed. (No change, other than the
fact that it happens later now.)

* Add other items to both leftpage and rightpage. Critical section
that copies leftpage into origpage buffer. (No changes here.)

I suppose I'm biased, but I prefer the new approach anyway. Adding the
left high key first, and then the right high key seems simpler and
more logical. It emphasizes the similarities and differences between
leftpage and rightpage. Furthermore, this approach fixes the
theoretical risk of leaving behind a minimally-initialized nbtree page
that has existed since 2010. We don't allocated *any* memory after the
point that a new rightpage buffer is acquired.

I suppose that this will need to be backpatched.

Thoughts?
--
Peter Geoghegan

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Melanie Plageman
Дата:
Сообщение: Re: accounting for memory used for BufFile during hash joins
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: We're leaking predicate locks in HEAD