Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Дата
Msg-id 20211116023918.5g6d6izevronx76m@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Andres Freund <andres@anarazel.de>)
Ответы Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-bugs
Hi,

On 2021-11-10 19:08:21 -0800, Andres Freund wrote:
> On 2021-11-09 19:07:02 -0800, Peter Geoghegan wrote:
> > @@ -700,11 +720,18 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
> >          }
> >
> >          /*
> > -         * Remember the last DEAD tuple seen.  We will advance past
> > -         * RECENTLY_DEAD tuples just in case there's a DEAD one after them;
> > -         * but we can't advance past anything else.  We have to make sure that
> > -         * we don't miss any DEAD tuples, since DEAD tuples that still have
> > -         * tuple storage after pruning will confuse VACUUM.
> > +         * Remember the last DEAD tuple seen from this HOT chain.
> > +         *
> > +         * We expect to sometimes find a RECENTLY_DEAD tuple after a DEAD
> > +         * tuple.  When this happens, the RECENTLY_DEAD tuple will be treated
> > +         * as if it was DEAD all along.  Manage that now.
>
> I now actually wonder why this is correct. There's another comment about it:
>
> > We also prune any RECENTLY_DEAD tuples preceding a DEAD tuple.
> > * This is OK because a RECENTLY_DEAD tuple preceding a DEAD tuple is really
>
> But that doesn't justify very much.
>
> What prevents the scenario that some other backend e.g. has a snapshot with
> xmin=xmax=RECENTLY_DEAD-row. If the RECENTLY_DEAD row has an xid that is later
> than the DEAD row, this afaict would make it perfectly legal to prune the DEAD
> row, but *not* the RECENTLY_DEAD one.

I think the code is actually correct if awfully commented. I.e. the above
scenario cannot happen in a harmful situation. The relevant piece is that the
set of "valid" snapshots is limited (for a specific cluster history).

A row can only be RECENTLY_DEAD if the deleting / updating transaction
committed.

An intermediary RECENTLY_DEAD row followed by a DEAD row would only be visible
to a snapshot that either had

1) row.xmax >= Snapshot.xmax
2) row.xmax in Snapshot.xip.

There cannot be a snapshot that sees the RECENTLY_DEAD row while also seeing
the subsequent row as DEAD, because the existence of one of the two conditions
would have held back the xid horizon:

1) is not possible, because Snapshot.xmin is <= Snapshot.xmax, and
  Snapshot.xmin >= MyProc->xmin. The subsequent row could not be DEAD.

2) is not possible, because .xip only contains xids >= Snapshot.xmin.


IOW, there *can* be a snapshot with xmin=xmax=RECENTLY_DEAD-row, but it'd not
see the RECENTLY_DEAD row anyway.

Greetings,

Andres Freund



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: pg_restore depending on user functions
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Logical Replication not working for few Tables