Re: HOT chain validation in verify_heapam()

From Robert Haas
Subject Re: HOT chain validation in verify_heapam()
Date
Msg-id CA+TgmoZQGw0A2eS7-uxUoTwSFkDTYpPRFjMm6_Q351kWSic3Xw@mail.gmail.com
In response to Re: HOT chain validation in verify_heapam()  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Wed, Mar 22, 2023 at 5:42 PM Peter Geoghegan <pg@bowt.ie> wrote:
> However, this "second pass over page" loop has roughly the same
> problem as the nearby HeapTupleHeaderIsHotUpdated() coding pattern: it
> doesn't account for the fact that a tuple whose xmin was
> XID_IN_PROGRESS a little earlier on may not be in that state once we
> reach the second pass loop. Concurrent transaction abort needs to be
> accounted for. The loop needs to recheck xmin status, at least in the
> initially-XID_IN_PROGRESS-xmin case.

I don't understand why it would need to do that. If the transaction
has subsequently committed, it doesn't change anything: we'll get the
same report we would have gotten anyway. If the transaction has
subsequently aborted, we'll get a report about corruption that would
not have been reported if the abort had occurred slightly earlier.
However, the abort doesn't remove the corruption, just our ability to
detect it.

Consider a page where TID 1 is a redirect to TID 4; TID 2 is dead; and
TIDs 3 and 4 are heap-only tuples. Any other line pointers on the page
are unused. The only way this can validly happen is if there was a
tuple at TID 2 and it got updated to produce the tuple at TID 3 and
then that transaction aborted. Then the tuple at TID 2 got updated again,
producing the tuple at TID 4, and that transaction committed. But this
implies that the xmin of TID 3 must be aborted. If we observe that
it's in-progress, we know that the transaction that created TID 3 was
still running after TID 4 had already shown up, which should be
impossible, and so it's fair to report corruption. If the xmin of TID
3 then goes on to abort, a future attempt to verify this page won't be
able to notice the corruption any more, because it won't be able to
prove that TID 3's xmin aborted after TID 4's xmin committed. But a
current attempt to verify this page that has seen TID 3's xmin as
in-progress at any point after locking the page knows for sure that
TID 4 showed up before TID 3's inserter aborted, and that's
inconsistent with any legal order of operations.
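
To make the argument concrete, here is a toy model of that page. It is
only a sketch, not the verify_heapam() code: every type and function
name below is hypothetical, and it assumes a heap-only tuple's only
possible predecessor is a redirect line pointer (real HOT chains also
link tuples through t_ctid, which this ignores).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real definitions. */
typedef enum
{
    TOY_LP_UNUSED,
    TOY_LP_REDIRECT,
    TOY_LP_DEAD,
    TOY_LP_HEAP_ONLY
} ToyLpKind;

typedef enum
{
    TOY_XMIN_COMMITTED,
    TOY_XMIN_IN_PROGRESS,
    TOY_XMIN_ABORTED
} ToyXminStatus;

typedef struct
{
    ToyLpKind     kind;
    int           redirect_to;  /* 1-based target TID for a redirect, else 0 */
    ToyXminStatus xmin;         /* only meaningful for heap-only tuples */
} ToyItem;

/*
 * Count heap-only tuples that no redirect points to and whose xmin was
 * observed as anything other than aborted.  items[0] stands for TID 1.
 */
int
toy_count_orphan_reports(const ToyItem *items, int nitems)
{
    int reports = 0;

    for (int i = 0; i < nitems; i++)
    {
        bool has_predecessor = false;

        if (items[i].kind != TOY_LP_HEAP_ONLY)
            continue;

        for (int j = 0; j < nitems; j++)
        {
            if (items[j].kind == TOY_LP_REDIRECT &&
                items[j].redirect_to == i + 1)
                has_predecessor = true;
        }

        /* An orphaned heap-only tuple is only legal if its inserter aborted. */
        if (!has_predecessor && items[i].xmin != TOY_XMIN_ABORTED)
            reports++;
    }
    return reports;
}

/* The example page: TID 1 -> TID 4, TID 2 dead, TIDs 3 and 4 heap-only. */
int
toy_reports_for_example_page(ToyXminStatus tid3_xmin)
{
    ToyItem page[4] = {
        {TOY_LP_REDIRECT, 4, TOY_XMIN_COMMITTED},
        {TOY_LP_DEAD, 0, TOY_XMIN_COMMITTED},
        {TOY_LP_HEAP_ONLY, 0, tid3_xmin},
        {TOY_LP_HEAP_ONLY, 0, TOY_XMIN_COMMITTED},
    };

    return toy_count_orphan_reports(page, 4);
}
```

In this model, toy_reports_for_example_page(TOY_XMIN_IN_PROGRESS)
yields one report, while passing TOY_XMIN_ABORTED yields none: once the
abort has happened, the same page no longer carries any evidence of the
problem.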

--
Robert Haas
EDB: http://www.enterprisedb.com


