Re: HOT chain validation in verify_heapam()

Поиск
Список
Период
Сортировка
От Himanshu Upadhyaya
Тема Re: HOT chain validation in verify_heapam()
Дата
Msg-id CAPF61jBjb+VafRhsmRg2gN1mX89+iPyPrfa6ZdUVtoX7T35TOg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: HOT chain validation in verify_heapam()  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: HOT chain validation in verify_heapam()  (Aleksander Alekseev <aleksander@timescale.com>)
Список pgsql-hackers


On Wed, Sep 7, 2022 at 12:11 AM Robert Haas <robertmhaas@gmail.com> wrote:

I think the check should be written like this:

!TransactionIdEquals(pred_xmin, curr_xmin) && !TransctionIdDidCommit(pred_xmin)

The TransactionIdEquals check should be done first for the reason you
state: it's cheaper.

I think that we shouldn't be using TransactionIdDidAbort() at all,
because it can sometimes return false even when the transaction
actually did abort. See test_lockmode_for_conflict() and
TransactionIdIsInProgress() for examples of logic that copes with
this. What's happening here is that TransactionIdDidAbort doesn't ever
get called if the system crashes while a transaction is running. So we
can use TransactionIdDidAbort() only as an optimization: if it returns
true, the transaction is definitely aborted, but if it returns false,
we have to check whether it's still running. If not, it aborted
anyway.

TransactionIdDidCommit() does not have the same issue. A transaction
can abort without updating CLOG if the system crashes, but it can
never commit without updating CLOG. If the transaction didn't commit,
then it is either aborted or still in progress (and we don't care
which, because neither is an error here).

As to whether the existing formulation of the test has an error
condition, you're generally right that we should test
TransactionIdIsInProgress() before TransactionIdDidCommit/Abort,
because we during commit or abort, we first set the status in CLOG -
which is queried by TransactionIdDidCommit/Abort - and only afterward
update the procarray - which is queried by TransactionIdIsInProgress.
So normally TransactionIdIsInProgress should be checked first, and
TransactionIdDidCommit/Abort should only be checked if it returns
false, at which point we know that the return values of the latter
calls can't ever change. Possibly there is an argument for including
the TransactionIdInProgress check here too:

!TransactionIdEquals(pred_xmin, curr_xmin) &&
(TransactionIdIsInProgress(pred_xmin) ||
!TransctionIdDidCommit(pred_xmin))

...but I don't think it could change the answer. Most places that
check TransactionIdIsInProgress() first are concerned with MVCC
semantics, and here we are not. I think the only effects of including
or excluding the TransactionIdIsInProgress() test are (1) performance,
in that searching the procarray might avoid expense if it's cheaper
than searching clog, or add expense if the reverse is true and (2)
slightly changing the time at which we're first able to detect this
form of corruption. I am inclined to prefer the simpler form of the
test without TransactionIdIsInProgress() unless someone can say why we
shouldn't go that route.

Done, updated in the v3 patch.


--
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Himanshu Upadhyaya
Дата:
Сообщение: Re: HOT chain validation in verify_heapam()
Следующее
От: Aleksander Alekseev
Дата:
Сообщение: Re: Add common function ReplicationOriginName.