Re: Strangeness in xid allocation / snapshot setup

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Strangeness in xid allocation / snapshot setup
Дата
Msg-id 17046.994949241@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Strangeness in xid allocation / snapshot setup  ("Vadim Mikheev" <vmikheev@sectorbase.com>)
Список pgsql-hackers
"Vadim Mikheev" <vmikheev@sectorbase.com> writes:
> You forget about Tx Old! The point is that changes made by Tx Old *over*
> Tx New' changes effectively make those Tx New' changes *visible* to
> Tx S!

Yes, but what's that got to do with the order of operations in
GetSnapshotData?  The scenario you describe can occur anyway.
Only if Tx Old is running in Read Committed mode, of course.
But if it is, then it's capable of deciding to update a row updated
by Tx New.  Whether Tx S's xmax value is before or after Tx New's ID
is not going to change the behavior of Tx Old.

> And this is how it worked (MyProc->xid was updated while holding
> XXXGenLockId) in varsup.c from version 1.21 (Jun 1999) till
> version 1.36 (Mar 2001) when you occasionally moved it outside
> of locked code part:

Okay, so that part was my error.  I've changed it back.

I'd still like to change GetSnapshotData to read the nextXid before
it acquires SInvalLock, though.  If we did that, it'd be safe to make
GetNewTransactionId be
SpinAcquire(XidGenLockId);xid = nextXid++;SpinAcquire(SInvalLockId);MyProc->xid =
xid;SpinRelease(SInvalLockId);SpinRelease(XidGenLockId);

which is really necessary if you want to avoid assuming that
TransactionIds can be fetched and stored atomically.

Two other changes I think are needed in this area:

* In AbortTransaction, the clearing of MyProc->xid and MyProc->xmin
should be moved down to after RecordTransactionAbort and surrounded
by acquire/release SInvalLock (to avoid atomic fetch/store assumption).

* In HeapTupleSatisfiesVacuum (new tqual.c routine I just made
yesterday, by extracting the tuple time qual checks from vacuum.c),
the order of checking for process status should be    TransactionIdIsInProgress    TransactionIdDidCommit
TransactionIdDidAbort
not the present    TransactionIdDidAbort    TransactionIdDidCommit    TransactionIdIsInProgress

The current way is wrong because if the other process is just in process
of committing, we can get

VACUUM                        other

TransactionIdDidAbort - no
TransactionIdDidCommit - no
                    RecordTransactionCommit();                    MyProc->xid = 0;

TransactionIdIsInProgress - no

whereupon vacuum decides that the other process crashed --- oops.  If
we test TransactionIdIsInProgress *first* in tqual, and xact.c records
commit or abort *before* clearing MyProc->xid, then we cannot have this
race condition where the xact is no longer considered in progress but
not seen to be committed/aborted either.

(Note: this bug is not a problem for existing VACUUM, since it can
never see any tuples from open transactions anyway.  But it will be
fatal for concurrent VACUUM.)

Comments?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Child itemid in update-chain marked as unused - can't continue repair_frag
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: [PATCHES] Re: [PATCH] Re: Setuid functions