Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
От | Craig Ringer |
---|---|
Тема | Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint) |
Дата | |
Msg-id | CAMsr+YEHL1FoFeiFA5b5CuwxYMC7V243j0Kq7394MpoLmx66VQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Transaction traceability - txid_status(bigint) (Craig Ringer <craig@2ndquadrant.com>) |
Ответы |
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
(Robert Haas <robertmhaas@gmail.com>)
|
Список | pgsql-hackers |
On 27 September 2016 at 09:23, Craig Ringer <craig@2ndquadrant.com> wrote: > On 21 September 2016 at 22:16, Robert Haas <robertmhaas@gmail.com> wrote: >> >> It might not be too hard to add a second copy of oldestXid in shared >> memory that is updated before truncation rather than afterward... but >> yeah, like you, I'm not finding this nearly as straightforward as >> might have been hoped. > > Yeah. > > I suspect that'll be the way to go, to add another copy that's updated > before clog truncation. It just seems ... unclean. Like it shouldn't > be necessary, something else isn't right. But it's probably the lowest > pain option. OK, so summarizing the problem: slru.c and clog.c have no soft-failure entrypoints where we can look it up and fail gracefully if the xid isn't in the clog. vac_truncate_clog() calls TruncateCLOG to chop SLRU pages from the clog before it takes XidGenLock to advance oldestXid. So we cannot use oldestXid protected by XidGenLock as an interlock against concurrent clog truncation to prevent SLRU access errors looking up an xid, and there's no other candidate lock. This hasn't been an issue before because nobody looks up arbitrary user-supplied XIDs in clog, we only look up things that're protected by datfrozenxid etc. But the whole point of txid_status is to look up user-supplied xids. We can't just take ClogControlLock from txid_status() to block truncation because clog.c expects to own that lock, and takes it (via slru.c) in TransactionIdGetStatus, with no way to pass an already_locked state. We'd self-deadlock. Adding a second copy of oldestXid - say pendingOldestXid - won't actually help us unless we also take some suitable LWLock before updating it, otherwise truncation can continue after we look at pendingOldestXid but before we do the clog lookup. That means an extra LWLock for each clog truncation, but compared to the I/O done during clog truncation and the cost of the SlruScanDirectory() pre-check done by TruncateCLOG it's nothing. We could take XidGenLock twice, once to update this new pendingOldestXid field and once to update oldestXid, but that's a highly contested lock I'd rather not mess with even on a path that's not hit much. Instead, I've added a new LWLock, ClogTruncationLock, for that purpose. vac_truncate_clog() takes it if it decides to attempt clog truncation. This lock is held throughout the whole process of clog truncation and oldestXid advance, so there's no need for a new pendingOldestXid field in ShmemVariableCache. We just take the lock then look at oldestXid, knowing that it's guaranteed to correspond to an existing clog page that won't go away while we're looking. ClogTruncationLock is utterly uncontested so it's going to have no meaningful impact compared to all the work we do scanning the clog to decide whether we're even going to try truncating it, etc. (BTW, it seems like a pity that lwlocknames.txt doesn't have comments on each lock. We could have src/backend/storage/lmgr/generate-lwlocknames.pl transform the # comments into /* comments */ on the generated header. Thoughts?) -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Andreas KarlssonДата:
Сообщение: Re: [HACKERS] Why does plpython delay composite type resolution?