Re: Snapshot related assert failure on skink
От | Tomas Vondra |
---|---|
Тема | Re: Snapshot related assert failure on skink |
Дата | |
Msg-id | 82a7e769-8189-401c-a666-248e9cef2a04@vondra.me обсуждение исходный текст |
Ответ на | Re: Snapshot related assert failure on skink (Heikki Linnakangas <hlinnaka@iki.fi>) |
Ответы |
Re: Snapshot related assert failure on skink
|
Список | pgsql-hackers |
On 3/23/25 17:43, Heikki Linnakangas wrote: > On 21/03/2025 17:16, Andres Freund wrote: >> Am I right in understanding that the only scenario (when in >> STANDBY_SNAPSHOT_READY), where ExpireOldKnownAssignedTransactionIds() >> would >> "legally" remove a transaction, rather than the commit / abort records >> doing >> so, is if the primary crash-restarted while transactions were ongoing? >> >> Those transactions won't have a commit/abort records and thus won't >> trigger >> ExpireTreeKnownAssignedTransactionIds(), which otherwise would have >> updated >> ->xactCompletionCount? > > Correct. > >> When writing the snapshot caching patch, I tried to make sure that all >> the >> places that maintain ->latestCompletedXid also update >> ->xactCompletionCount. Afaict that's still the case. Which, I think, >> means >> that we're also missing calls to MaintainLatestCompletedXidRecovery()? > > Yep, I was just looking at that too. > >> If latestCompletedXid is incorrect visibility determinations end up >> wrong... > > I think it happens to work, because the transactions are effectively > aborted. latestCompletedXid is used to initialize xmax in > GetSnapshotData. If, for example, latestCompletedXid is incorrectly set > to 1000 even though XID 1001 already aborted, a snapshot with xmax=1000 > still correctly considers XID 1001 as "not visible". As soon as a > transaction commits, latestCompletedXid is fixed. > > AFAICS we could skip updating latestCompletedXid on aborts altogether > and rename it to latestCommittedXid. But it's hardly worth optimizing > for aborts. > > For the same reason, I believe the assertion failure we're discussing > here is also harmless. Even though the reused snapshot has a smaller > xmin than expected, all those transactions aborted and are thus not > visible anyway. > > In any case, let's be tidy and fix both latestCompletedXid and > xactCompletionCount. > Thanks for looking into this and pushing the fix. Would it make sense to add a comment documenting this reasoning about not handling aborts? Otherwise someone will get to rediscover this in the future ... regards -- Tomas Vondra
В списке pgsql-hackers по дате отправления: