Multixid SLRU truncation bugs at wraparound

From: Heikki Linnakangas

While working on the reported pg_upgrade failure at multixid wraparound 
[1], I bumped into another bug related to multixid wraparound. If you 
run vacuum freeze, and it advances oldestMultiXactId, and nextMulti has 
just wrapped around to 0, you get this in the log:

> LOG:  MultiXact member wraparound protections are disabled because oldest checkpointed MultiXact 1 does not exist on disk

Culprit: TruncateMultiXact does this:

     LWLockAcquire(MultiXactGenLock, LW_SHARED);
     nextMulti = MultiXactState->nextMXact;
     nextOffset = MultiXactState->nextOffset;
     oldestMulti = MultiXactState->oldestMultiXactId;
     LWLockRelease(MultiXactGenLock);
     Assert(MultiXactIdIsValid(oldestMulti));

     ...


     /*
      * First, compute the safe truncation point for MultiXactMember. This is
      * the starting offset of the oldest multixact.
      *
      * Hopefully, find_multixact_start will always work here, because we've
      * already checked that it doesn't precede the earliest MultiXact on disk.
      * But if it fails, don't truncate anything, and log a message.
      */
     if (oldestMulti == nextMulti)
     {
         /* there are NO MultiXacts */
         oldestOffset = nextOffset;
     }
     else if (!find_multixact_start(oldestMulti, &oldestOffset))
     {
         ereport(LOG,
                 (errmsg("oldest MultiXact %u not found, earliest MultiXact %u, skipping truncation",
                         oldestMulti, earliest)));
         LWLockRelease(MultiXactTruncationLock);
         return;
     }

Scenario 1: oldestMulti is 1 and nextMulti is 0. We should take the 
"there are NO MultiXacts" code path in that case, because 0 is skipped 
when assigning multixids, so 1 is the next multixid to be handed out. 
Instead, we call find_multixact_start with oldestMulti==1, which returns 
false because multixid 1 hasn't been assigned and the SLRU segment 
doesn't exist yet. There's a similar bug in SetOffsetVacuumLimit().
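
To make the off-by-one at wraparound concrete, here is a minimal standalone sketch. It is not the attached patch and not PostgreSQL source: the helper names (effective_next_multi, no_multixacts_naive, no_multixacts_wraparound_aware) are hypothetical, and the only behavior assumed is the one described above, namely that 0 is skipped when multixids are assigned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MultiXactId;

/* 0 is never handed out as a multixid; 1 is the first valid one. */
#define FirstMultiXactId ((MultiXactId) 1)

/*
 * Hypothetical helper: the multixid that the next assignment would
 * actually hand out, given the raw nextMulti counter. Right after
 * wraparound the counter can read 0, which is skipped.
 */
static MultiXactId
effective_next_multi(MultiXactId nextMulti)
{
    return (nextMulti < FirstMultiXactId) ? FirstMultiXactId : nextMulti;
}

/* The comparison as quoted above: misses the oldest==1, next==0 case. */
static bool
no_multixacts_naive(MultiXactId oldestMulti, MultiXactId nextMulti)
{
    return oldestMulti == nextMulti;
}

/* Sketch of a wraparound-aware comparison (an assumption, not the patch). */
static bool
no_multixacts_wraparound_aware(MultiXactId oldestMulti, MultiXactId nextMulti)
{
    assert(oldestMulti >= FirstMultiXactId);    /* oldest must be valid */
    return oldestMulti == effective_next_multi(nextMulti);
}
```

With oldestMulti = 1 and nextMulti = 0, the naive check reports that multixacts exist, while the wraparound-aware check treats the range as empty.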

Scenario 2: In scenario 1 we just fail to truncate the SLRUs and you get 
the log message. But I think there might be more serious variants of 
this. If the SLRU segment exists but the offset for multixid 1 hasn't 
been set yet, find_multixact_start() will instead succeed and report an 
offset of 0, and we will proceed with the truncation based on the 
incorrect oldestOffset==0 value, possibly removing SLRU segments that 
are still needed.
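
The difference between scenarios 1 and 2 can be modelled with a toy lookup that separates "segment missing" from "slot exists but is still zero". This is only an illustration under stated assumptions: ToyOffsetsPage, LookupResult, toy_find_multixact_start and toy_safe_to_truncate are hypothetical names, the array stands in for an offsets SLRU page, and an unwritten slot is assumed to read back as 0, as the description above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t MultiXactId;
typedef uint32_t MultiXactOffset;

typedef enum
{
    LOOKUP_FOUND,            /* slot holds a real starting offset */
    LOOKUP_SEGMENT_MISSING,  /* scenario 1: nothing on disk to read */
    LOOKUP_OFFSET_UNSET      /* scenario 2: slot exists but is still 0 */
} LookupResult;

/* Toy stand-in for an offsets page; offsets == NULL models a missing segment. */
typedef struct
{
    const MultiXactOffset *offsets;
    size_t      nentries;
} ToyOffsetsPage;

/* Hypothetical lookup that refuses to report an unset (zero) slot as valid. */
static LookupResult
toy_find_multixact_start(const ToyOffsetsPage *page, MultiXactId multi,
                         MultiXactOffset *result)
{
    assert(result != NULL);
    if (page->offsets == NULL || multi >= page->nentries)
        return LOOKUP_SEGMENT_MISSING;
    if (page->offsets[multi] == 0)
        return LOOKUP_OFFSET_UNSET;
    *result = page->offsets[multi];
    return LOOKUP_FOUND;
}

/* Truncation proceeds only when the oldest multixid's offset is known. */
static bool
toy_safe_to_truncate(const ToyOffsetsPage *page, MultiXactId oldestMulti,
                     MultiXactOffset *oldestOffset)
{
    return toy_find_multixact_start(page, oldestMulti, oldestOffset) == LOOKUP_FOUND;
}
```

In this model an unset slot never yields oldestOffset == 0, so the caller skips truncation in both the missing-segment and the unset-offset case.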

Attached is a fix for scenarios 1 and 2, and a test case for scenario 1.

Scenario 3: I also noticed that the above code isn't prepared for a race 
in which the offset corresponding to 'oldestMulti' hasn't been stored in 
the SLRU yet, even without wraparound. That could theoretically happen 
if the backend executing MultiXactIdCreateFromMembers() gets stuck for a 
long time between the calls to GetNewMultiXactId() and 
RecordNewMultiXact(). I think we're saved by the fact that we only 
create new multixids while holding a lock on a heap page, and a 
system-wide VACUUM FREEZE that would advance oldestMulti would need to 
lock that heap page too. It's scary though, because this could also lead 
to truncating away member SLRU segments that are still needed. The 
attached patch does *not* address this scenario.

[1] https://www.postgresql.org/message-id/CACG%3DezaApSMTjd%3DM2Sfn5Ucuggd3FG8Z8Qte8Xq9k5-%2BRQis-g@mail.gmail.com

- Heikki
