Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)

From Thomas Munro
Subject Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Date
Msg-id CAEepm=1XGJVijxqG2EE=3Tb2bbrQRTvnXA6vZN1FkOZNtH=Lqw@mail.gmail.com
In reply to Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
On Fri, May 8, 2015 at 6:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> 1. The members SLRU is full all the way up to offsetStopLimit.
> 2. A checkpoint occurs, reaching MultiXactSetSafeTruncate(), which
> sets lastCheckpointedOldest.
> 3. Vacuum runs, calling SetMultiXactIdLimit(), calling
> DetermineSafeOldestOffset(), advancing
> MultiXactState->offsetStopLimit.
> 4. Since offsetStopLimit > lastCheckpointedOffset, it's now possible
> for someone to consume an MXID greater than offsetStopLimit, making
> MultiXactState->nextOffset > lastCheckpointedOffset
> 5. The checkpoint from step 1, continuing on its merry way, now calls
> TruncateMultiXact(), which sets rangeEnd > rangeStart and blows away
> nearly every file in the SLRU.

I am still working on reproducing this race scenario in various ways,
including the way you described, but I kept getting stuck at step 4,
unable to create new multixacts despite having vacuum-frozen all
databases (including template0) and advanced the cluster minimum
mxid.

I think I see why, and I think it's a bug:  if you vacuum freeze all
your databases, MultiXactState->oldestMultiXactId ends up equal to
MultiXactState->nextMXact.  But that's not actually a multixact that
exists yet, so when DetermineSafeOldestOffset calls
find_multixact_start, it reads a garbage offset (all zeros in practice,
since pages start out zeroed) and produces a garbage value for
offsetStopLimit.  Depending on where your nextOffset happens to be at
the time, that value might incorrectly stop you from creating any
more multixacts even though member space is entirely empty.  I think
the fix is something like "if nextMXact == oldestMultiXactId, then
there are no active multixacts, so the offsetStopLimit should be set
to nextOffset - (a segment's worth)".
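
To make the proposed special case concrete, here is a minimal sketch in C.
It is a toy model and not the code in multixact.c: ToyMultiXactState,
toy_offset_stop_limit, toy_zeroed_page_lookup and the MEMBERS_PER_SEGMENT
value are all made up for illustration, and the lookup callback only stands
in for what find_multixact_start does.  The non-empty branch is my assumption
about the general shape of the calculation; only the empty branch comes from
the suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t MultiXactId;
typedef uint32_t MultiXactOffset;

/* Hypothetical "segment's worth" of members; NOT the real constant. */
#define MEMBERS_PER_SEGMENT 1000u

/* Toy subset of the shared state fields named in this thread. */
typedef struct
{
    MultiXactId     oldestMultiXactId;
    MultiXactId     nextMXact;
    MultiXactOffset nextOffset;
} ToyMultiXactState;

/* Stand-in for find_multixact_start(): start offset of an existing multixact. */
typedef MultiXactOffset (*ToyOffsetLookup) (MultiXactId multi);

/* Counts lookups so a caller can check whether the offsets page was read. */
int toy_lookup_calls = 0;

/* Simulates reading an entry from a page that is still all zeros. */
MultiXactOffset
toy_zeroed_page_lookup(MultiXactId multi)
{
    (void) multi;
    toy_lookup_calls++;
    return 0;
}

MultiXactOffset
toy_offset_stop_limit(const ToyMultiXactState *state, ToyOffsetLookup lookup)
{
    MultiXactOffset oldestOffset;

    assert(state != NULL && lookup != NULL);

    if (state->nextMXact == state->oldestMultiXactId)
    {
        /*
         * No multixact exists yet at oldestMultiXactId, so looking up its
         * offset would read garbage.  Use nextOffset instead.
         */
        oldestOffset = state->nextOffset;
    }
    else
    {
        /* Assumed general case: ask where the oldest multixact starts. */
        oldestOffset = lookup(state->oldestMultiXactId);
    }

    /* Back off by one segment; unsigned wraparound is intentional. */
    return oldestOffset - MEMBERS_PER_SEGMENT;
}
```

With oldestMultiXactId == nextMXact == 100 and nextOffset == 5000, the sketch
returns 4000 and never calls the lookup, so the zeroed page is not consulted
at all.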

--
Thomas Munro
http://www.enterprisedb.com
