Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)

From Thomas Munro
Subject Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Date
Msg-id CAEepm=3C32VPJLOo45y0c3-3KWXNV2xM4jaPTSVjCRD2VG0Qgg@mail.gmail.com
In reply to Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-bugs
On Sat, May 9, 2015 at 2:46 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 8, 2015 at 9:55 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Thomas Munro wrote:
>>> I think the fix is something like "if nextMXact == oldestMultiXactId,
>>> then there are no active multixacts, so the offsetStopLimit should be
>>> set to nextOffset - (a segment's worth)".
>>
>> Makes sense.
>
> Here's a patch that attempts to implement this.
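
For reference, one possible reading of the rule quoted above, as a rough
sketch only (this is not the patch itself).  The function name, the
members-per-segment constant (default 8 kB block size assumed), the
rounding down to a segment boundary, and the non-empty branch are all
assumptions made for illustration.

```python
# Assumed for a default build: 1636 members per page, 32 pages per segment.
MEMBERS_PER_SEGMENT = 1636 * 32
OFFSET_SPACE = 2**32  # member offsets are 32-bit and wrap around


def offset_stop_limit(next_mxact, oldest_mxact, next_offset, oldest_offset):
    """Hypothetical helper: where member allocation has to stop."""
    if next_mxact == oldest_mxact:
        # No active multixacts: measure the gap back from nextOffset.
        base = next_offset
    else:
        # Assumed existing behaviour: measure from the oldest member offset.
        base = oldest_offset
    # Move back to the start of the segment, then leave one segment's gap.
    base -= base % MEMBERS_PER_SEGMENT
    return (base - MEMBERS_PER_SEGMENT) % OFFSET_SPACE


print(offset_stop_limit(7, 7, 5 * MEMBERS_PER_SEGMENT + 10, 0))
```

The modulo on the last line is there so that the limit wraps around the
32-bit offset space when the base falls in segment 0000.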

Thanks.  I think I have managed to reproduce something like the data
loss race that we were speculating about.

0.  initdb, autovacuum = off, set up explode_mxact_members.c as
described elsewhere in the thread.
1.  Fill up the members SLRU completely (i.e. reach the state where you
can no longer create a new multixact of any size).  pg_multixact/members
contains 82040 files and the last one is named 14077.
2.  Issue CHECKPOINT, but use a debugger to stop inside
TruncateMultiXact after it has read
MultiXactState->lastCheckpointedOldest and released the lock, but
before it calls SlruScanDirectory to delete files...
3.  Run VACUUM FREEZE in all databases (including template0).  datminmxid moves.
4.  Create lots of new multixacts.  pg_multixact/members now contains
82041 files and the last one is named 14078 (i.e. one extra segment,
with the highest possible segment number, which couldn't be created
before vacuuming because of the one-segment gap enforced by
DetermineSafeOldestOffset).  Segments 0000-0016 have new modification
times.
5.  ... allow the checkpoint started in step 2 to continue.  It
deletes segments, keeping only 0000-0016.  Segment 14078, which
contained active member data, has been incorrectly deleted.
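
The file counts and segment names in the steps above can be checked with
a little arithmetic.  This is only a sketch: the members-per-page and
pages-per-segment figures are assumed values for a default build (8 kB
blocks) and are not stated in this message, and the helper names are
made up for illustration.

```python
# Assumed constants for a default build; not taken from this message.
MEMBERS_PER_PAGE = 1636
PAGES_PER_SEGMENT = 32
MEMBERS_PER_SEGMENT = MEMBERS_PER_PAGE * PAGES_PER_SEGMENT

MAX_OFFSET = 2**32 - 1  # member offsets are 32-bit


def segment_name(segno):
    """Segment files are named with the segment number in hex."""
    return format(segno, "04X")


def segment_number(name):
    return int(name, 16)


# Highest segment number the 32-bit offset space can reach.
highest_segment = MAX_OFFSET // MEMBERS_PER_SEGMENT
print(segment_name(highest_segment))   # 14078

# Files 0000..14077 and 0000..14078, counted inclusively.
print(segment_number("14077") + 1)     # 82040
print(segment_number("14078") + 1)     # 82041

# Segments 0000..0016 kept in step 5.
print(segment_number("0016") + 1)      # 23
```

Under those assumptions 14078 is indeed the last possible segment name,
which is consistent with it only appearing once the one-segment gap had
moved.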

--
Thomas Munro
http://www.enterprisedb.com
