Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

From: Andres Freund
Subject: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Msg-id: 20140613181408.GA6763@awork2.anarazel.de
In reply to: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-bugs

On 2014-06-13 13:51:51 -0400, Alvaro Herrera wrote:
> Andres Freund wrote:
> > Hi,
> >
> > When upgrading a < 9.3 cluster pg_upgrade doesn't bother to keep the old
> > multixacts around because they won't be read after the upgrade (and
> > aren't compatible). It just resets the new cluster's nextMulti to the
> > old + 1.
> > Unfortunately that means that there'll be an offsets/0000 file created by
> > initdb around. Sounds harmless enough, but that'll actually cause
> > problems if the old cluster had a nextMulti that's bigger than that
> > page.
> >
> > When vac_truncate_clog() calls TruncateMultiXact() that'll scan
> > pg_multixact/offsets to find the earliest existing segment. That'll be
> > 0000. If the to-be-truncated data is older than the last existing
> > segment it returns. Then it'll try to determine the last required data
> > in members/ by accessing the oldest data in offsets/.
>
> I'm trying to understand the mechanism of this bug, and I'm not
> succeeding.  If the offsets/0000 was created by initdb, how come we try
> to delete a file that's not also members/0000?  I mean, surely the file
> as created by initdb is empty (zeroed).  In your sample error message
> downthread,

The bit you're missing is essentially the following in TruncateMultiXact():
    /*
     * Note we can't just plow ahead with the truncation; it's possible that
     * there are no segments to truncate, which is a problem because we are
     * going to attempt to read the offsets page to determine where to
     * truncate the members SLRU.  So we first scan the directory to determine
     * the earliest offsets page number that we can read without error.
     */
That check is thwarted by the stale 0000 segment: the directory scan
reports 0000 as the earliest existing page, so the early return is never
taken. The segment containing the oldestMXact has already been removed,
but we don't notice that that's the case.
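
To make the failure mode concrete, here is a deliberately simplified model of that guard. It is a sketch, not the code in multixact.c: every name in it (TruncResult, earliest_segment, model_truncate) is hypothetical, it works on segment numbers instead of pages, and it uses plain integer comparison where the real code has to be wraparound-aware. Segment 0x7B81 is taken from the error message; 0x7B90 is an invented later segment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes for the simplified model. */
typedef enum
{
    TRUNC_NOTHING_TO_DO,    /* guard fired, truncation skipped */
    TRUNC_OK,               /* needed segment exists and can be read */
    TRUNC_READ_ERROR        /* guard passed, but the segment is missing */
} TruncResult;

/* Smallest segment number present on disk, or -1 if there are none. */
static int
earliest_segment(const int *segs, size_t n)
{
    int     earliest = -1;

    for (size_t i = 0; i < n; i++)
        if (earliest == -1 || segs[i] < earliest)
            earliest = segs[i];
    return earliest;
}

/* True if the given segment number is present on disk. */
static bool
segment_exists(const int *segs, size_t n, int seg)
{
    for (size_t i = 0; i < n; i++)
        if (segs[i] == seg)
            return true;
    return false;
}

/*
 * Model of the guard: skip truncation if the segment we would need to read
 * is older than anything on disk; otherwise go ahead and read it.
 * Assumption: no wraparound, so plain "<" stands in for the real comparison.
 */
static TruncResult
model_truncate(const int *segs, size_t n, int needed_seg)
{
    int     earliest = earliest_segment(segs, n);

    if (earliest == -1 || needed_seg < earliest)
        return TRUNC_NOTHING_TO_DO;
    if (!segment_exists(segs, n, needed_seg))
        return TRUNC_READ_ERROR;
    return TRUNC_OK;
}
```

With segments {0x0000, 0x7B90} on disk and 0x7B81 needed, the model returns TRUNC_READ_ERROR, because 0000 makes the earliest segment look older than the one we need. With only {0x7B90} on disk the guard fires and it returns TRUNC_NOTHING_TO_DO.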

> ERROR: could not access status of transaction 2072053907
> DETAIL: Could not open file "pg_multixact/offsets/7B81": No such file or directory.
>
> what prompted the status of that multixid to be sought?  I see one
> possible path to this error message, which is SlruPhysicalReadPage().
> (There are other paths that lead to similar errors, but they use
> "transaction 0" instead, so we can rule those out; and we can rule out
> anything that uses MultiXactMemberCtl because of the path given in
> DETAIL.)

Same function:
    /*
     * First, compute the safe truncation point for MultiXactMember. This is
     * the starting offset of the multixact we were passed as MultiXactOffset
     * cutoff.
     */
    {
        int            pageno;
        int            slotno;
        int            entryno;
        MultiXactOffset *offptr;

        /* lock is acquired by SimpleLruReadPage_ReadOnly */

        pageno = MultiXactIdToOffsetPage(oldestMXact);
        entryno = MultiXactIdToOffsetEntry(oldestMXact);

        slotno = SimpleLruReadPage_ReadOnly(MultiXactOffsetCtl, pageno,
                                            oldestMXact);
        offptr = (MultiXactOffset *)
            MultiXactOffsetCtl->shared->page_buffer[slotno];
        offptr += entryno;
        oldestOffset = *offptr;

        LWLockRelease(MultiXactOffsetControlLock);
    }
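
As a cross-check on the error message, the arithmetic behind those two macros can be sketched standalone. This assumes the default build parameters (8 kB BLCKSZ, a 4-byte MultiXactOffset, 32 pages per SLRU segment); the helper names offset_page, offset_entry, offset_segment and offset_segment_name are hypothetical stand-ins, not the macros themselves.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults; a build with a different BLCKSZ gives other numbers. */
#define BLCKSZ                      8192
#define MULTIXACT_OFFSETS_PER_PAGE  ((uint32_t) (BLCKSZ / sizeof(uint32_t)))
#define SLRU_PAGES_PER_SEGMENT      32

/* Page within pg_multixact/offsets that holds this multixact's offset. */
static uint32_t
offset_page(uint32_t multi)
{
    return multi / MULTIXACT_OFFSETS_PER_PAGE;
}

/* Index of this multixact's entry within that page. */
static uint32_t
offset_entry(uint32_t multi)
{
    return multi % MULTIXACT_OFFSETS_PER_PAGE;
}

/* Segment number, i.e. the file under pg_multixact/offsets. */
static uint32_t
offset_segment(uint32_t multi)
{
    return offset_page(multi) / SLRU_PAGES_PER_SEGMENT;
}

/* Format the segment number as a four-digit uppercase hex file name. */
static void
offset_segment_name(uint32_t multi, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%04X", (unsigned int) offset_segment(multi));
}
```

For multixact 2072053907 this gives page 1011745, entry 147, and segment 0x7B81, which matches the "pg_multixact/offsets/7B81" path in the DETAIL line.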

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
