Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
From | Andres Freund |
---|---|
Subject | Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts |
Date | |
Msg-id | 20140613181408.GA6763@awork2.anarazel.de |
In reply to | Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts (Alvaro Herrera <alvherre@2ndquadrant.com>) |
List | pgsql-bugs |
On 2014-06-13 13:51:51 -0400, Alvaro Herrera wrote:
> Andres Freund wrote:
> > Hi,
> >
> > When upgrading a < 9.3 cluster pg_upgrade doesn't bother to keep the old
> > multixacts around because they won't be read after the upgrade (and
> > aren't compatible). It just resets the new cluster's nextMulti to the
> > old + 1.
> > Unfortunately that means that there'll be a offsets/0000 file created by
> > initdb around. Sounds harmless enough, but that'll actually cause
> > problems if the old cluster had a nextMulti that's bigger than that
> > page.
> >
> > When vac_truncate_clog() calls TruncateMultiXact() that'll scan
> > pg_multixact/offsets to find the earliest existing segment. That'll be
> > 0000. If the to-be-truncated data is older than the last existing
> > segment it returns. Then it'll try to determine the last required data
> > in members/ by accessing the oldest data in offsets/.
>
> I'm trying to understand the mechanism of this bug, and I'm not
> succeeding. If the offset/0000 was created by initdb, how come we try
> to delete a file that's not also members/0000? I mean, surely the file
> as created by initdb is empty (zeroed). In your sample error message
> downthread,

The bit you're missing is essentially the following in TruncateMultiXact():

    /*
     * Note we can't just plow ahead with the truncation; it's possible that
     * there are no segments to truncate, which is a problem because we are
     * going to attempt to read the offsets page to determine where to
     * truncate the members SLRU. So we first scan the directory to determine
     * the earliest offsets page number that we can read without error.
     */

That check is thwarted due to the 0000 segment. So the segment
containing the oldestMXact has already been removed, but we don't notice
that that's the case.

> ERROR: could not access status of transaction 2072053907
> DETAIL: Could not open file "pg_multixact/offsets/7B81": No such file or directory.
>
> what prompted the status of that multixid to be sought? I see one
> possible path to this error message, which is SlruPhysicalReadPage().
> (There are other paths that lead to similar errors, but they use
> "transaction 0" instead, so we can rule those out; and we can rule out
> anything that uses MultiXactMemberCtl because of the path given in
> DETAIL.)

Same function:

    /*
     * First, compute the safe truncation point for MultiXactMember. This is
     * the starting offset of the multixact we were passed as MultiXactOffset
     * cutoff.
     */
    {
        int         pageno;
        int         slotno;
        int         entryno;
        MultiXactOffset *offptr;

        /* lock is acquired by SimpleLruReadPage_ReadOnly */

        pageno = MultiXactIdToOffsetPage(oldestMXact);
        entryno = MultiXactIdToOffsetEntry(oldestMXact);

        slotno = SimpleLruReadPage_ReadOnly(MultiXactOffsetCtl, pageno,
                                            oldestMXact);
        offptr = (MultiXactOffset *)
            MultiXactOffsetCtl->shared->page_buffer[slotno];
        offptr += entryno;
        oldestOffset = *offptr;

        LWLockRelease(MultiXactOffsetControlLock);
    }

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
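
Editorial sketch (not part of the original message): the mechanism described above can be modelled in a few lines of standalone C. This is a simplified illustration, not the PostgreSQL source. It assumes the default 8 kB block size, 4-byte offsets and 32 pages per SLRU segment. All function names are hypothetical, and segment 7B90 in the usage note is an invented example of a segment created after the upgrade. The directory scan here takes a plain minimum and does not model wraparound.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults: 8 kB blocks, 4-byte offsets, 32 pages per SLRU segment. */
#define OFFSETS_PER_PAGE   (8192 / 4)
#define PAGES_PER_SEGMENT  32
#define FIRST_MULTIXACT_ID 1u

/* Wraparound-aware "a is older than b", in the style of MultiXactIdPrecedes(). */
static int mxid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Segment number of the offsets file that holds a given multixact ID. */
static unsigned offsets_segment(uint32_t mxid)
{
    return mxid / OFFSETS_PER_PAGE / PAGES_PER_SEGMENT;
}

/* Format the segment file name as it appears under pg_multixact/offsets. */
static void offsets_segment_name(uint32_t mxid, char *buf, size_t len)
{
    snprintf(buf, len, "%04X", offsets_segment(mxid));
}

/* Hypothetical stand-in for the directory scan: first multixact ID covered
 * by the lowest-numbered existing segment (plain minimum, no wraparound). */
static uint32_t earliest_existing_mxid(const unsigned *segs, int nsegs)
{
    assert(nsegs > 0);
    unsigned min = segs[0];
    for (int i = 1; i < nsegs; i++)
        if (segs[i] < min)
            min = segs[i];
    uint32_t earliest = (uint32_t) min * PAGES_PER_SEGMENT * OFFSETS_PER_PAGE;
    return earliest < FIRST_MULTIXACT_ID ? FIRST_MULTIXACT_ID : earliest;
}

/* Returns 1 when the "nothing to do" early exit does NOT fire, i.e. the
 * truncation code goes on to read the offsets page of the oldest multixact. */
static int proceeds_to_read(uint32_t oldest, const unsigned *segs, int nsegs)
{
    return !mxid_precedes(oldest, earliest_existing_mxid(segs, nsegs));
}

/* Returns 1 when the given segment is present in the listing. */
static int segment_exists(unsigned seg, const unsigned *segs, int nsegs)
{
    for (int i = 0; i < nsegs; i++)
        if (segs[i] == seg)
            return 1;
    return 0;
}
```

With a listing of {0000, 7B90} and an oldest multixact of 2072053907, proceeds_to_read() returns 1 even though segment 7B81 is absent, which corresponds to the failing read in the quoted error. With only {7B90} present, the early exit fires and no read is attempted.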