Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

From: Alvaro Herrera
Subject: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Msg-id: 20140613175151.GN18688@eldon.alvh.no-ip.org
In response to: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-bugs
Andres Freund wrote:
> Hi,
>
> When upgrading a < 9.3 cluster pg_upgrade doesn't bother to keep the old
> multixacts around because they won't be read after the upgrade (and
> aren't compatible). It just resets the new cluster's nextMulti to the
> old + 1.
> Unfortunately that means that there'll be an offsets/0000 file created by
> initdb around. Sounds harmless enough, but that'll actually cause
> problems if the old cluster had a nextMulti that's bigger than that
> page.
>
> When vac_truncate_clog() calls TruncateMultiXact() that'll scan
> pg_multixact/offsets to find the earliest existing segment. That'll be
> 0000. If the to-be-truncated data is older than the last existing
> segment it returns. Then it'll try to determine the last required data
> in members/ by accessing the oldest data in offsets/.

I'm trying to understand the mechanism of this bug, and I'm not
succeeding.  If the offsets/0000 was created by initdb, how come we try
to delete a file that's not also members/0000?  I mean, surely the file
as created by initdb is empty (zeroed).  In your sample error message
downthread,

ERROR: could not access status of transaction 2072053907
DETAIL: Could not open file "pg_multixact/offsets/7B81": No such file or directory.

what prompted the status of that multixid to be sought?  I see one
possible path to this error message, which is SlruPhysicalReadPage().
(There are other paths that lead to similar errors, but they use
"transaction 0" instead, so we can rule those out; and we can rule out
anything that uses MultiXactMemberCtl because of the path given in
DETAIL.)
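
For reference, the file name in DETAIL follows from the multixid by plain arithmetic. The sketch below assumes a default build: 8 kB blocks, 4-byte entries in offsets/, and 32 pages per SLRU segment. The helper names are made up for illustration and are not the ones used in multixact.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults; a build with a non-default block size would differ. */
#define BLCKSZ                      8192
#define SLRU_PAGES_PER_SEGMENT      32
#define MULTIXACT_OFFSETS_PER_PAGE  (BLCKSZ / sizeof(uint32_t))

/* Hypothetical helper: offsets page that holds the entry for a multixid. */
static uint32_t
offsets_page_for(uint32_t multi)
{
    return (uint32_t) (multi / MULTIXACT_OFFSETS_PER_PAGE);
}

/* Hypothetical helper: segment number (i.e. file) containing that page. */
static uint32_t
offsets_segment_for(uint32_t multi)
{
    return offsets_page_for(multi) / SLRU_PAGES_PER_SEGMENT;
}

/* Hypothetical helper: build the relative path as it appears in DETAIL. */
static void
offsets_file_name(char *buf, size_t len, uint32_t multi)
{
    snprintf(buf, len, "pg_multixact/offsets/%04X",
             (unsigned int) offsets_segment_for(multi));
}
```

Under those assumptions, 2072053907 / 2048 is page 1011745, and 1011745 / 32 is segment 31617, which is 7B81 in hexadecimal. So the path in DETAIL is consistent with a lookup of that multixid's entry in offsets/.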

There are four callsites that lead to that:

RecordNewMultiXact
GetMultiXactIdMembers (2x)
TrimMultiXact

Of those, only GetMultiXactIdMembers is likely to be called from vacuum
(actually RecordNewMultiXact can too, in a few cases, if it happens to
freeze a multi by creating another multi; should be pretty rare).
But you were talking about vacuum truncating pg_multixact -- and I don't
see how that's related to these functions.

Is it possible that you pasted the wrong error message?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
