Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

From Thomas Munro
Subject Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date
Msg-id CAEepm=2Sp4RiOFTXOz-R336YioOq-yrFbtOXu2=1TvwFrv+hCQ@mail.gmail.com
In response to Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Robert Haas <robertmhaas@gmail.com>)
Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Wed, Jun 3, 2015 at 3:42 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Thomas Munro wrote:
>> On Tue, Jun 2, 2015 at 9:30 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> > My guess is that the file existed, and perhaps had one or more pages,
>> > but the wanted page doesn't exist, so we tried to read but got 0 bytes
>> > back.  read() returns 0 in this case but doesn't set errno.
>> >
>> > I didn't find a way to set things so that the file exists but is of
>> > shorter contents than oldestMulti by the time the checkpoint record is
>> > replayed.
>>
>> I'm just starting to learn about the recovery machinery, so forgive me
>> if I'm missing something basic here, but I just don't get this.  As I
>> understand it, offsets/0046 should either have been copied with that
>> page present in it if it existed before the backup started (apparently
>> not in this case), or extended to contain it by WAL records that come
>> after the backup label but before the checkpoint record that
>> references it (also apparently not in this case).
>
> Exactly --- that's the spot at which I am, also.  I have had this
> spinning in my head for three days now, and tried every single variation
> that I could think of, but like you I was unable to reproduce the issue.
> However, our customer took a second base backup and it failed in exactly
> the same way, modulo some changes to the counters (the file that
> didn't exist was 004B rather than 0046).  I'm still at a loss at what
> the failure mode is.  We must be missing some crucial detail ...

I have finally reproduced that error!  See attached repro shell script.

The conditions are:

1.  next multixact == oldest multixact (no active multixacts, pointing
past the end)
2.  next multixact would be the first item on a new page (multixact % 2048 == 0)
3.  the page must not be the first in a segment (or we'd get the
read-zeroes case)
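
To make the three conditions concrete, here is a minimal Python sketch of the arithmetic involved. It is not PostgreSQL code: the function names are hypothetical, the 2048 entries per page come from condition 2, and the 32 pages per segment are inferred from the 31/32 factor used in the odds estimate. It assumes segment files are named with four hex digits (as in offsets/0046) and ignores multixact ID wraparound.

```python
OFFSETS_PER_PAGE = 2048    # entries per offsets page, from "multixact % 2048 == 0"
PAGES_PER_SEGMENT = 32     # pages per segment file, inferred from the 31/32 factor


def offsets_location(multi):
    """Map a multixact ID to (segment file name, page within segment, entry within page)."""
    page = multi // OFFSETS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    return "%04X" % segment, page % PAGES_PER_SEGMENT, multi % OFFSETS_PER_PAGE


def hits_failure_case(oldest_multi, next_multi):
    """Return True when all three conditions listed above hold (hypothetical helper)."""
    if oldest_multi != next_multi:          # condition 1: no active multixacts
        return False
    _, page_in_segment, entry = offsets_location(next_multi)
    first_on_new_page = (entry == 0)        # condition 2
    not_first_page = (page_in_segment != 0) # condition 3
    return first_on_new_page and not_first_page
```

For example, a multixact ID of 0x46 * 65536 + 2048 maps to the second page of segment file 0046, and hits_failure_case returns True when both oldest and next point at it.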

That gives you odds of 1/2048 * 31/32 * (probability of a wraparound
vacuum followed by no multixact creations right before your backup
checkpoint).  That seems like reasonably low odds... if it happened
twice in a row, maybe I'm missing something here and there is some
other way to get this...
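
As a quick check of the structural part of that estimate, the following Python sketch counts, within one segment's worth of multixact IDs, how many satisfy conditions 2 and 3. The page and segment sizes are the same assumptions as in the figures above (2048 entries per page, 32 pages per segment); the workload-dependent probability in the parentheses is not modelled.

```python
from fractions import Fraction

OFFSETS_PER_PAGE = 2048    # from condition 2
PAGES_PER_SEGMENT = 32     # inferred from the 31/32 factor
IDS_PER_SEGMENT = OFFSETS_PER_PAGE * PAGES_PER_SEGMENT

# Count IDs that start a new page which is not the first page of its segment.
matching = sum(
    1
    for multi in range(IDS_PER_SEGMENT)
    if multi % OFFSETS_PER_PAGE == 0
    and (multi // OFFSETS_PER_PAGE) % PAGES_PER_SEGMENT != 0
)

structural_odds = Fraction(matching, IDS_PER_SEGMENT)
product_form = Fraction(1, 2048) * Fraction(31, 32)

print(structural_odds, float(structural_odds))
```

Both forms come to 31/65536, roughly 1 in 2114, before multiplying by the probability of the vacuum-then-idle timing.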

I realise now that this is actually a symptom of a problem spotted by
Noah recently:

http://www.postgresql.org/message-id/20150601045534.GB23587@tornado.leadboat.com

He noticed the problem for segment boundaries, when not in recovery.
In recovery, segment boundaries don't raise an error (the read-zeroes
case applies), but page boundaries do.  The fix is probably to do
nothing if they are the same, as we do elsewhere, like in the attached
patch.
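
The idea of that guard can be illustrated with a toy Python model. This is not the attached patch and not PostgreSQL code: the dictionary of pages, the exception, and both function names are hypothetical stand-ins, and the sketch only shows why skipping the lookup when oldest == next avoids touching a page that has not been created yet.

```python
OFFSETS_PER_PAGE = 2048  # entries per offsets page, as in condition 2


class MissingPage(Exception):
    """Stand-in for the 'could not access status of transaction' failure."""


def read_offset(pages, multi):
    """Look up a multixact's offset in a toy page store keyed by page number."""
    page = multi // OFFSETS_PER_PAGE
    if page not in pages:
        raise MissingPage("page %d does not exist" % page)
    return pages[page][multi % OFFSETS_PER_PAGE]


def oldest_offset(pages, oldest_multi, next_multi):
    """Return the oldest multixact's offset, or None when there is nothing to read."""
    if oldest_multi == next_multi:
        # No multixacts exist, so oldest points past the end: do nothing.
        return None
    return read_offset(pages, oldest_multi)


# Only page 0 exists; multixact 2048 would be the first entry of page 1.
pages = {0: list(range(OFFSETS_PER_PAGE))}
print(oldest_offset(pages, 2048, 2048))
```

Without the guard, read_offset(pages, 2048) raises MissingPage, which mirrors the page-boundary case described above.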

--
Thomas Munro
http://www.enterprisedb.com
