Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Дата
Msg-id 20150603180146.GB133018@postgresql.org
обсуждение исходный текст
Ответ на Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Andres Freund <andres@anarazel.de>)
Ответы Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Andres Freund wrote:
> On 2015-06-03 00:42:55 -0300, Alvaro Herrera wrote:
> > Thomas Munro wrote:
> > > On Tue, Jun 2, 2015 at 9:30 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > > > My guess is that the file existed, and perhaps had one or more pages,
> > > > but the wanted page doesn't exist, so we tried to read but got 0 bytes
> > > > back.  read() returns 0 in this case but doesn't set errno.
> > > >
> > > > I didn't find a way to set things so that the file exists but is of
> > > > shorter contents than oldestMulti by the time the checkpoint record is
> > > > replayed.
> > >
> > > I'm just starting to learn about the recovery machinery, so forgive me
> > > if I'm missing something basic here, but I just don't get this.  As I
> > > understand it, offsets/0046 should either have been copied with that
> > > page present in it if it existed before the backup started (apparently
> > > not in this case), or extended to contain it by WAL records that come
> > > after the backup label but before the checkpoint record that
> > > references it (also apparently not in this case).
>
> That's not necessarily the case though, given how the code currently
> works. In a bunch of places the SLRUs are accessed *before* having been
> made consistent by WAL replay. Especially if several checkpoints/vacuums
> happened during the base backup the assumed state (i.e. the mxacts
> checkpoints refer to) of the data directory soon after the initial
> start, and the state of pg_multixact/ won't necessarily match at all.

Well, the log extract is quite simple; it says that as soon as the
standby starts replay WAL, it fetches exactly one WAL segment, which
contains exactly one online checkpoint record; this record contains
oldestMulti=4624559, which belongs in offset file 0046.  But the
basebackup only contains files starting from 004B onwards.  The base
backup takes a long time, so my guess of what happened is that the
backup label points to the start of the WAL stream (obviously), then the
data files are copied to the standby; during this long process, a
multixact truncation happens.  So by the time the base backup reaches
the pg_multixact directory, the 0046 file has been removed.

2015-05-27 16:15:17 UTC [4782]: [3-1] user=,db= LOG: entering standby mode
2015-05-27 16:15:18 UTC [4782]: [4-1] user=,db= LOG: restored log file "00000001000073DD000000AD" from archive
2015-05-27 16:15:18 UTC [4782]: [5-1] user=,db= FATAL: could not access status of transaction 4624559
2015-05-27 16:15:18 UTC [4782]: [6-1] user=,db= DETAIL: Could not read from file "pg_multixact/offsets/0046" at offset
147456:Success. 
2015-05-27 16:15:18 UTC [4778]: [4-1] user=,db= LOG: startup process (PID 4782) exited with exit code 1
2015-05-27 16:15:18 UTC [4778]: [5-1] user=,db= LOG: aborting startup due to startup process failure

One idea I had was: what if the oldestMulti pointed to another multi
earlier in the same 0046 file, so that it is read-as-zeroes (and the
file is created), and then a subsequent multixact truncate tries to read
a later page in the file.  In SlruPhysicalReadPage() this would give a
change for open() to not fail, and then read() can fail as above.
However, we would have an earlier LOG message about "reading as zeroes".

Really, the whole question of how this code goes past the open() failure
in SlruPhysicalReadPage baffles me.  I don't see any possible way for
the file to be created ...

> > Exactly --- that's the spot at which I am, also.  I have had this
> > spinning in my head for three days now, and tried every single variation
> > that I could think of, but like you I was unable to reproduce the issue.
> > However, our customer took a second base backup and it failed in exactly
> > the same way, module some changes to the counters (the file that
> > didn't exist was 004B rather than 0046).  I'm still at a loss at what
> > the failure mode is.  We must be missing some crucial detail ...
>
> I might have missed it in this already long thread. Could you share a
> bunch of details about hte case? It'd be very interesting to see the
> contents of the backup label (to see where start/end are), the contents
> of the initial checkpoint (to see which mxacts we assume to exist at
> start) and what the initial contents of pg_multixact are (to match up).

pg_xlogdump says about the checkpoint record:

rmgr: XLOG        len (rec/tot):     72/   104, tx:          0, lsn: 73DD/AD834470, prev 73DD/AD8343B8, bkp: 0000,
desc:checkpoint: redo 73DD/4003F648; tli 1; prev tli 1; fpw true; xid 1/3530575042; oid 81303440; multi 14003656;
offset66311341; oldest xid 2530472730 in DB 17081243; oldest multi 4624559 in DB 17081243; oldest running xid
3530551992;online 

I don't have the backup label.  But a listing of the files in
pg_multixact/offsets in the master has this:

./offsets:
total 35816
drwx------ 2 postgres postgres 4096 May 27 16:25 .
drwx------ 4 postgres postgres 4096 Jan 3 03:32 ..
-rw------- 1 postgres postgres 262144 Apr 20 13:07 004C
-rw------- 1 postgres postgres 262144 Apr 20 14:36 004D
-rw------- 1 postgres postgres 262144 Apr 20 14:39 004E
-rw------- 1 postgres postgres 262144 Apr 20 14:43 004F
-rw------- 1 postgres postgres 262144 Apr 20 15:06 0050
-rw------- 1 postgres postgres 262144 Apr 20 15:08 0051
-rw------- 1 postgres postgres 262144 Apr 20 15:11 0052
-rw------- 1 postgres postgres 262144 Apr 20 15:15 0053
-rw------- 1 postgres postgres 262144 Apr 20 15:20 0054
-rw------- 1 postgres postgres 262144 Apr 20 15:23 0055
-rw------- 1 postgres postgres 262144 Apr 20 15:25 0056
-rw------- 1 postgres postgres 262144 Apr 20 15:28 0057
-rw------- 1 postgres postgres 262144 Apr 20 15:31 0058
-rw------- 1 postgres postgres 262144 Apr 20 15:33 0059
-rw------- 1 postgres postgres 262144 Apr 20 15:38 005A
-rw------- 1 postgres postgres 262144 Apr 20 15:39 005B
-rw------- 1 postgres postgres 262144 Apr 20 15:41 005C
-rw------- 1 postgres postgres 262144 Apr 20 15:42 005D
-rw------- 1 postgres postgres 262144 Apr 20 15:47 005E
-rw------- 1 postgres postgres 262144 Apr 20 16:12 005F
-rw------- 1 postgres postgres 262144 Apr 20 16:41 0060
-rw------- 1 postgres postgres 262144 Apr 20 16:50 0061
-rw------- 1 postgres postgres 262144 Apr 20 17:51 0062
-rw------- 1 postgres postgres 262144 Apr 20 19:23 0063
-rw------- 1 postgres postgres 262144 Apr 20 22:08 0064
-rw------- 1 postgres postgres 262144 Apr 21 07:29 0065
-rw------- 1 postgres postgres 262144 Apr 21 11:43 0066
-rw------- 1 postgres postgres 262144 Apr 21 14:35 0067
-rw------- 1 postgres postgres 262144 Apr 21 16:02 0068
-rw------- 1 postgres postgres 262144 Apr 21 21:51 0069
-rw------- 1 postgres postgres 262144 Apr 22 08:40 006A
-rw------- 1 postgres postgres 262144 Apr 22 12:18 006B
-rw------- 1 postgres postgres 262144 Apr 22 16:22 006C
-rw------- 1 postgres postgres 262144 Apr 23 00:01 006D
-rw------- 1 postgres postgres 262144 Apr 23 09:08 006E
-rw------- 1 postgres postgres 262144 Apr 23 13:28 006F
-rw------- 1 postgres postgres 262144 Apr 23 17:35 0070
-rw------- 1 postgres postgres 262144 Apr 24 06:37 0071
-rw------- 1 postgres postgres 262144 Apr 24 12:27 0072
-rw------- 1 postgres postgres 262144 Apr 24 16:27 0073
-rw------- 1 postgres postgres 262144 Apr 25 06:37 0074
-rw------- 1 postgres postgres 262144 Apr 25 17:18 0075
-rw------- 1 postgres postgres 262144 Apr 26 15:05 0076
-rw------- 1 postgres postgres 262144 Apr 27 07:48 0077
-rw------- 1 postgres postgres 262144 Apr 27 13:11 0078
-rw------- 1 postgres postgres 262144 Apr 27 18:13 0079
-rw------- 1 postgres postgres 262144 Apr 28 08:41 007A
-rw------- 1 postgres postgres 262144 Apr 28 14:36 007B
-rw------- 1 postgres postgres 262144 Apr 29 00:34 007C
-rw------- 1 postgres postgres 262144 Apr 29 09:44 007D
-rw------- 1 postgres postgres 262144 Apr 29 14:49 007E
-rw------- 1 postgres postgres 262144 Apr 29 19:53 007F
-rw------- 1 postgres postgres 262144 Apr 30 08:43 0080
-rw------- 1 postgres postgres 262144 Apr 30 12:41 0081
-rw------- 1 postgres postgres 262144 Apr 30 15:16 0082
-rw------- 1 postgres postgres 262144 Apr 30 15:21 0083
-rw------- 1 postgres postgres 262144 Apr 30 17:15 0084
-rw------- 1 postgres postgres 262144 May 1 07:45 0085
-rw------- 1 postgres postgres 262144 May 1 17:06 0086
-rw------- 1 postgres postgres 262144 May 2 08:37 0087
-rw------- 1 postgres postgres 262144 May 3 00:14 0088
-rw------- 1 postgres postgres 262144 May 3 18:46 0089
-rw------- 1 postgres postgres 262144 May 4 06:33 008A
-rw------- 1 postgres postgres 262144 May 4 09:29 008B
-rw------- 1 postgres postgres 262144 May 4 13:42 008C
-rw------- 1 postgres postgres 262144 May 4 18:08 008D
-rw------- 1 postgres postgres 262144 May 5 06:56 008E
-rw------- 1 postgres postgres 262144 May 5 11:06 008F
-rw------- 1 postgres postgres 262144 May 5 14:47 0090
-rw------- 1 postgres postgres 262144 May 5 19:39 0091
-rw------- 1 postgres postgres 262144 May 6 07:36 0092
-rw------- 1 postgres postgres 262144 May 6 10:30 0093
-rw------- 1 postgres postgres 262144 May 6 15:23 0094
-rw------- 1 postgres postgres 262144 May 6 22:18 0095
-rw------- 1 postgres postgres 262144 May 7 08:22 0096
-rw------- 1 postgres postgres 262144 May 7 11:31 0097
-rw------- 1 postgres postgres 262144 May 7 14:18 0098
-rw------- 1 postgres postgres 262144 May 7 16:34 0099
-rw------- 1 postgres postgres 262144 May 7 18:39 009A
-rw------- 1 postgres postgres 262144 May 7 21:42 009B
-rw------- 1 postgres postgres 262144 May 8 07:57 009C
-rw------- 1 postgres postgres 262144 May 8 12:34 009D
-rw------- 1 postgres postgres 262144 May 8 17:53 009E
-rw------- 1 postgres postgres 262144 May 9 07:22 009F
-rw------- 1 postgres postgres 262144 May 9 16:42 00A0
-rw------- 1 postgres postgres 262144 May 10 08:43 00A1
-rw------- 1 postgres postgres 262144 May 10 19:23 00A2
-rw------- 1 postgres postgres 262144 May 11 08:20 00A3
-rw------- 1 postgres postgres 262144 May 11 08:48 00A4
-rw------- 1 postgres postgres 262144 May 11 12:23 00A5
-rw------- 1 postgres postgres 262144 May 11 15:36 00A6
-rw------- 1 postgres postgres 262144 May 11 21:24 00A7
-rw------- 1 postgres postgres 262144 May 12 08:15 00A8
-rw------- 1 postgres postgres 262144 May 12 09:20 00A9
-rw------- 1 postgres postgres 262144 May 12 09:42 00AA
-rw------- 1 postgres postgres 262144 May 12 11:04 00AB
-rw------- 1 postgres postgres 262144 May 12 11:06 00AC
-rw------- 1 postgres postgres 262144 May 12 11:09 00AD
-rw------- 1 postgres postgres 262144 May 12 11:26 00AE
-rw------- 1 postgres postgres 262144 May 12 11:26 00AF
-rw------- 1 postgres postgres 262144 May 12 11:31 00B0
-rw------- 1 postgres postgres 262144 May 12 12:06 00B1
-rw------- 1 postgres postgres 262144 May 12 12:37 00B2
-rw------- 1 postgres postgres 262144 May 12 13:18 00B3
-rw------- 1 postgres postgres 262144 May 12 14:34 00B4
-rw------- 1 postgres postgres 262144 May 12 14:29 00B5
-rw------- 1 postgres postgres 262144 May 12 14:44 00B6
-rw------- 1 postgres postgres 262144 May 12 15:16 00B7
-rw------- 1 postgres postgres 262144 May 12 15:26 00B8
-rw------- 1 postgres postgres 262144 May 12 15:35 00B9
-rw------- 1 postgres postgres 262144 May 12 15:47 00BA
-rw------- 1 postgres postgres 262144 May 12 16:02 00BB
-rw------- 1 postgres postgres 262144 May 12 16:18 00BC
-rw------- 1 postgres postgres 262144 May 12 17:00 00BD
-rw------- 1 postgres postgres 262144 May 12 18:48 00BE
-rw------- 1 postgres postgres 262144 May 12 19:59 00BF
-rw------- 1 postgres postgres 262144 May 12 21:39 00C0
-rw------- 1 postgres postgres 262144 May 13 05:31 00C1
-rw------- 1 postgres postgres 262144 May 13 07:35 00C2
-rw------- 1 postgres postgres 262144 May 13 08:16 00C3
-rw------- 1 postgres postgres 262144 May 13 08:30 00C4
-rw------- 1 postgres postgres 262144 May 13 09:00 00C5
-rw------- 1 postgres postgres 262144 May 13 09:19 00C6
-rw------- 1 postgres postgres 262144 May 13 09:41 00C7
-rw------- 1 postgres postgres 262144 May 13 10:03 00C8
-rw------- 1 postgres postgres 262144 May 13 10:17 00C9
-rw------- 1 postgres postgres 262144 May 13 10:40 00CA
-rw------- 1 postgres postgres 262144 May 13 11:05 00CB
-rw------- 1 postgres postgres 262144 May 13 11:28 00CC
-rw------- 1 postgres postgres 262144 May 13 12:02 00CD
-rw------- 1 postgres postgres 262144 May 13 12:58 00CE
-rw------- 1 postgres postgres 262144 May 13 13:06 00CF
-rw------- 1 postgres postgres 262144 May 13 13:12 00D0
-rw------- 1 postgres postgres 262144 May 13 13:21 00D1
-rw------- 1 postgres postgres 262144 May 13 13:31 00D2
-rw------- 1 postgres postgres 262144 May 13 13:40 00D3
-rw------- 1 postgres postgres 262144 May 13 13:53 00D4
-rw------- 1 postgres postgres 188416 May 27 16:40 00D5


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stefan Kaltenbrunner
Дата:
Сообщение: Re: [CORE] postpone next week's release
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1