Re: BUG #3110: Online Backup introduces Duplicate OIDs

Поиск
Список
Период
Сортировка
От Randy Isbell
Тема Re: BUG #3110: Online Backup introduces Duplicate OIDs
Дата
Msg-id 5F2CCCB7-B2C5-4CA9-852F-CBD05251B80F@cisco.com
обсуждение исходный текст
Ответ на Re: BUG #3110: Online Backup introduces Duplicate OIDs  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: BUG #3110: Online Backup introduces Duplicate OIDs  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
On Mar 12, 2007, at 8:08 PM, Tom Lane wrote:

> Randy Isbell <jisbell@cisco.com> writes:
>> One thing I do find interesting:  while the backup created with my
>> "online" / Chap 23.3 processing contains 18 WAL files, only the last
>> 3 (most recent) are used during the recovery process.  Is that
>> normal?
>
> Hmm, no, that seems pretty suspicious.  The thing *should* be starting
> from the checkpoint created by pg_start_backup(), but that observation
> makes it sound like it is starting from some later one, probably
> whatever happened to be current when pg_control got archived.  Please
> check this by noting what the startup messages say about where replay
> commences, and compare to the backup label file (plain text, you can
> just look at it) and pg_control (use pg_controldata before starting
> the
> replay).

Here is the pg_controldata output for the saved cluster prior to
startup:

pg_control version number:            822
Catalog version number:               200611241
Database system identifier:           5038266934557908351
Database cluster state:               in production
pg_control last modified:             Tue Mar  6 12:43:53 2007
Current log file ID:                  2
Next log file segment:                6
Latest checkpoint location:           2/3035F48
Prior checkpoint location:            2/292EC
Latest checkpoint's REDO location:    2/3000840
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's TimeLineID:       1
Latest checkpoint's NextXID:          0/18823426
Latest checkpoint's NextOID:          8774996
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Time of latest checkpoint:            Tue Mar  6 12:41:41 2007
Minimum recovery ending location:     0/0
Maximum data alignment:               4
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Date/time type storage:               64-bit integers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C


Here is the backup label file:
# cat 0000000100000001000000F3.00FB4C3C.backup
START WAL LOCATION: 1/F3FB4C3C (file 0000000100000001000000F3)
STOP WAL LOCATION: 2/56912D8 (file 000000010000000200000005)
CHECKPOINT LOCATION: 1/F3FB4C3C
START TIME: 2007-03-06 12:25:05 CST
LABEL: /usr/local/bck/db/sn200703061224.tar.gz
STOP TIME: 2007-03-06 12:44:27 CST


Here are the startup messages:

<2007-03-12 10:39:10 CDT 82475>LOG:  could not create socket for
statistics collector: Protocol not supported
<2007-03-12 10:39:10 CDT 82475>LOG:  trying another address for the
statistics collector
<2007-03-12 10:39:10 CDT 82889>LOG:  database system was interrupted
at 2007-03-06 12:43:53 CST
<2007-03-12 10:39:10 CDT 82889>LOG:  starting archive recovery
<2007-03-12 10:39:10 CDT 82889>LOG:  restore_command = "cp /usr/local/
pgsql/walarch/%f %p"
cp: /usr/local/pgsql/walarch/00000001.history: No such file or directory
<2007-03-12 10:39:10 CDT 82889>LOG:  restored log file
"000000010000000200000003" from archive
<2007-03-12 10:39:10 CDT 82889>LOG:  checkpoint record is at 2/3035F48
<2007-03-12 10:39:10 CDT 82889>LOG:  redo record is at 2/3000840;
undo record is at 0/0; shutdown FALSE
<2007-03-12 10:39:10 CDT 82889>LOG:  next transaction ID: 0/18823426;
next OID: 8774996
<2007-03-12 10:39:10 CDT 82889>LOG:  next MultiXactId: 1; next
MultiXactOffset: 0
<2007-03-12 10:39:10 CDT 82889>LOG:  automatic recovery in progress
<2007-03-12 10:39:10 CDT 82889>LOG:  redo starts at 2/3000840
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3000840; LSN 2/30024C0:
prev 2/3000818; xid 18823427; bkpb1: Heap - insert: rel
1663/16390/22670; tid 211239/23
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/30024C0; LSN 2/3004514:
prev 2/3000840; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8358831; tid 9103/77
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3004514; LSN 2/3005E9C:
prev 2/30024C0; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359008; tid 36219/100
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3005E9C; LSN 2/3007958:
prev 2/3004514; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359009; tid 15905/133
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3007958; LSN 2/3008AAC:
prev 2/3005E9C; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359011; tid 13711/14
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3008AAC; LSN 2/3009D98:
prev 2/3007958; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359012; tid 15516/235
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3009D98; LSN 2/300B0D4:
prev 2/3008AAC; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359024; tid 12394/300
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/300B0D4; LSN 2/300B0FC:
prev 2/3009D98; xid 18823427: Transaction - commit: 2007-03-06 12:41:41


 From the backup label file it seems the starting checkpoint record
should be at 1/F3FB4C3C.  However the pg_control info indicates
2/3035F48.  Which is correct?

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #3110: Online Backup introduces Duplicate OIDs
Следующее
От: Tom Lane
Дата:
Сообщение: Re: BUG #3110: Online Backup introduces Duplicate OIDs