Discussion: uninitialized page in standby recovery
I built a standby with 9.4.12, and about a day later the standby crashed with this:

2018-02-02 16:20:44 EST,0, WARNING: page 1347460 of relation base/16391/16414 is uninitialized
2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel 1663/16391/16414; blk 1347460
2018-02-02 16:20:44 EST,0, PANIC: WAL contains references to invalid pages
2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel 1663/16391/16414; blk 1347460
2018-02-02 16:20:44 EST,0, LOG: startup process (PID 24057) was terminated by signal 6: Aborted
2018-02-02 16:20:44 EST,0, LOG: terminating any other active server processes

Any hints as to where the corruption begins? I don't see any disk I/O issues. I'm not sure what to look for in the release notes. I'll try to patch ASAP, but that is difficult to get done politically.
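To inspect the block named in the WARNING on disk, it can help to work out which segment file and byte offset hold it. The following is a minimal sketch, assuming the default build settings (8 kB blocks, 1 GB relation segments); `locate_block` is a hypothetical helper name, not a PostgreSQL tool.

```python
BLOCK_SIZE = 8192          # assumption: default block size (BLCKSZ)
SEGMENT_SIZE = 1024 ** 3   # assumption: default 1 GB relation segment size


def locate_block(relpath, blkno, block_size=BLOCK_SIZE, segment_size=SEGMENT_SIZE):
    """Return (segment file path, byte offset) for block `blkno` of a relation.

    Relations larger than one segment are split into files named
    relpath, relpath.1, relpath.2, ... under the data directory.
    """
    blocks_per_segment = segment_size // block_size
    segno, blk_in_seg = divmod(blkno, blocks_per_segment)
    path = relpath if segno == 0 else f"{relpath}.{segno}"
    return path, blk_in_seg * block_size


if __name__ == "__main__":
    # Block from the WARNING line above.
    print(locate_block("base/16391/16414", 1347460))
    # → ('base/16391/16414.10', 300974080)
```

Under those assumptions the page lives in base/16391/16414.10, so that 8 kB range is the one to compare between the primary and the standby.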
On 2/5/18 9:06 AM, Ray Stell wrote:
> I built a standby with 9.4.12 and about a day later the standby crashed with this:
>
> 2018-02-02 16:20:44 EST,0, WARNING: page 1347460 of relation base/16391/16414 is uninitialized
> 2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel 1663/16391/16414; blk 1347460
> 2018-02-02 16:20:44 EST,0, PANIC: WAL contains references to invalid pages
> 2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel 1663/16391/16414; blk 1347460
> 2018-02-02 16:20:44 EST,0, LOG: startup process (PID 24057) was terminated by signal 6: Aborted
> 2018-02-02 16:20:44 EST,0, LOG: terminating any other active server processes
>
> Any hints to where the corruption begins? I don't see any disk i/o issues. Not sure what to look for in the release notes,
> but I'll try to patch asap, but that is difficult to get done politically.

This looks like bug 13822, discussed here:
https://www.postgresql.org/message-id/20151217125025.6916.26898%40wrigleys.postgresql.org

That discussion is about 9.4.5. How can I follow the thread? Is there a bug database to query?
On 2/5/18 9:06 AM, Ray Stell wrote:
> I built a standby with 9.4.12 and about a day later the standby crashed with this:
>
> 2018-02-02 16:20:44 EST,0, WARNING: page 1347460 of relation base/16391/16414 is uninitialized
> 2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel 1663/16391/16414; blk 1347460
> 2018-02-02 16:20:44 EST,0, PANIC: WAL contains references to invalid pages
> 2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel 1663/16391/16414; blk 1347460
> 2018-02-02 16:20:44 EST,0, LOG: startup process (PID 24057) was terminated by signal 6: Aborted
> 2018-02-02 16:20:44 EST,0, LOG: terminating any other active server processes
>
> Any hints to where the corruption begins? I don't see any disk i/o issues. Not sure what to look for in the release notes,
> but I'll try to patch asap, but that is difficult to get done politically.

I am beginning to wonder about pg_basebackup in this old version. I rebuilt the standby again, and this time when I started it I got:

LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 2F45/1F4B7F8
FATAL: could not access status of transaction 4053124744
DETAIL: Could not read from file "pg_clog/0F19" at offset 90112: Success.
CONTEXT: xlog redo commit: 2018-02-05 11:35:54.291398-05
LOG: startup process (PID 130590) exited with exit code 1
LOG: terminating any other active server processes

Right or wrong, I rsync-ed pg_clog from the primary and it recovered.

Can you use pg_basebackup from a more current patch level against a 9.4.12 cluster?
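As a sanity check on the FATAL message, the clog segment and offset can be derived from the transaction ID. The following is a rough sketch, assuming the default layout (2 status bits per transaction, 8 kB pages, 32 pages per segment file); `clog_location` is a hypothetical helper name.

```python
BLOCK_SIZE = 8192          # assumption: default page size (BLCKSZ)
XACTS_PER_BYTE = 4         # 2 status bits per transaction
PAGES_PER_SEGMENT = 32     # assumption: SLRU pages per clog segment file


def clog_location(xid, block_size=BLOCK_SIZE):
    """Return (segment file name, byte offset of the page) holding xid's status."""
    xacts_per_page = block_size * XACTS_PER_BYTE
    pageno = xid // xacts_per_page
    segno, page_in_seg = divmod(pageno, PAGES_PER_SEGMENT)
    # Segment files are named with four uppercase hex digits.
    return f"{segno:04X}", page_in_seg * block_size


if __name__ == "__main__":
    # Transaction ID from the FATAL line above.
    print(clog_location(4053124744))
    # → ('0F19', 90112)
```

The result matches the file and offset in the DETAIL line, so the segment name itself is right and the standby's copy of pg_clog/0F19 was probably shorter than offset 90112 plus one page. That would fit with the rsync of pg_clog making recovery succeed.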