On 2/5/18 9:06 AM, Ray Stell wrote:
> I built a standby with 9.4.12 and about a day later the standby
> crashed with this:
>
> 2018-02-02 16:20:44 EST,0, WARNING: page 1347460 of relation
> base/16391/16414 is uninitialized
> 2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel
> 1663/16391/16414; blk 1347460
> 2018-02-02 16:20:44 EST,0, PANIC: WAL contains references to invalid
> pages
> 2018-02-02 16:20:44 EST,0, CONTEXT: xlog redo visible: rel
> 1663/16391/16414; blk 1347460
> 2018-02-02 16:20:44 EST,0, LOG: startup process (PID 24057) was
> terminated by signal 6: Aborted
> 2018-02-02 16:20:44 EST,0, LOG: terminating any other active server
> processes
>
> Any hints as to where the corruption begins? I don't see any disk I/O
> issues. I'm not sure what to look for in the release notes. I'll try to
> patch ASAP, but that is difficult to get done politically.
>
I am beginning to wonder about pg_basebackup in this old version. I
rebuilt the standby again, and this time when I started it I got:

LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 2F45/1F4B7F8
FATAL: could not access status of transaction 4053124744
DETAIL: Could not read from file "pg_clog/0F19" at offset 90112: Success.
CONTEXT: xlog redo commit: 2018-02-05 11:35:54.291398-05
LOG: startup process (PID 130590) exited with exit code 1
LOG: terminating any other active server processes

Right or wrong, I rsync-ed pg_clog and the standby recovered. Is it
possible to use pg_basebackup from a more current patch level against a
9.4.12 cluster?
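
As a sanity check on the FATAL message above, the pg_clog file name and
offset can be derived from the transaction ID. This is a minimal Python
sketch, assuming the default 8 kB block size and the usual clog layout
(2 status bits per transaction, 32 pages per segment file). The function
name clog_location is only illustrative.

```python
BLCKSZ = 8192            # assumed: default PostgreSQL block size
XACTS_PER_BYTE = 4       # clog stores 2 status bits per transaction
PAGES_PER_SEGMENT = 32   # each pg_clog segment file holds 32 pages


def clog_location(xid, blcksz=BLCKSZ):
    """Return (segment file name, byte offset of the page) for an XID."""
    xacts_per_page = blcksz * XACTS_PER_BYTE
    page = xid // xacts_per_page                   # global clog page number
    segment = page // PAGES_PER_SEGMENT            # which segment file
    offset = (page % PAGES_PER_SEGMENT) * blcksz   # page start within the file
    return "%04X" % segment, offset


if __name__ == "__main__":
    name, offset = clog_location(4053124744)
    print("pg_clog/%s offset %d" % (name, offset))
    # prints: pg_clog/0F19 offset 90112
```

For the XID in the log this gives pg_clog/0F19 at offset 90112, which
matches the error. So the standby's copy of that file presumably did not
extend as far as that page. A short read leaves errno unset, which would
explain the odd "Success" in the DETAIL line.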