Hi,
On 2020-04-05 15:49:16 -0700, Andres Freund wrote:
> When starting on a data directory with an older WAL page magic, we
> currently make that hard to debug. E.g.:
>
> 2020-04-05 15:31:04.314 PDT [1896669][:0] LOG: database system was shut down at 2020-04-05 15:24:56 PDT
> 2020-04-05 15:31:04.314 PDT [1896669][:0] LOG: invalid primary checkpoint record
> 2020-04-05 15:31:04.314 PDT [1896669][:0] PANIC: could not locate a valid checkpoint record
> 2020-04-05 15:31:04.315 PDT [1896668][:0] LOG: startup process (PID 1896669) was terminated by signal 6: Aborted
> 2020-04-05 15:31:04.315 PDT [1896668][:0] LOG: aborting startup due to startup process failure
> 2020-04-05 15:31:04.316 PDT [1896668][:0] LOG: database system is shut down
>
> As far as I can tell this is not just the case for a wrong page magic,
> but for all page level validation errors.
>
> I think this largely originates in:
>
> commit 0668719801838aa6a8bda330ff9b3d20097ea844
> Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
> Date: 2018-05-05 01:34:53 +0300
>
> Fix scenario where streaming standby gets stuck at a continuation record.

Heikki, Kyotaro, it'd be good if you could comment on what motivated
this approach, because it sure as hell hides a lot of useful information
when there's a problem with WAL. Or, well, all of it.

- Andres