On 11/26/18 10:13 PM, David Steele wrote:
> Hackers,
>
> I propose we remove the deprecated exclusive backup mode of
> pg_start_backup()/pg_stop_backup() for Postgres 12.
>
> The exclusive backup mode has a serious known issue. If Postgres
> terminates ungracefully during a backup (due to hardware, kernel,
> Postgres issues, etc.) then Postgres may refuse to restart.
>
> The reason is that the backup_label file will usually reference a
> checkpoint LSN that is older than the WAL available in pg_wal. Postgres
> does emit a helpful error message while PANIC'ing but that's cold
> comfort to an admin who must manually intervene to get their cluster
> operational again.
>
> The deprecated exclusive mode promises to make a difficult problem
> simple but doesn't live up to that promise. That's why it was replaced
> externally in 9.6 and why pg_basebackup has not used exclusive backups
> since it was introduced in 9.1.
>
> Non-exclusive backups have been available since 9.6 and several
> third-party solutions support this mode, in addition to pg_basebackup.
>
> The recently introduced recovery changes will break current automated
> solutions so this seems like a good opportunity to make improvements on
> the backup side as well.
>
> I'll submit a patch for the 2019-01 commitfest.

Attached is the patch.

I was a bit surprised by how much code went away. There was a lot of
code dedicated to making sure that backup_label got renamed on shutdown,
that there was not an exclusive backup running, etc.

There were no tests for exclusive backup so the test changes were minimal.

I did have to replace one "hot backup" in
010_logical_decoding_timelines.pl. I'm not sure why the backup was
being done this way, or why the standby needs a copy of pg_replslot
(which is not copied by pg_basebackup). It might be worth looking at
but it seemed out of scope for this patch.

Regards,
--
-David
david@pgmasters.net