Re: reassure me that it's good to copy pg_control last in a basebackup

From: Michael Paquier
Subject: Re: reassure me that it's good to copy pg_control last in a basebackup
Msg-id: 20171222052927.GA15816@paquier.xyz
In reply to: reassure me that it's good to copy pg_control last in a base backup (Chapman Flack <chap@anastigmatix.net>)
Responses: Re: reassure me that it's good to copy pg_control last in a basebackup (Chapman Flack <chap@anastigmatix.net>)
List: pgsql-hackers
On Thu, Dec 21, 2017 at 10:48:49PM -0500, Chapman Flack wrote:
> From that description alone, I'd imagine a danger in redoing from a
> base backup in which pg_control was copied last. What if another
> checkpoint was made (after the one done by pg_start_backup) during
> the course of the backup, and the late-copied pg_control refers to
> it, but some of the files had been copied into the base backup
> too early to reflect it?

As long as you have a backup_label file to guarantee the start position
of recovery, that's not something to worry about. What would be bad is
to remove the backup_label file from a backup, which exposes you to
risks of corrupting an instance. This description stands for crash
recovery, where there is no backup_label file. Now you see why the
exclusive backup API can lead to problems? Imagine the case where
you take an exclusive backup and the instance from which a backup is
taken crashes, *with* a backup_label file on disk. Oops. That's one
reason behind non-exclusive backups, which is what pg_basebackup
uses as well.

> Looking harder, I think I see that the special care to grab
> pg_control last was introduced for the case of taking a base backup
> from a standby, and perhaps only matters in that case. The long
> discussion seems to be this one:
>
> https://www.postgresql.org/message-id/201108050646.p756kHC5023570%40ccmds32.silk.ntts.co.jp

Copying pg_control last in the backup matters only for backups taken from
standbys, where you want to maximize the LSN position of minRecoveryPoint
so as to minimize the risk of facing inconsistent data at recovery.
When taking a backup from a primary server, the WAL record marking the
end of the backup serves as a guarantee that a consistent point has been
reached, so it does not matter whether the control file is copied first
or last in this case.
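
For illustration only, the copy ordering can be sketched in Python as
below. This is a minimal sketch assuming a plain file-level copy; the
function name copy_datadir_control_last is hypothetical and not part of
any PostgreSQL tool, and the sketch leaves out everything else a real
backup needs (starting and stopping the backup, WAL handling, excluded
files).

```python
import os
import shutil

# Path of the control file relative to the data directory.
CONTROL_FILE = os.path.join("global", "pg_control")


def copy_datadir_control_last(src, dst):
    """Copy the tree under src to dst, deferring global/pg_control to the end.

    Hypothetical helper for illustration. Returns the relative paths in
    the order in which they were copied.
    """
    copied = []
    for root, dirs, files in os.walk(src):
        dirs.sort()  # deterministic traversal order
        rel_root = os.path.relpath(root, src)
        os.makedirs(os.path.normpath(os.path.join(dst, rel_root)), exist_ok=True)
        for name in sorted(files):
            rel = os.path.normpath(os.path.join(rel_root, name))
            if rel == CONTROL_FILE:
                continue  # deferred: copied after everything else
            shutil.copy2(os.path.join(src, rel), os.path.join(dst, rel))
            copied.append(rel)
    # Copy the control file last so it carries the latest minRecoveryPoint.
    if os.path.exists(os.path.join(src, CONTROL_FILE)):
        shutil.copy2(os.path.join(src, CONTROL_FILE),
                     os.path.join(dst, CONTROL_FILE))
        copied.append(CONTROL_FILE)
    return copied
```

The only point of the sketch is the ordering: every other file is on
disk in the backup before pg_control is read from the source.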

> What I think I've gleaned is:
>
> 1. The description in the doc ("at the start of recovery, the server
>    first reads pg_control and the checkpoint record") only applies to
>    the kind of recovery that happens in an unexpected restart, using
>    the files that are present; it's not the whole story for the kind
>    of recovery that begins with a base backup.

Yes, that's a crash recovery. But see the case I just described above
of an instance crashing while an exclusive backup is running.

> 2. In the case of recovery from a backup (that was taken from a master),
>    both the start and end location in pg_control are disregarded, in
>    favor of the backup label file and the backup end WAL record,
>    respectively, so it doesn't matter a whit whether pg_control was
>    copied early or late.

Yes.

> 3. In recovery from a backup taken from a standby, there is a backup
>    label file but no backup end WAL record, so the 'minimum recovery
>    ending location' in pg_control has to be used, and that's why the
>    fuss about copying pg_control last when backing up from a standby.

Yes.
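
The three cases can be summarized with a toy model in Python. This is a
sketch of the reasoning in this thread only, not PostgreSQL's recovery
code: all class, field, and function names are hypothetical, and LSNs
are plain integers for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlFile:
    checkpoint_lsn: int       # latest checkpoint recorded in pg_control
    min_recovery_point: int   # minimum recovery ending location


@dataclass
class BackupLabel:
    start_lsn: int            # checkpoint taken when the backup started
    from_standby: bool        # whether the backup was taken from a standby


@dataclass
class RecoveryPlan:
    kind: str
    start_lsn: int
    consistent_at: Optional[int]  # None: no explicit consistency target


def plan_recovery(control, label, backup_end_lsn=None):
    """Pick where redo starts and what marks the consistent point."""
    if label is None:
        # Case 1: crash recovery, pg_control gives the start position.
        return RecoveryPlan("crash recovery", control.checkpoint_lsn, None)
    if label.from_standby:
        # Case 3: no backup end record, so pg_control's minimum recovery
        # point is used; copying pg_control last makes it as late as possible.
        return RecoveryPlan("backup from standby", label.start_lsn,
                            control.min_recovery_point)
    # Case 2: backup_label gives the start, the backup end WAL record
    # gives the consistent point; pg_control's positions are not consulted.
    return RecoveryPlan("backup from primary", label.start_lsn, backup_end_lsn)
```

With a label present and a backup taken from a primary, an early copy
and a late copy of pg_control give the same plan in this model, which is
the point of case 2. Without the label, the model falls back to the
checkpoint in pg_control, which is why removing backup_label from a
backup is dangerous.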

> Did I get that right? If so, would it be worth adding some words
> to that paragraph in "WAL Internals", to clarify that the pg_control
> checkpoint position is not relied on when starting recovery with
> a backup label present, and therefore it isn't scary to copy pg_control
> late in the backup?

I would be interested in seeing a patch about that. People tend to
remove backup_label files too easily, so hardening the documentation
a bit could be an idea worth digging into.
--
Michael
