Re: Detecting some cases of missing backup_label

From Stephen Frost
Subject Re: Detecting some cases of missing backup_label
Msg-id ZYBZtWiR6cT/Vu83@tamriel.snowman.net
In reply to Re: Detecting some cases of missing backup_label  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Detecting some cases of missing backup_label  (David Steele <david@pgmasters.net>)
List pgsql-hackers
Greetings,

* Stephen Frost (sfrost@snowman.net) wrote:
> * Andres Freund (andres@anarazel.de) wrote:
> > I recently mentioned to Robert (and also Heikki earlier), that I think I see a
> > way to detect an omitted backup_label in a relevant subset of the cases (it'd
> > apply to the pg_control as well, if we moved to that).  Robert encouraged me
> > to share the idea, even though it does not provide complete protection.
>
> That would certainly be nice.
>
> > The subset I think we can address is the following:
> >
> > a) An omitted backup_label would lead to corruption, i.e. without the
> >    backup_label we won't start recovery at the right position. Obviously it'd
> >    be better to also catch a wrong procedure when it'd not cause corruption -
> >    perhaps my idea can be extended to handle that, with a small bit of
> >    overhead.
> >
> > b) The backup has been taken from a primary. Unfortunately that probably can't
> >    be addressed - but the vast majority of backups are taken from a primary,
> >    so I think it's still a worthwhile protection.
>
> Agreed that this is a worthwhile set to try and address, even if we
> can't address other cases.
>
> > Here's my approach
> >
> > 1) We add a XLOG_BACKUP_START WAL record when starting a base backup on a
> >    primary, emitted just *after* the checkpoint completed
> >
> > 2) When replaying a base backup start record, we create a state file that
> >    includes the corresponding LSN in the filename
> >
> > 3) On the primary, the state file for XLOG_BACKUP_START is *not* created at
> >    that time. Instead the state file is created during pg_backup_stop().
> >
> > 4) When replaying a XLOG_BACKUP_END record, we verify that the state file
> >    created by XLOG_BACKUP_START is present, and error out if not.  Backups
> >    that started before the redo LSN from backup_label are ignored
> >    (necessitates remembering that LSN, but we've been discussing that anyway).
> >
> >
> > Because the backup state file on the primary is only created during
> > pg_backup_stop(), a copy of the data directory taken between pg_backup_start()
> > and pg_backup_stop() does *not* contain the corresponding "backup state
> > file". Because of this, an omitted backup_label is detected if recovery does
> > not start early enough - recovery won't encounter the XLOG_BACKUP_START record
> > and thus would not create the state file, leading to an error in 4).
>
> While I see the idea here, I think, doesn't it end up being an issue if
> things happen like this:
>
> pg_backup_start -> XLOG_BACKUP_START WAL written -> new checkpoint
> happens -> pg_backup_stop -> XLOG_BACKUP_STOP WAL written -> crash
>
> In that scenario, we'd go back to the new checkpoint (the one *after*
> the checkpoint that happened before we wrote XLOG_BACKUP_START), start
> replaying, and then hit the XLOG_BACKUP_STOP and then error out, right?
> Even though we're actually doing crash recovery and everything should be
> fine as long as we replay all of the WAL.

Andres and I discussed this in person at PGConf.eu.  The idea is that if
we find an XLOG_BACKUP_STOP record, we check whether the state file was
written out.  If it was, we can conclude that we are doing crash
recovery rather than restoring from a backup, and therefore we don't
error out.  This also implies that, when the state file exists, we don't
consider PG to be recovered at the XLOG_BACKUP_STOP point; instead we
have to be sure to replay all of the WAL that's been written.  Perhaps
we even explicitly refuse to use restore_command in this case?

We do error out if we hit an XLOG_BACKUP_STOP record and the state file
doesn't exist, as that implies that we started replaying from a point
after an XLOG_BACKUP_START record was written but are working from a
copy of the data directory which didn't include the state file.
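
As a rough illustration of the rule being discussed, here is a small
Python sketch. Everything in it is an assumption made for the example:
WAL is modelled as a list of (lsn, record, start_lsn) tuples, the state
files are a set of backup-start LSNs, and replay() and
MissingBackupLabel are hypothetical names. It only shows the "state
file present or not" check at XLOG_BACKUP_STOP, nothing more.

```python
class MissingBackupLabel(RuntimeError):
    """Raised when XLOG_BACKUP_STOP is replayed without a matching state file."""


# (lsn, record, start_lsn); start_lsn is only set on the stop record.
WAL = [
    (1, "CHECKPOINT", None),
    (2, "XLOG_BACKUP_START", None),
    (3, "CHECKPOINT", None),
    (4, "XLOG_BACKUP_STOP", 2),
]


def replay(wal, start, state_files):
    """Replay wal[start:] and return how many records were replayed.

    state_files holds the backup-start LSNs that already have a state
    file in the data directory we are recovering.
    """
    state_files = set(state_files)  # do not mutate the caller's set
    replayed = 0
    for lsn, record, start_lsn in wal[start:]:
        if record == "XLOG_BACKUP_START":
            # Replaying the start record creates the state file.
            state_files.add(lsn)
        elif record == "XLOG_BACKUP_STOP" and start_lsn not in state_files:
            # Replay began after the start record, on a copy of the data
            # directory that has no state file: error out.
            raise MissingBackupLabel(f"no state file for backup started at LSN {start_lsn}")
        # State file present: keep going, all remaining WAL must be replayed.
        replayed += 1
    return replayed


# Crash after pg_backup_stop(): state file already on disk, no error.
print(replay(WAL, start=2, state_files={2}))
# Restore that starts before XLOG_BACKUP_START: replay creates the file.
print(replay(WAL, start=0, state_files=set()))
# Copy taken mid-backup, recovery started from the later checkpoint.
try:
    replay(WAL, start=2, state_files=set())
except MissingBackupLabel as exc:
    print("error:", exc)
```

The first call corresponds to the crash recovery case above, the last
one to the missing backup_label case.  The middle call is the normal
restore described in the quoted proposal, where replay starts early
enough to see the start record.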

Of course, we need to actually implement and test these different cases
to make sure it all works, but I'm at least feeling better about the
idea and wanted to share that here.

Thanks,

Stephen
