Re: [BUG] non archived WAL removed during production crash recovery
От | Jehan-Guillaume de Rorthais |
---|---|
Тема | Re: [BUG] non archived WAL removed during production crash recovery |
Дата | |
Msg-id | 20200403182625.0fccc6fd@firost обсуждение исходный текст |
Ответ на | Re: [BUG] non archived WAL removed during production crash recovery (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Ответы |
Re: [BUG] non archived WAL removed during production crash recovery
|
Список | pgsql-bugs |
On Thu, 2 Apr 2020 23:55:46 +0900 Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > On 2020/04/02 22:49, Jehan-Guillaume de Rorthais wrote: >> On Thu, 2 Apr 2020 19:38:59 +0900 >> Fujii Masao <masao.fujii@oss.nttdata.com> wrote: [...] >>> That is, WAL files with .ready files are removed when either >>> archive_mode!=always in standby mode or archive_mode=off. >> >> sounds fine to me. To some extends, it appears to me this sentence was relatively close to my previous patch, as far as you exclude IN_CRASH_RECOVERY from the shortcut: XLogArchiveCheckDone(const char *xlog) { [...] if ( (inRecoveryState != IN_CRASH_RECOVERY) && ( (inRecoveryState == NOT_IN_RECOVERY && !XLogArchivingActive()) || (inRecoveryState == IN_ARCHIVE_RECOVERY && !XLogArchivingAlways()))) return true; Which means that only .done cleanup would occurs during CRASH_RECOVERY and .ready files might be created if no .done exists. No matter the futur status of the cluster: primary or standby. Normal shortcut will apply during first checkpoint after the crash recovery step. This should handle the case where a backup without backup_label (taken from a snapshot or after a shutdown with immediate) is restored to build a standby. Please, find in attachment a third version of my patch "0001-v3-Fix-WAL-retention-during-production-crash-recovery.patch". "0002-v1-Add-test-on-non-archived-WAL-during-crash-recovery.patch" is left untouched. But I'm considering adding some more tests relative to this discussion. > BTW, now I'm thinking that the flag in shmem should be updated when > the startup process sets StandbyModeRequested to true at the beginning > of the recovery. That is, > > - Add something like SharedStandbyModeRequested into XLogCtl. This field > should be initialized with false; > - Set XLogCtl->SharedStandbyModeRequested to true when the startup > process detects the standby.signal file and sets the local variable > StandbyModeRequested to true. > - Make XLogArchiveCheckDone() use XLogCtl->SharedStandbyModeRequested > to know whether the server is in standby mode or not. > > Thought? I try to avoid a new flag in memory with the proposal in attachment of this email. It seems to me various combinaisons of booleans with subtle differences around the same subject makes it a bit trappy and complicated to understand. If my proposal is rejected, I'll be happy to volunteer to add XLogCtl->SharedStandbyModeRequested though. It seems like a simple enough fix as well. Regards,
Вложения
В списке pgsql-bugs по дате отправления: