Re: [BUG] non archived WAL removed during production crash recovery

Поиск
Список
Период
Сортировка
От Jehan-Guillaume de Rorthais
Тема Re: [BUG] non archived WAL removed during production crash recovery
Дата
Msg-id 20200401181735.11100908@firost
обсуждение исходный текст
Ответ на Re: [BUG] non archived WAL removed during production crash recovery  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Ответы Re: [BUG] non archived WAL removed during production crash recovery
Список pgsql-bugs
On Wed, 1 Apr 2020 17:27:22 +0900
Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
[...]
> >    bool inRecovery = RecoveryInProgress();
> >    
> >    /*
> >     * The file is always deletable if archive_mode is "off".  On standbys
> >     * archiving is disabled if archive_mode is "on", and enabled with
> >     * "always".  On a primary, archiving is enabled if archive_mode is "on"
> >     * or "always".
> >     */
> >    if (!((XLogArchivingActive() && !inRecovery) ||
> >          (XLogArchivingAlways() && inRecovery)))
> >        return true;
> > 
> > Please find in attachment a patch that fix this issue using the following
> > test instead:
> > 
> >    if (!((XLogArchivingActive() && !StandbyModeRequested) ||
> >          (XLogArchivingAlways() && inRecovery)))
> >        return true;
> > 
> > I'm not sure if we should rely on StandbyModeRequested for the second part
> > of the test as well thought. What was the point to rely on
> > RecoveryInProgress() to get the recovery status from shared mem?  
> 
> Since StandbyModeRequested is the startup process-local variable,
> it basically cannot be used in XLogArchiveCheckDone() that other process
> may call.

Ok, you answered my wondering about using recovery status from shared mem. This
was obvious. Thanks for your help!

I was wondering if we could use "ControlFile->state != DB_IN_CRASH_RECOVERY".
It seems fine during crash recovery as the control file is updated before the
checkpoint, but it doesn't feel right for other code paths where the control
file might not be up-to-date on filesystem, right ?

> So another approach would be necessary... One straight idea is to
> add new shmem flag indicating whether we are in standby mode or not.

I was thinking about setting XLogCtlData->SharedRecoveryInProgress as an enum
using:

  enum RecoveryState
  {
    NOT_IN_RECOVERY = 0
    IN_CRASH_RECOVERY,
    IN_ARCHIVE_RECOVERY
  }

Please, find in attachment a patch implementing this.

Plus, I added a second commit to add one test in regard with this bug.

> Another is to make the startup process remove .ready file if necessary.

I'm not sure to understand this one.

Regards,

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: BUG #16332: The xmltable function returns 'comment()' as unusablexml appended together and having no xml tags
Следующее
От: Julien Rouhaud
Дата:
Сообщение: Re: BUG #16331: segfault in checkpointer with full disk