Re: Allow some recovery parameters to be changed with reload

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: Allow some recovery parameters to be changed with reload
Дата
Msg-id 20201110.205321.1681552377835477837.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: Allow some recovery parameters to be changed with reload  (Sergei Kornilov <sk@zsrv.org>)
Ответы Re: Allow some recovery parameters to be changed with reload
Список pgsql-hackers
Hello.

At Tue, 10 Nov 2020 14:13:17 +0300, Sergei Kornilov <sk@zsrv.org> wrote in
> Hello
>
> > If someone changes restore_command to '' then reload while crash
> > recovery is running, the server stops for no valid reason.
>
> While *crash* recovery is running? It's possible only during Point-in-Time Recovery, no?

Even if PITR is commanded, crash recovery can run before starting
archive recovery if the server was not gracefully shut down.
Parameter reload can happen while crash recovery. And
validateRecoveryParameters() calls "ereport(FATAL" in that case.

> At the beginning of validateRecoveryParameters we check ArchiveRecoveryRequested, which can be set in two cases:

That does not prevent crash recovery from running.

> * if recovery.signal found - same check on recovery start. Otherwise it is possible to early end recovery because of
emptyrestore_command. So we want to protect the user from such misconfiguration? I am fine if we decide that no
additionalhandling is needed. 
> * if standby.signal found - this FATAL is not reachable because StandbyModeRequested is also set.
>
> During crash recovery validateRecoveryParameters does nothing.



> > If restore_command is set to 'hoge' (literally:p, that is, anything
> > unexecutable) and send SIGHUP while archive recovery is running, the
> > server stops. I think we need to handle these cases more gracefully,
>
> I think we can not perform such check reliable. As in my example earlier:
>
> > restore_command = '. /etc/wal-g/WALG_AWS_ENV; wal-g wal-fetch "%f" "%p"'
>
> How do we find the commands first? For any shell? And even: we learned that the binary is unexecutable. But what to
donext? 

I don't suggest to check if the command actually works, I suggested to
avoid server stop even if the parameters failed to run after a
config-reload.

> > If someone changes restore_command by mistake to something executable
> > but fails to offer the specfied file even if it exists, the running
> > archive recovery finishes then switches timeline unexpectedly.
>
> Or executable file was just removed. Which is clearly a pilot error. Is this differs from changing restore_command?

I don't know. I just think that it is not proper that "ALTER SYSTEM" +
config-reload causes server stop.

> >>  I do not know the history of this fatal ereport. It looks like "must specify restore_command when standby mode is
notenabled" check is only intended to protect the user from misconfiguration and the rest code will treat empty
restore_commandcorrectly, just like /bin/false. Did not notice anything around StandbyMode conditions. 
> >
> > If restore_command is not changable after server-start, it would be
> > valid for startup to stop for inexecutable content for the variable
> > since there's no way to proceed recovery.
>
> Why not use local pg_wal? There may be already enough WAL.

Mmm. If the file to read is in pg_wal, restore_command won't be
executed in the first place?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: Autovacuum on partitioned table (autoanalyze)
Следующее
От: Magnus Hagander
Дата:
Сообщение: Re: -O switch