Re: when the startup process doesn't (logging startup delays)

Поиск
Список
Период
Сортировка
От Nitin Jadhav
Тема Re: when the startup process doesn't (logging startup delays)
Дата
Msg-id CAMm1aWZi5FV=m2hvkTj5bjQ03DYece3HMJGngdXbRiSuaY2CWw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: when the startup process doesn't (logging startup delays)  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: when the startup process doesn't (logging startup delays)  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
> I was fooling around with a test setup today, working on an unrelated
> problem, and this happened:
>
> 2021-10-28 14:21:23.145 EDT [92010] LOG:  resetting unlogged relations
> (init), elapsed time: 0.00 s, current path: base/13020

Nice catch and interesting case.

> That's not supposed to happen. I assume the problem is that the
> timeout for the previous phase fired just as we were beginning a new
> one, and the code got confused. I think we probably need to do
> something like this to make sure that the timeout from one operation
> can't trigger a log message for the next:
>
> diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
> index 28e68dd871..47ec737888 100644
> --- a/src/backend/postmaster/startup.c
> +++ b/src/backend/postmaster/startup.c
> @@ -320,6 +320,8 @@ begin_startup_progress_phase(void)
>     if (log_startup_progress_interval == 0)
>         return;
>
> +    disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
> +    startup_progress_timer_expired = false;
>     startup_progress_phase_start_time = GetCurrentTimestamp();
>     fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
>                                            log_startup_progress_interval);
>
> Thoughts?

Yes. I agree that this is not an expected behaviour and I do agree
that, probably the timeout for the previous phase fired just as we
were beginning a new one. For each operation, we call
begin_startup_progress_phase() before starting the operation and
one/multiple calls to ereport_startup_progress() to report the
progress and intentionally we have not added any functionality to
disable the timer at the end of the operation. The timer remains
active and there may be multiple calls to
startup_progress_timeout_handler() which sets a flag to true. So
whenever we call a begin_startup_progress_phase() for the next
operation, we do enable the timer (In my understanding it will
internally disable the old timer and schedule a new one) but the flag
is already set to true by the previous timer. Hence the next call to
ereport_startup_progress() logs a message. So I feel just setting
'startup_progress_timer_expired' to false in
begin_startup_progress_phase() would work. Please correct me if I am
wrong.

Thanks & Regards,
Nitin Jadhav

On Thu, Oct 28, 2021 at 11:59 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Oct 25, 2021 at 11:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > This version looks fine, so I have committed it (and my
> > enable_timeout_every patch also, as a necessary prerequisite).
>
> I was fooling around with a test setup today, working on an unrelated
> problem, and this happened:
>
> 2021-10-28 14:21:23.145 EDT [92010] LOG:  resetting unlogged relations
> (init), elapsed time: 0.00 s, current path: base/13020
>
> That's not supposed to happen. I assume the problem is that the
> timeout for the previous phase fired just as we were beginning a new
> one, and the code got confused. I think we probably need to do
> something like this to make sure that the timeout from one operation
> can't trigger a log message for the next:
>
> diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
> index 28e68dd871..47ec737888 100644
> --- a/src/backend/postmaster/startup.c
> +++ b/src/backend/postmaster/startup.c
> @@ -320,6 +320,8 @@ begin_startup_progress_phase(void)
>      if (log_startup_progress_interval == 0)
>          return;
>
> +    disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
> +    startup_progress_timer_expired = false;
>      startup_progress_phase_start_time = GetCurrentTimestamp();
>      fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
>                                             log_startup_progress_interval);
>
> Thoughts?
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Teach pg_receivewal to use lz4 compression
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Support for NSS as a libpq TLS backend