Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

Поиск
Список
Период
Сортировка
От Matthias van de Meent
Тема Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Дата
Msg-id CAEze2Wh3dmGhsnkvv9D3krU9m+o86SEXnBdRQB=krjFgX2ANRg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)  (Nitin Jadhav <nitinjadhavpostgres@gmail.com>)
Ответы Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Список pgsql-hackers
On Tue, 22 Feb 2022 at 07:39, Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
>
> > > Thank you for sharing the information.  'triggering backend PID' (int)
> > > - can be stored without any problem. 'checkpoint or restartpoint?'
> > > (boolean) - can be stored as a integer value like
> > > PROGRESS_CHECKPOINT_TYPE_CHECKPOINT(0) and
> > > PROGRESS_CHECKPOINT_TYPE_RESTARTPOINT(1). 'elapsed time' (store as
> > > start time in stat_progress, timestamp fits in 64 bits) - As
> > > Timestamptz is of type int64 internally, so we can store the timestamp
> > > value in the progres parameter and then expose a function like
> > > 'pg_stat_get_progress_checkpoint_elapsed' which takes int64 (not
> > > Timestamptz) as argument and then returns string representing the
> > > elapsed time.
> >
> > No need to use a string there; I think exposing the checkpoint start
> > time is good enough. The conversion of int64 to timestamp[tz] can be
> > done in SQL (although I'm not sure that exposing the internal bitwise
> > representation of Interval should be exposed to that extent) [0].
> > Users can then extract the duration interval using now() - start_time,
> > which also allows the user to use their own preferred formatting.
>
> The reason for showing the elapsed time rather than exposing the
> timestamp directly is in case of checkpoint during shutdown and
> end-of-recovery, I am planning to log a message in server logs using
> 'log_startup_progress_interval' infrastructure which displays elapsed
> time. So just to match both of the behaviour I am displaying elapsed
> time here. I feel that elapsed time gives a quicker feel of the
> progress. Kindly let me know if you still feel just exposing the
> timestamp is better than showing the elapsed time.

At least for pg_stat_progress_checkpoint, storing only a timestamp in
the pg_stat storage (instead of repeatedly updating the field as a
duration) seems to provide much more precise measures of 'time
elapsed' for other sessions if one step of the checkpoint is taking a
long time.

I understand the want to integrate the log-based reporting in the same
API, but I don't think that is necessarily the right approach:
pg_stat_progress_* has low-overhead infrastructure specifically to
ensure that most tasks will not run much slower while reporting, never
waiting for locks. Logging, however, needs to take locks (if only to
prevent concurrent writes to the output file at a kernel level) and
thus has a not insignificant overhead and thus is not very useful for
precise and very frequent statistics updates.

So, although similar in nature, I don't think it is smart to use the
exact same infrastructure between pgstat_progress*-based reporting and
log-based progress reporting, especially if your logging-based
progress reporting is not intended to be a debugging-only
configuration option similar to log_min_messages=DEBUG[1..5].

- Matthias



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: bailing out in tap tests nearly always a bad idea
Следующее
От: "Imseih (AWS), Sami"
Дата:
Сообщение: [BUG] Panic due to incorrect missingContrecPtr after promotion