Re: persist logical slots to disk during shutdown checkpoint

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: persist logical slots to disk during shutdown checkpoint
Дата
Msg-id CAA4eK1+P=U8L8Vjor_bh6W1+mJq-E9zCoODzqez3wPX6uxtrPA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: persist logical slots to disk during shutdown checkpoint  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: persist logical slots to disk during shutdown checkpoint  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
On Fri, Sep 8, 2023 at 10:08 AM Michael Paquier <michael@paquier.xyz> wrote:
>
>
> +    /*
> +     * This is used to track the last persisted confirmed_flush LSN value to
> +     * detect any divergence in the in-memory and on-disk values for the same.
> +     */
>
> "This value tracks is the last LSN where this slot's data has been
> flushed to disk.
>

This makes the comment vague as this sounds like we are saving a slot
corresponding to some LSN which is not the case. If you prefer this
tone then we can instead say: "This value tracks the last
confirmed_flush LSN flushed which is used during a checkpoint shutdown
to decide if a logical slot's data should be forcibly flushed or not."

>
> This is used during a checkpoint shutdown to decide
> if a logical slot's data should be forcibly flushed or not."
>
> Hmm.  WAL senders are shut down *after* the checkpointer and *after*
> the shutdown checkpoint is finished (see PM_SHUTDOWN and
> PM_SHUTDOWN_2) because we want the WAL senders to acknowledge the
> checkpoint record before shutting down the primary.
>

As per my understanding, this is not true for logical walsenders. As
per code, while HandleCheckpointerInterrupts(), we call ShutdownXLOG()
which sends a signal to walsender to stop and waits for it to stop.
And only after that, did it write a shutdown checkpoint WAL record.
After getting the InitStopping signal, walsender sets got_STOPPING
flag. Then *logical* walsender ensures that it sends all the pending
WAL and exits. What you have quoted is probably true for physical
walsenders.

>
>  In order to limit
> the number of records to work on after a restart, what this patch is
> proposing is an improvement.  Perhaps it would be better to document
> that we don't care about the potential concurrent activity of logical
> WAL senders in this case and that the LSN we are saving at is a best
> effort, meaning that last_saved_confirmed_flush is just here to reduce
> the damage on a follow-up restart?
>

Unless I am wrong, there shouldn't be any concurrent activity for
logical walsenders. IIRC, it is a mandatory requirement for logical
walsenders to stop before shutdown checkpointer to avoid panic error.
We do handle logical walsnders differently because they can generate
WAL during decoding.

>
>  The comment added in
> CheckPointReplicationSlots() goes in this direction, but perhaps this
> potential concurrent activity should be mentioned?
>

Sure, we can change it if required.

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: Eager page freeze criteria clarification
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Avoid a possible null pointer (src/backend/utils/adt/pg_locale.c)