Re: persist logical slots to disk during shutdown checkpoint

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: persist logical slots to disk during shutdown checkpoint
Дата
Msg-id ZPqlRqEeue5JQyLh@paquier.xyz
обсуждение исходный текст
Ответ на Re: persist logical slots to disk during shutdown checkpoint  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: persist logical slots to disk during shutdown checkpoint  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Fri, Sep 08, 2023 at 09:04:43AM +0530, Amit Kapila wrote:
> I have added something on these lines and also changed the other
> comment pointed out by you. In the passing, I made minor cosmetic
> changes as well.

+ * We can flush dirty replication slots at regular intervals by any
+ * background process like bgwriter but checkpoint is a convenient location.

I don't see a need to refer to the bgwriter here.  On the contrary, it
can be confusing as one could think that this flush happens in the
bgwriter, but that's not the case currently as only the checkpointer
does that.

+        * We won't ensure that the slot is persisted after the

s/persisted/flushed/?  Or just refer to the "slot's data being
flushed", or refer to "the slot's data is made durable" instead?  The
use of "persist" here is confusing, because a slot's persistence
refers to it as being a *candidate* for flush (compared to an
ephemeral slot), and it does not refer to the *fact* of flushing its
data to make sure that it survives a crash.  In the context of this
patch, the LSN value tracked in the slot's in-memory data refers to
the last point where the slot's data has been flushed.

+    /*
+     * This is used to track the last persisted confirmed_flush LSN value to
+     * detect any divergence in the in-memory and on-disk values for the same.
+     */

"This value tracks is the last LSN where this slot's data has been
flushed to disk.  This is used during a checkpoint shutdown to decide
if a logical slot's data should be forcibly flushed or not."

Hmm.  WAL senders are shut down *after* the checkpointer and *after*
the shutdown checkpoint is finished (see PM_SHUTDOWN and
PM_SHUTDOWN_2) because we want the WAL senders to acknowledge the
checkpoint record before shutting down the primary.  In order to limit
the number of records to work on after a restart, what this patch is
proposing is an improvement.  Perhaps it would be better to document
that we don't care about the potential concurrent activity of logical
WAL senders in this case and that the LSN we are saving at is a best
effort, meaning that last_saved_confirmed_flush is just here to reduce
the damage on a follow-up restart?  The comment added in
CheckPointReplicationSlots() goes in this direction, but perhaps this
potential concurrent activity should be mentioned?
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jingtang Zhang
Дата:
Сообщение: Suspicious redundant assignment in COPY FROM
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Impact of checkpointer during pg_upgrade