Re: Mention idle_replication_slot_timeout in pg_replication_slots docs

Поиск
Список
Период
Сортировка
От Nisha Moond
Тема Re: Mention idle_replication_slot_timeout in pg_replication_slots docs
Дата
Msg-id CABdArM4KqFVoSzZBYs4jPBEo=4e1kjcOpCF8rWZSthxZFcnWdw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Mention idle_replication_slot_timeout in pg_replication_slots docs  (Nisha Moond <nisha.moond412@gmail.com>)
Ответы Re: Mention idle_replication_slot_timeout in pg_replication_slots docs
Список pgsql-docs
On Fri, Jun 27, 2025 at 5:40 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2025/06/27 15:32, Nisha Moond wrote:
> > On Thu, Jun 26, 2025 at 1:33 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>
> >>
> >>
> >> On 2025/06/26 15:46, Nisha Moond wrote:
> >>> On Wed, Jun 25, 2025 at 9:56 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> The pg_replication_slots documentation mentions only max_slot_wal_keep_size
> >>>> as a condition under which the wal_status column can show unreserved or lost.
> >>>> However, since commit ac0e33136ab, idle_replication_slot_timeout can also
> >>>> cause this behavior when it is set. This has not been documented yet.
> >>>> https://www.postgresql.org/docs/devel/view-pg-replication-slots.html
> >>>>
> >>>
> >>> +1 to the doc update.
> >>
> >> Thanks for the review!
> >>
> >>
> >>>> So, how about updating the documentation to also mention
> >>>> idle_replication_slot_timeout as a factor that can cause wal_status to
> >>>> become unreserved or lost? Patch attached.
> >>>>
> >>>
> >>> Since idle_replication_slot_timeout can only cause wal_status to
> >>> become 'lost' and not 'unreserved', perhaps we can reword the sentence
> >>> slightly for clarity, suggestion -
> >>> "The last two states are seen when max_slot_wal_keep_size is
> >>> non-negative and, the 'lost' state may also appear when
> >>> idle_replication_slot_timeout is greater than zero."
> >>
> >> I was thinking that when idle_replication_slot_timeout triggers,
> >> the following functions are called, and that wal_status can become
> >> "unreserved" before ReplicationSlotRelease() runs. It's very short
> >> period, though. Am I wrong?
> >>
> >> ReplicationSlotMarkDirty();
> >> ReplicationSlotSave();
> >> ReplicationSlotRelease();
> >>
> >
> > Thank you for pointing it out.
> > You are correct that while the checkpointer is in the process of
> > invalidating a slot, it sets its PID as the slot’s active_pid. During
> > this short window, if a user queries pg_replication_slot, the
> > underlying function pg_get_replication_slots will compute the
> > wal_status as 'unreserved' for the invalidated slot because the slot
> > has a valid active_pid.
> >
> > That said, it's reasonable to mention in the doc that 'unreserved' may
> > appear when idle_replication_slot_timeout is greater than zero, as
> > this can indeed happen. So, let's retain the current description.
> >
> > However, this behavior isn’t specific to
> > idle_replication_slot_timeout. For example, when a slot is being
> > invalidated due to a different cause "wal_level_insufficient",
> > 'unreserved' may also briefly appear in wal_status.
>
> Yes, and "lost" can appear for various reasons, including wal_level_insufficient,
> so it seems odd to highlight max_slot_wal_keep_size as the cause of the "lost"
> status in the note. It would probably be better to remove the mention of "lost"
> from that note.
>

+1

> As for "unreserved", it can also occur for different reasons, but typically,
> it happens when max_slot_wal_keep_size is set to a non-negative value.
> So it might make sense to keep the explanation focused just on "unreserved"
> and max_slot_wal_keep_size. For example:
>
> ----------------------
>           <listitem>
>            <para>
>             <literal>unreserved</literal> means that the slot no longer
>             retains the required WAL files and some of them are to be removed at
> -          the next checkpoint.  This state can return
> +          the next checkpoint.  This can occur when
> +          <xref linkend="guc-max-slot-wal-keep-size"/> is set to
> +          a non-negative value.  This state can return
>             to <literal>reserved</literal> or <literal>extended</literal>.
>            </para>
>           </listitem>
>           <listitem>
> ----------------------
>
> What do you think?
>

The change LGTM, only a minor suggestion to add "typically", as “This
can typically occur when…” to indicate that max_slot_wal_keep_size is
one possible reason, not the only one.

>
> Also, I noticed the note that says “If <structfield>restart_lsn</structfield>
> is NULL, this field is null” seems inaccurate. For example, when "wal_removed"
> happens, restart_lsn is NULL but wal_status is "lost". So maybe we should remove
> that note as well?

You're right, the statement is not accurate.
We could rephrase it as: "If <structfield>restart_lsn</structfield> is
NULL, this field is either null or lost." But since 'unreserved' can
also appear briefly during invalidation, it might be better to remove
it altogether.

--
Thanks,
Nisha



В списке pgsql-docs по дате отправления: