Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Дата
Msg-id CAD21AoBcZk9xQhHKOA4EsS97zSHveMrDJprqOjTXuKar+ony8A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Fri, Nov 14, 2025 at 3:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 14, 2025 at 4:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Nov 14, 2025 at 1:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Nov 14, 2025 at 5:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I can look at the patch but this inprogress flag is another part which
> > > I wanted to avoid if possible again due to its additional complexity.
> > > So, I came up with an alternative locking scheme to enable/disable
> > > decoding. You can compare both the ideas and share your thoughts:
> > >
> > > During promotion code path:
> > > 1. Acquire LogicalDecodingControlLock, then check and remember whether
> > > we need to enable or disable the xlog_logical_info and
> > > logical_decoding_enabled. Release LogicalDecodingControlLock.
> > > 2. Set xlog_logical_info and mark SharedRecoveryState =
> > > RECOVERY_STATE_DONE under one spinlock.
> > > 3. After the spinlock and controlfile lock are released, wait for
> > > other backends to reflect the xlog_logical_info via
> > > WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_UPDATE_XLOG_LOGICAL_INFO)),
> > > 4. Acquire LogicalDecodingControlLock in X mode, then there are two
> > > cases to deal with:
> > >   (a) As part of step-1, the decision was to enable logical decoding.
> > > So, we first check if some backend has already enabled it by checking
> > > logical_decoding_enabled, if so, then we don't need to do anything.
> > > Otherwise, once again count_slots to ensure that the concurrent
> > > backend hasn't removed them, and if there still exist any, then set
> > > logical_decoding_enabled, write a new WAL record, and release the
> > > LogicalDecodingControlLock.
> > >   (b) As part of step-1, the decision was to disable logical decoding.
> > > So, we first check if some backend has already enabled it by checking
> > > logical_decoding_enabled, if so, then we don't need to do anything.
> > > Otherwise, set logical_decoding_enabled to false, write WAL record,
> > > and release the LogicalDecodingControlLock.
> >
> > I'm not sure it works in cases where we need to disable logical
> > decoding at the end-of-recovery. Suppose that the decision made in
> > step-1 was to disable logical decoding, it's possible that non-logical
> > WAL records are written as soon as step-3 finishes while the logical
> > decoding is still enabled. This is because the backend processes who
> > started after step-3 see xlog_logical_info = false. This ends up with
> > logical decoding decoding non-logical WAL records.
> >
>
> If the startup process decides to disable decoding, this means there
> doesn't exist any logical slot and wal_level is 'replica', right? If
> so, then when we create the first slot before decoding, we should try
> to first enable xlog_logical_info, if not already enabled, wait for
> all backends to reflect that state. So, that should be sufficient.

Right. But I think its (cascaded) standby could have logical slots and
decode non-logical WAL records.

>
> > And there is a small window between step-2 and step-4(a) where
> > wal_level shows 'logical' but logical decoding is not enabled.
> >
>
> True but does that really matter?

No, it could confuse users but isn't a problem in practice.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления: