Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
| От | Amit Kapila |
|---|---|
| Тема | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Дата | |
| Msg-id | CAA4eK1+_oAUcFkL9uLb4tG234rUoBoKAVc5DVbJnmjmrWZde4Q@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>) |
| Ответы |
Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
|
| Список | pgsql-hackers |
On Fri, Nov 14, 2025 at 4:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Nov 14, 2025 at 1:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Nov 14, 2025 at 5:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I can look at the patch but this inprogress flag is another part which > > I wanted to avoid if possible again due to its additional complexity. > > So, I came up with an alternative locking scheme to enable/disable > > decoding. You can compare both the ideas and share your thoughts: > > > > During promotion code path: > > 1. Acquire LogicalDecodingControlLock, then check and remember whether > > we need to enable or disable the xlog_logical_info and > > logical_decoding_enabled. Release LogicalDecodingControlLock. > > 2. Set xlog_logical_info and mark SharedRecoveryState = > > RECOVERY_STATE_DONE under one spinlock. > > 3. After the spinlock and controlfile lock are released, wait for > > other backends to reflect the xlog_logical_info via > > WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_UPDATE_XLOG_LOGICAL_INFO)), > > 4. Acquire LogicalDecodingControlLock in X mode, then there are two > > cases to deal with: > > (a) As part of step-1, the decision was to enable logical decoding. > > So, we first check if some backend has already enabled it by checking > > logical_decoding_enabled, if so, then we don't need to do anything. > > Otherwise, once again count_slots to ensure that the concurrent > > backend hasn't removed them, and if there still exist any, then set > > logical_decoding_enabled, write a new WAL record, and release the > > LogicalDecodingControlLock. > > (b) As part of step-1, the decision was to disable logical decoding. > > So, we first check if some backend has already enabled it by checking > > logical_decoding_enabled, if so, then we don't need to do anything. > > Otherwise, set logical_decoding_enabled to false, write WAL record, > > and release the LogicalDecodingControlLock. > > I'm not sure it works in cases where we need to disable logical > decoding at the end-of-recovery. Suppose that the decision made in > step-1 was to disable logical decoding, it's possible that non-logical > WAL records are written as soon as step-3 finishes while the logical > decoding is still enabled. This is because the backend processes who > started after step-3 see xlog_logical_info = false. This ends up with > logical decoding decoding non-logical WAL records. > If the startup process decides to disable decoding, this means there doesn't exist any logical slot and wal_level is 'replica', right? If so, then when we create the first slot before decoding, we should try to first enable xlog_logical_info, if not already enabled, wait for all backends to reflect that state. So, that should be sufficient. > And there is a small window between step-2 and step-4(a) where > wal_level shows 'logical' but logical decoding is not enabled. > True but does that really matter? -- With Regards, Amit Kapila.
В списке pgsql-hackers по дате отправления: