Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
From | Masahiko Sawada |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAD21AoDjdeqwTHa5nL-3nfEnNA4SfrP4k0yR90kq68=JOLRWxg@mail.gmail.com |
In response to | RE: POC: enable logical decoding when wal_level = 'replica' without a server restart ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
List | pgsql-hackers |

On Fri, Aug 29, 2025 at 5:31 AM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > My understanding of where the synced slot starts to move was not
> > right; it starts from the remote slot's restart_lsn, which could be
> > far ahead from the STATUS_CHANGE record that the startup process is
> > applying but where logical decoding should be enabled. It doesn't
> > happen that the slotsync worker tries to decode non-logical WAL
> > records even if it advances the slot after the startup disabled
> > logical decoding.
>
> Let me confirm your point. If the situation, which the slot is dropped and then
> created while the startup process processing, happens, the WAL records would be
> aligned like below. Your point is that the restart_lsn of the created slot is
> beginning of (b) so that all records can be decoded, right?
>
> ```
> STATUS_CHANGE true
> RUNNING_XACTS // (a) - generated by the first slot
> ...
> STATUS_CHANGE false // due to the slot drop
> ...
> STATUS_CHANGE true // from here all records are decode-safe
> RUNNING_XACTS // (b) - generated by the second slot, restart_lsn can set here
> ```

Yes. If I understand it correctly, even when the startup process is
processing the second STATUS_CHANGE record (i.e., disabling logical
decoding), the synced slot uses the corresponding remote slot's
restart_lsn, i.e., (b). I believe that if the standby has not yet
received the RUNNING_XACTS record (b) at that point, the slotsync worker
skips syncing the slot (see the check at the top of
synchronize_one_slot()).

>
> > how efficiently to fix it. I've considered a simple idea that the
> > slotsync worker checks IsLogicalDecodingEnabled() before trying to
> > sync one logical slot. However, it doesn't solve the race condition;
> > the startup process can disable logical decoding right after the
> > slotsync passed the check, in which case users would see the logical
> > slot is created after logical decoding is disabled.
>
> So... even if we can add check in decoding functions, the startup process can
> disable the logical decoding after that, is it also right?

I think so. The IsLogicalDecodingEnabled() check only determines
whether a process can start logical decoding; it doesn't cover logical
decoding processes that are already running. The slot invalidation
mechanism is responsible for those.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
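
The check-then-act race discussed above can be illustrated with a small toy model in C. This is only a sketch under stated assumptions: every name in it (`ToySlot`, `toy_can_start_decoding()`, `toy_disable_decoding()`, and the two demo functions) is hypothetical and is neither PostgreSQL's real API nor the patch's implementation. It models an entry gate that only answers "may a process start decoding now?" and a disable step that invalidates the slots existing at that moment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. Every name below is hypothetical; none of this is
 * PostgreSQL's real API or the patch's implementation. */
typedef struct ToySlot
{
    bool in_use;       /* slot has been created */
    bool invalidated;  /* slot can no longer be used for decoding */
} ToySlot;

/* Stand-in for the shared "logical decoding enabled" state. */
static bool toy_decoding_enabled = true;

/* Entry gate: answers only "may a process start decoding now?". */
static bool
toy_can_start_decoding(void)
{
    return toy_decoding_enabled;
}

/* Stand-in for the startup process replaying "STATUS_CHANGE false":
 * flip the flag, then invalidate every slot that exists right now. */
static void
toy_disable_decoding(ToySlot *slots, size_t nslots)
{
    assert(slots != NULL);
    toy_decoding_enabled = false;
    for (size_t i = 0; i < nslots; i++)
        if (slots[i].in_use)
            slots[i].invalidated = true;
}

/* Returns true if the interleaving leaves a live, non-invalidated slot
 * behind while decoding is disabled (the race described in the thread). */
bool
toy_race_leaves_live_slot(void)
{
    ToySlot slots[2] = {{true, false}, {false, false}};
    bool    passed;

    toy_decoding_enabled = true;
    passed = toy_can_start_decoding();   /* 1. worker passes the gate */
    toy_disable_decoding(slots, 2);      /* 2. startup disables decoding */
    if (passed)
        slots[1].in_use = true;          /* 3. worker creates the slot anyway */

    return !toy_decoding_enabled && slots[1].in_use && !slots[1].invalidated;
}

/* Returns true if a slot that was already in use when decoding was
 * disabled ended up invalidated. */
bool
toy_running_slot_is_invalidated(void)
{
    ToySlot slots[1] = {{true, false}};

    toy_decoding_enabled = true;
    toy_disable_decoding(slots, 1);
    return slots[0].invalidated;
}
```

In this model, calling `toy_race_leaves_live_slot()` from a `main()` shows why the gate alone is not enough: the check passes, the state flips, and the slot is still created. `toy_running_slot_is_invalidated()` shows the complementary path, where a slot that already existed is handled by invalidation instead of by the gate.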