Re: Clear logical slot's 'synced' flag on promotion of standby
From | Ashutosh Sharma |
---|---|
Subject | Re: Clear logical slot's 'synced' flag on promotion of standby |
Date | |
Msg-id | CAE9k0P=ODwH5aB-skBgffvDS010Jo1h=wGpLpE0aCqnqfx2+xg@mail.gmail.com |
In reply to | Re: Clear logical slot's 'synced' flag on promotion of standby (shveta malik <shveta.malik@gmail.com>) |
Responses | Re: Clear logical slot's 'synced' flag on promotion of standby |
List | pgsql-hackers |
On Thu, Sep 11, 2025 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Sep 9, 2025 at 2:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Hi,
> >
> >
> > + * required resources. Clear any leftover 'synced' flags on replication
> > + * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
> > + * state check ensures that this code is only reached when a standby
> > + * server crashes during promotion.
> > */
> > StartupReplicationSlots();
> > + if (ControlFile->state == DB_IN_CRASH_RECOVERY)
> >
> > I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
> > state. For example, if the primary is already in crash recovery and
> > crashes again while in crash recovery, it will restart in the
> > DB_IN_CRASH_RECOVERY state, no?
> >
>
> Yes, good point. I think we can differentiate the two cases based on
> the timeline change. A regular primary won't have a timeline change,
> whereas a promoted standby that failed during promotion will show a
> timeline change immediately upon restart. Thoughts?
>

Will there be any issues if we clear the sync status immediately after
the standby.signal file is removed from the standby server? We could
maybe introduce a temporary "promote.inprogress" marker file on disk
before removing standby.signal. The sequence would be:

1) Create promote.inprogress.
2) Unlink standby.signal.
3) Clear the sync slot status.
4) Remove promote.inprogress.

This way, if the server crashes after standby.signal is removed but
before the sync status is cleared, the presence of promote.inprogress
would indicate that the standby was in the middle of promotion and
crashed before slot cleanup. On restart, we could use that marker to
detect the incomplete promotion and finish clearing the sync flags.

If the crash happens at a later stage, the server will no longer start
as a standby anyway, and by then the sync flags would already have been
reset.

This is just a thought and it may sound a bit naive. Let me know if I am
overlooking something.

--
With Regards,
Ashutosh Sharma.
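
The four-step sequence proposed above can be modelled in a few lines of C to check which states a crash can leave behind. This is only a rough sketch, not PostgreSQL code: `DataDirModel`, `promote_model`, `startup_model` and `MAX_SLOTS` are hypothetical names, and the on-disk files are replaced by booleans in memory. The message does not say what should happen if the marker is found while standby.signal still exists (a crash between steps 1 and 2); the sketch assumes the stale marker is simply discarded and the server remains a standby.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 4             /* hypothetical fixed number of slots */

/* Hypothetical in-memory stand-in for the relevant on-disk state. */
typedef struct DataDirModel
{
    bool standby_signal;          /* standby.signal exists */
    bool promote_inprogress;      /* proposed promote.inprogress exists */
    bool slot_synced[MAX_SLOTS];  /* 'synced' flag of each logical slot */
} DataDirModel;

/* Reset the 'synced' flag on every modelled slot. */
void
clear_synced_flags(DataDirModel *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < MAX_SLOTS; i++)
        d->slot_synced[i] = false;
}

/*
 * Run the proposed promotion sequence. crash_after_step simulates a crash
 * right after step 1, 2 or 3 has completed; any other value means no crash.
 * Returns true only if all four steps ran.
 */
bool
promote_model(DataDirModel *d, int crash_after_step)
{
    assert(d != NULL);

    d->promote_inprogress = true;       /* 1) create promote.inprogress */
    if (crash_after_step == 1)
        return false;

    d->standby_signal = false;          /* 2) unlink standby.signal */
    if (crash_after_step == 2)
        return false;

    clear_synced_flags(d);              /* 3) clear the sync slot status */
    if (crash_after_step == 3)
        return false;

    d->promote_inprogress = false;      /* 4) remove promote.inprogress */
    return true;
}

/*
 * Startup check. Returns true if an interrupted promotion was detected
 * and the slot cleanup was finished here.
 */
bool
startup_model(DataDirModel *d)
{
    assert(d != NULL);

    if (!d->promote_inprogress)
        return false;                   /* nothing was interrupted */

    if (d->standby_signal)
    {
        /* Assumption: crash before step 2, so drop the stale marker. */
        d->promote_inprogress = false;
        return false;
    }

    /* Crash after step 2 or 3: finish the cleanup, then drop the marker. */
    clear_synced_flags(d);
    d->promote_inprogress = false;
    return true;
}
```

Calling `promote_model(&d, 2)` followed by `startup_model(&d)` covers the window described above: standby.signal is already gone, the flags are still set, and the marker is what lets startup notice and clear them. Because clearing is idempotent, a crash after step 3 is handled by the same branch.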