Re: Clear logical slot's 'synced' flag on promotion of standby
From | shveta malik |
---|---|
Subject | Re: Clear logical slot's 'synced' flag on promotion of standby |
Date | |
Msg-id | CAJpy0uCqDM_AX3mL38PotB4M2ahoPYCfYeH3pT0kbYXsQ9ga4w@mail.gmail.com |
In response to | Re: Clear logical slot's 'synced' flag on promotion of standby (Ashutosh Sharma <ashu.coek88@gmail.com>) |
Responses | Re: Clear logical slot's 'synced' flag on promotion of standby<br>Re: Clear logical slot's 'synced' flag on promotion of standby |
List | pgsql-hackers |
On Tue, Sep 9, 2025 at 2:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi,
>
> + * required resources. Clear any leftover 'synced' flags on replication
> + * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
> + * state check ensures that this code is only reached when a standby
> + * server crashes during promotion.
> */
> StartupReplicationSlots();
> + if (ControlFile->state == DB_IN_CRASH_RECOVERY)
>
> I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
> state. For example, if the primary is already in crash recovery and
> crashes again while in crash recovery, it will restart in the
> DB_IN_CRASH_RECOVERY state, no?
>

Yes, good point. I think we can differentiate the two cases based on the timeline change. A regular primary won't have a timeline change, whereas a promoted standby that failed during promotion will show a timeline change immediately upon restart. Thoughts?

In the worst-case scenario, even if we end up running the Reset function during a regular primary's crash recovery, it shouldn't cause any harm. (That said, I'm not suggesting we shouldn't fix it).

What concerns me more is the possibility of running it on a regular standby, as it could disrupt slot synchronization. I attempted to simulate a scenario where a regular standby ends up in DB_IN_CRASH_RECOVERY after a crash, but I couldn't reproduce it. Do you know of any situation where this could happen? The absence of comments for these states makes it challenging to follow the flow.

> --
>
> With this change are we saying that on primary the synced flag must be
> always false. Because the postgres doc on pg_replication_slots says:
>
> "The value of this column has no meaning on the primary server; the
> column value on the primary is default false for all slots but may (if
> leftover from a promoted standby) also be true."
>

The doc needs change.

thanks
Shveta
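
For illustration only, the timeline-based distinction suggested above could be sketched roughly as below. This is a standalone model, not PostgreSQL code: the enum, the struct, its two timeline fields, and the function name are all hypothetical, and how the server would actually detect a timeline change at this point in startup is left as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the control-file states discussed in the
 * thread; these are NOT PostgreSQL's actual definitions. */
typedef enum
{
    SKETCH_DB_IN_PRODUCTION,
    SKETCH_DB_IN_ARCHIVE_RECOVERY,
    SKETCH_DB_IN_CRASH_RECOVERY
} SketchDBState;

/* Hypothetical summary of what is known at startup. */
typedef struct
{
    SketchDBState state;          /* state recorded before the restart */
    uint32_t      checkpoint_tli; /* assumed: timeline of the last checkpoint */
    uint32_t      current_tli;    /* assumed: timeline the server restarts on */
} SketchStartupInfo;

/*
 * Return true only for the case the patch targets: the server restarts
 * in crash recovery AND the timeline has changed, which per the proposal
 * indicates a standby that crashed during promotion. A regular primary
 * that crashed again during crash recovery stays on the same timeline,
 * so it is excluded.
 */
static bool
should_reset_synced_flags(const SketchStartupInfo *info)
{
    if (info->state != SKETCH_DB_IN_CRASH_RECOVERY)
        return false;

    return info->current_tli != info->checkpoint_tli;
}
```

Under this model, a primary that re-crashes during crash recovery (same timeline) and a standby in archive recovery both leave the 'synced' flags untouched, and only the failed-promotion case triggers the reset.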