Re: Introduce XID age and inactive timeout based replication slot invalidation
From | shveta malik |
---|---|
Subject | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Date | |
Msg-id | CAJpy0uCQb7Mt+975CHDbrObyWo+o855QRjaBw_KmZMLkS2PEFA@mail.gmail.com |
In response to | Re: Introduce XID age and inactive timeout based replication slot invalidation (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
List | pgsql-hackers |
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > I have one concern, for synced slots on standby, how do we disallow
> > > invalidation due to inactive-timeout immediately after promotion?
> > >
> > > For synced slots, last_inactive_time and inactive_timeout are both
> > > set. Let's say I bring down primary for promotion of standby and then
> > > promote standby, there are chances that it may end up invalidating
> > > synced slots (considering standby is not brought down during promotion
> > > and thus inactive_timeout may already be past 'last_inactive_time').
> > >
> >
> > On standby, if we decide to maintain valid last_inactive_time for
> > synced slots, then invalidation is correctly restricted in
> > InvalidateSlotForInactiveTimeout() for synced slots using the check:
> >
> >         if (RecoveryInProgress() && slot->data.synced)
> >                 return false;
> >
> > But immediately after promotion, we can not rely on the above check
> > and thus possibility of synced slots invalidation is there. To
> > maintain consistent behavior regarding the setting of
> > last_inactive_time for synced slots, similar to user slots, one
> > potential solution to prevent this invalidation issue is to update the
> > last_inactive_time of all synced slots within the ShutDownSlotSync()
> > function during FinishWalRecovery(). This approach ensures that
> > promotion doesn't immediately invalidate slots, and henceforth, we
> > possess a correct last_inactive_time as a basis for invalidation going
> > forward. This will be equivalent to updating last_inactive_time during
> > restart (but without actual restart during promotion).
> >
> > The plus point of maintaining last_inactive_time for synced slots
> > could be, this can provide data to the user on when last time the sync
> > was attempted on that particular slot by background slot sync worker
> > or SQL function. Thoughts?
>
> Please find the attached v21 patch implementing the above idea. It
> also has changes for renaming last_inactive_time to inactive_since.
>

Thanks for the patch. I have tested this patch alone, and it does what
it says. One additional thing which I noticed is that now it sets
inactive_since for temp slots as well, but that idea looks fine to me.

I could not test 'invalidation on promotion bug' with this change, as
that needed rebasing of the rest of the patches.

Few trivial things:

1)
Commit msg:

ensures the value is set to current
timestamp during the shutdown to help correctly interpret the time if
the standby gets promoted without a restart.

shutdown --> shutdown of slot sync worker (as it was not clear if it
is instance shutdown or something else)

2)
'The time since the slot has became inactive'.

has became-->has become
or just became

Please check it in all the files. There are multiple places.

thanks
Shveta
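
For readers following the promotion concern quoted above, here is a minimal standalone sketch of the behavior being discussed. It is not PostgreSQL code: ToySlot, toy_invalidate_for_inactive_timeout() and toy_reset_synced_slots_at_promotion() are hypothetical names made up for illustration, times are plain integer seconds, and the in_recovery flag stands in for RecoveryInProgress(). It only models the idea that a stale inactive_since on a synced slot can trip the timeout right after promotion, and that stamping synced slots with the current time when slot sync shuts down avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a replication slot. */
typedef struct ToySlot
{
    bool    synced;          /* slot was created by slot sync on a standby */
    int64_t inactive_since;  /* seconds; 0 means "not set" */
    bool    invalidated;
} ToySlot;

/*
 * Toy model of the timeout check. Returns true if the slot was
 * invalidated by this call.
 */
bool
toy_invalidate_for_inactive_timeout(ToySlot *slot, bool in_recovery,
                                    int64_t now, int64_t timeout)
{
    assert(slot != NULL);

    /* While still a standby, synced slots are never invalidated. */
    if (in_recovery && slot->synced)
        return false;

    /* Timeout disabled, or no inactivity timestamp recorded. */
    if (timeout <= 0 || slot->inactive_since == 0)
        return false;

    if (now - slot->inactive_since >= timeout)
    {
        slot->invalidated = true;
        return true;
    }
    return false;
}

/*
 * Toy model of the proposed fix: when slot sync is shut down at
 * promotion, stamp every synced slot with the current time so the
 * timeout is measured from promotion onward.
 */
void
toy_reset_synced_slots_at_promotion(ToySlot *slots, size_t nslots,
                                    int64_t now)
{
    assert(slots != NULL || nslots == 0);

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].synced)
            slots[i].inactive_since = now;
    }
}
```

In this model, a synced slot with an old inactive_since is safe while in_recovery is true, gets invalidated on the first check after promotion if nothing else is done, and survives if toy_reset_synced_slots_at_promotion() runs first.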