Re: Synchronizing slots from primary to standby
From | Drouvot, Bertrand |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | 538ddca6-cf74-4a9c-95d6-dd05af24070c@gmail.com |
In reply to | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Synchronizing slots from primary to standby |
List | pgsql-hackers |
Hi,

On 11/10/23 6:41 AM, Amit Kapila wrote:
> On Thu, Nov 9, 2023 at 7:29 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>
> Are you saying that we change the state of the already existing slot
> on standby?

Yes.

> And, such a state would indicate that we are trying to
> sync the slot with the same name from the primary. Is that what you
> have in mind?

Yes.

> If so, it appears quite odd to me to have such a state
> and also set it in some unrelated slot that just has the same name.
>
> I understand your point that we can allow other slots to proceed but
> it is also important to not create any sort of inconsistency that can
> surprise user after failover.

But even if we ERROR out instead of emitting a WARNING, the user would still
need to be notified of (or monitor for) such errors. I agree that they would
then probably find out earlier, because the slot sync mechanism would be
stopped, but it is still not "guaranteed" (especially if there are no other
"working" synced slots around).

And if they do not, then there is still a risk of using this slot after a
failover, thinking it is a "synced" slot.

Giving it more thought, what about using a dedicated/reserved naming
convention for synced slots, like sync_<primary_slot_name> or such, and then:

- prevent users from creating sync_<whatever> slots on the standby
- sync <slot> on the primary to sync_<slot> on the standby
- during failover, rename sync_<slot> to <slot>, and if <slot> exists then
emit a WARNING and keep sync_<slot> in place.

That way both slots are still in place (the manually created <slot> and the
sync_<slot>) and one could decide what to do with them.

I don't think we'd need to worry about the cases where a sync_ slot could
already have been created before we "prevent" the creation of such slots.
Indeed, I think they would not survive pg_upgrade prior to 17 -> 18 upgrades.
So it looks like we'd be good as long as we are able to prevent sync_ slot
creation on 17.

Thoughts?
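To make the three rules above concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: the SlotTable type, every function name, and the size limits are hypothetical and exist only for illustration, and the "sync_" prefix is just the convention suggested above. It assumes a toy in-memory table of slot names and shows (1) rejecting user-created sync_ names, (2) mapping <slot> to sync_<slot>, and (3) the rename-or-WARN decision at failover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SYNC_PREFIX   "sync_"  /* assumed reserved prefix, per the proposal */
#define SLOT_NAME_MAX 64       /* arbitrary limit for this sketch */
#define MAX_SLOTS     16       /* arbitrary toy table size */

/* Hypothetical stand-in for the standby's set of replication slots. */
typedef struct SlotTable
{
    char names[MAX_SLOTS][SLOT_NAME_MAX];
    int  count;
} SlotTable;

/* True if the name falls inside the reserved sync_ namespace. */
bool slot_name_is_reserved(const char *name)
{
    assert(name != NULL);
    return strncmp(name, SYNC_PREFIX, strlen(SYNC_PREFIX)) == 0;
}

/* Index of the named slot, or -1 if it does not exist. */
int slot_find(const SlotTable *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return i;
    return -1;
}

/* Internal helper: add a name if it fits and is not already present. */
bool slot_add(SlotTable *t, const char *name)
{
    if (t->count >= MAX_SLOTS || strlen(name) >= SLOT_NAME_MAX ||
        slot_find(t, name) >= 0)
        return false;
    strcpy(t->names[t->count++], name);
    return true;
}

/* Build "sync_<slot>" into out; false if it would not fit. */
bool synced_name_for(const char *primary_name, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s%s", SYNC_PREFIX, primary_name);
    return n >= 0 && (size_t) n < outlen;
}

/* Rule 1: user-facing creation on the standby refuses reserved names. */
bool user_create_slot(SlotTable *t, const char *name)
{
    if (slot_name_is_reserved(name))
        return false;
    return slot_add(t, name);
}

/* Rule 2: the sync worker stores primary slot <slot> as sync_<slot>. */
bool sync_slot_from_primary(SlotTable *t, const char *primary_name)
{
    char synced[SLOT_NAME_MAX];

    if (!synced_name_for(primary_name, synced, sizeof synced))
        return false;
    if (slot_find(t, synced) >= 0)
        return true;            /* already synced earlier */
    return slot_add(t, synced);
}

/*
 * Rule 3: at failover, rename sync_<slot> to <slot>. If <slot> already
 * exists, warn and keep both. Returns true only if the rename happened.
 */
bool promote_synced_slot(SlotTable *t, const char *primary_name)
{
    char synced[SLOT_NAME_MAX];
    int  idx;

    if (!synced_name_for(primary_name, synced, sizeof synced))
        return false;
    idx = slot_find(t, synced);
    if (idx < 0)
        return false;           /* nothing to promote */
    if (slot_find(t, primary_name) >= 0)
    {
        fprintf(stderr, "WARNING: slot \"%s\" already exists, keeping \"%s\"\n",
                primary_name, synced);
        return false;
    }
    /* Safe: primary_name is strictly shorter than the synced name. */
    strcpy(t->names[idx], primary_name);
    return true;
}
```

With a user-created slot "a" on the standby and primary slots "a" and "b" synced, promoting "b" renames sync_b to b, while promoting "a" only emits the WARNING and leaves both a and sync_a for the user to sort out.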
> Also, the current coding doesn't ensure
> we will always give WARNING. If we see the below code that deals with
> this WARNING,
>
> +  /* User created slot with the same name exists, emit WARNING. */
> +  else if (found && s->data.sync_state == SYNCSLOT_STATE_NONE)
> +  {
> +      ereport(WARNING,
> +              errmsg("not synchronizing slot %s; it is a user created slot",
> +                     remote_slot->name));
> +  }
> +  /* Otherwise create the slot first. */
> +  else
> +  {
> +      TransactionId xmin_horizon = InvalidTransactionId;
> +      ReplicationSlot *slot;
> +
> +      ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
> +                            remote_slot->two_phase, false);
>
> I think this is not a solid check to ensure that the slot existed
> before. Because it could be created as soon as the slot sync worker
> invokes ReplicationSlotCreate() here.

Agree.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com