Re: Issue with logical replication slot during switchover
| От | Fabrice Chapuis |
|---|---|
| Тема | Re: Issue with logical replication slot during switchover |
| Дата | |
| Msg-id | CAA5-nLC=3QMehrktEB1MOrAk_ynG_u=pD_iUOup4MmkN2Xv0sA@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: Issue with logical replication slot during switchover (Alexander Kukushkin <cyberdemn@gmail.com>) |
| Ответы |
Re: Issue with logical replication slot during switchover
|
| Список | pgsql-hackers |
Hi Alexander,
I understand your use case but I think it won't work like this because of how is implemented get_local_synced_slots()
this function will return slots which synced flag to true only.
in point 4 Start node1: synced flag is false and failover flag is true then it won't enter the loop and the slot will not be dropped
static void
drop_local_obsolete_slots(List *remote_slot_list)
{
List *local_slots = get_local_synced_slots();
foreach_ptr(ReplicationSlot, local_slot, local_slots)
{
/* Drop the local slot if it is not required to be retained. */
if (!local_sync_slot_required(local_slot, remote_slot_list)) => will check of remote slot exist and if local slot is not invalidated
{
bool synced_slot;
...
We could change the function get_local_synced_slots to make it work
get_local_synced_slots(void)
{
List *local_slots = NIL;
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
{
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
/* Check if it is a synchronized slot */
if (s->in_use && (s->data.failover || s->data.synced))
{
Assert(SlotIsLogical(s));
local_slots = lappend(local_slots, s);
Thanks for your feedback
Regards,
Fabrice
....
On Mon, Nov 3, 2025 at 8:51 AM Alexander Kukushkin <cyberdemn@gmail.com> wrote:
Hi Fabrice,On Fri, 31 Oct 2025 at 17:45, Fabrice Chapuis <fabrice636861@gmail.com> wrote:Thanks for your large analyze and explanation Alexander. If we go in the direction you propose and leave the option to use a suplementary flag allow_overwrite, may I propose you some modifications in the patch v0 you have attached:Why modifying this function?drop_local_obsolete_slots(List *remote_slot_list)Think about the following case:1. We have node1 (primary) and node2 (standby), with slot foo(failover=true)2. We stop node1 for maintenance and promote node23. DROP replication slot on node24. Start node1In this scenario the logical slot "foo" will be left unattended and it needs to be dropped, because it doesn't exist on the primary.As I said earlier, IMO it is only the "failover" flag that effectively indicates the purpose of the slot. The sync flag is purely informative.Regards,--Alexander Kukushkin
В списке pgsql-hackers по дате отправления: