Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Дата
Msg-id CAA4eK1LrUX5F1Bj8vAW-tvVt3isn+xVj8JdRxQA2xeCYz0FB8w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (Melih Mutlu <m.melihmutlu@gmail.com>)
Ответы Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (Melih Mutlu <m.melihmutlu@gmail.com>)
Список pgsql-hackers
On Fri, Jul 8, 2022 at 10:26 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
>> I think that won't work because each time on restart the slot won't be
>> fixed. Now, it is possible that we may drop the wrong slot if that
>> state of copying rel is SUBREL_STATE_DATASYNC. Also, it is possible
>> that while creating a slot, we fail because the same name slot already
>> exists due to some other worker which has created that slot has been
>> restarted. Also, what about origin_name, won't that have similar
>> problems? Also, if the state is already SUBREL_STATE_FINISHEDCOPY, if
>> the slot is not the same as we have used in the previous run of a
>> particular worker, it may start WAL streaming from a different point
>> based on the slot's confirmed_flush_location.
>
>
> You're right Amit. In case of a failure, tablesync phase of a relation may continue with different worker and
replicationslot due to this change in naming. 
> Seems like the same replication slot should be used from start to end for a relation during tablesync. However,
creating/droppingreplication slots can be a major overhead in some cases. 
> It would be nice if these slots are somehow reused.
>
> To overcome this issue, I've been thinking about making some changes in my patch.
> So far, my proposal would be as follows:
>
> Slot naming can be like pg_<sub_id>_<worker_pid> instead of pg_<sub_id>_<slot_index>. This way each worker can use
thesame replication slot during their lifetime. 
> But if a worker is restarted, then it will switch to a new replication slot since its pid has changed.
>

I think using worker_pid also has similar risks of mixing slots from
different workers because after restart same worker_pid could be
assigned to a totally different worker. Can we think of using a unique
64-bit number instead? This will be allocated when each workers
started for the very first time and after that we can refer catalog to
find it as suggested in the idea below.

> pg_subscription_rel catalog can store replication slot name for each non-ready relation. Then we can find the slot
neededfor that particular relation to complete tablesync. 
>

Yeah, this is worth investigating. However, instead of storing the
slot_name, we can store just the unique number (as suggested above).
We should use the same for the origin name as well.

> If a worker syncs a relation without any error, everything works well and this new replication slot column from the
catalogwill not be needed. 
> However if a worker is restarted due to a failure, the previous run of that worker left its slot behind since it did
notexit properly. 
> And the restarted worker (with a different pid) will see that the relation is actually in  SUBREL_STATE_FINISHEDCOPY
andwant to proceed for the catchup step. 
> Then the worker can look for that particular relation's replication slot from pg_subscription_rel catalog (slot name
shouldbe there since relation state is not ready). And tablesync can proceed with that slot. 
>
> There might be some cases where some replication slots are left behind. An example to such cases would be when the
slotis removed from pg_subscription_rel catalog and detached from any relation, but tha slot actually couldn't be
droppedfor some reason. For such cases, a slot cleanup logic is needed. This cleanup can also be done by tablesync
workers.
> Whenever a tablesync worker is created, it can look for existing replication slots that do not belong to any relation
andany worker (slot name has pid for that), and drop those slots if it finds any. 
>

This sounds tricky. Why not first drop slot/origin and then detach it
from pg_subscription_rel? On restarts, it is possible that we may
error out after dropping the slot or origin but before updating the
catalog entry but in such case we can ignore missing slot/origin and
detach them from pg_subscription_rel. Also, if we use the unique
number as suggested above, I think even if we don't remove it after
the relation state is ready, it should be okay.

> What do you think about this new way of handling slots? Do you see any points of concern?
>
> I'm currently working on adding this change into the patch. And would appreciate any comment.
>

Thanks for making progress!

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Aleksander Alekseev
Дата:
Сообщение: Re: [PATCH] Compression dictionaries for JSONB
Следующее
От: John Naylor
Дата:
Сообщение: Re: proposal: Allocate work_mem From Pool