Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

Поиск
Список
Период
Сортировка
От Melih Mutlu
Тема Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Дата
Msg-id CAGPVpCRMVr0F7DAErbB9fnaqGpGV4t1yf0KWD+WPFe9bz80ykQ@mail.gmail.com
обсуждение исходный текст
Ответ на RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Ответы Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (shveta malik <shveta.malik@gmail.com>)
RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Список pgsql-hackers
Hi,

Thanks for your review. 
Attached updated versions of the patches.

wangw.fnst@fujitsu.com <wangw.fnst@fujitsu.com>, 17 Oca 2023 Sal, 14:15 tarihinde şunu yazdı:
On Wed, Jan 11, 2023 4:31 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
v3-0001* patch
===============

1. typedefs.list
I think we also need to add "walrcv_slot_snapshot_fn" to this file.

Done.
 
v7-0002* patch
===============
1. About function ReplicationOriginNameForLogicalRep()
Do we need to modify the API of this function? I think the original API could
also meet the current needs. Since this is not a static function, I think it
seems better to keep the original API if there is no reason. Please let me know
if I'm missing something.

You're right. 
I still need to modify ReplicationOriginNameForLogicalRep. Origin names are not tied to relations anymore, so their name doesn't need to include relation id.
But I didn't really need to change the function signature. I reverted that part of the change in the updated version of the patch.
 
2. Comment atop the function GetSubscriptionRelReplicationSlot

Done
 
3. typo
+                        * At this point, there shouldn't be any existing replication
+                        * origin wit the same name.

Done.
 
4. In function CreateSubscription
+       values[Anum_pg_subscription_sublastusedid - 1] = Int64GetDatum(1);

I think it might be better to initialize this field to NULL or 0 here.
Because in the patch, we always ignore the initialized value when launching
the sync worker in the function process_syncing_tables_for_apply. And I think
we could document in pg-doc that this value means that no tables have been
synced yet.

I changed it to start from 0 and added a line into the related doc to indicate that 0 means that no table has been synced yet.
 
5. New member "created_slot" in structure LogicalRepWorker
+       /*
+        * Indicates if the sync worker created a replication slot or it reuses an
+        * existing one created by another worker.
+        */
+       bool            created_slot;

I think the second half of the sentence looks inaccurate.
Because I think this flag could be false even when we reuse an existing slot
created by another worker. Assuming the first run for the worker tries to sync
a table which is synced by another sync worker before, and the relstate is set
to SUBREL_STATE_FINISHEDCOPY by another sync worker, I think this flag will not
be set to true. (see function LogicalRepSyncTableStart)

So, what if we simplify the description here and just say that this worker
already has it's default slot?

If I'm not missing something and you agree with this, please also kindly modify
the relevant comment atop the if-statement (!MyLogicalRepWorker->created_slot)
in the function LogicalRepSyncTableStart.

This "created_slot" indicates whether the current worker has created a replication slot for its own use. If so, created_slot will be true, otherwise false.
Let's say the tablesync worker has not created its own slot yet in its previous runs or this is its first run. And the worker decides to reuse an existing replication slot (which created by another tablesync worker). Then created_slot is expected to be false. Because this particular tablesync worker has not created its own slot yet in either of its runs. 

In your example, the worker is in its first run and begin to sync a table whose state is FINISHEDCOPY. If the table's state is FINISHEDCOPY then the table should already have a replication slot created for its own sync process. The worker will want to reuse that existing replication slot for this particular table and it will not create a new replication slot. So created_slot will be false, because the worker has not actually created any replication slot yet. 

Basically, created_slot is set to true only if "walrcv_create_slot" is called by the tablesync worker any time during its lifetime. Otherwise, it's possible that the worker can use existing replication slots for each table it syncs. (e.g. if all the tables that the worker has synced were in FINISHEDCOPY  state, then the worker will not need to create a new slot).

Does it make sense now? Maybe I need to improve comments to make it clearer.

Best,
--
Melih Mutlu
Microsoft
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Hayato Kuroda (Fujitsu)"
Дата:
Сообщение: RE: Perform streaming logical transactions by background workers and parallel apply
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: Generating code for query jumbling through gen_node_support.pl