Re: Handle infinite recursion in logical replication setup

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Handle infinite recursion in logical replication setup
Дата
Msg-id CAA4eK1KRDVqFkA4Z_ywTPhBNAnPKAnE-baQ2Q2taXUx_LSB8_A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Handle infinite recursion in logical replication setup  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
On Wed, May 18, 2022 at 10:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 4, 2022 at 12:17 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v13 patch has the changes for the same.
> >
>
> Few comments on v13-0001
> ======================
>

Few comments on v13-0002
===========================
1.
The steps
+     to create a two-node bidirectional replication are given below:
+   </para>

The steps given after this will be successful only when there is no
data in any of the nodes and that is not clear by reading docs.

2.
+  <sect2 id="add-new-node-data-in-existing-node">
+   <title>Adding a new node when data is present in the existing nodes</title>
+    <para>
+     Adding a new node <literal>node3</literal> to the existing
+     <literal>node1</literal> and <literal>node2</literal> when data is present
+     in existing nodes <literal>node1</literal> and <literal>node2</literal>
+     needs similar steps. The only change required here is that
+     <literal>node3</literal> should create a subscription with
+     <literal>copy_data = force</literal> to one of the existing nodes to
+     receive the existing data during initial data synchronization.
+   </para>

I think the steps for these require the user to lock the required
tables/database (or in some other way hold the operations on required
tables) for node-2 till the time setup is complete, otherwise, node-3
might miss some data. It seems the same is also missing in the
section: "Adding a new node when data is present in the new node".

3.
+
+  <sect2 id="add-node-data-present-in-new-node">
+   <title>Adding a new node when data is present in the new node</title>
...
...

+   <para>
+    Create a subscription in <literal>node1</literal> to subscribe to
+    <literal>node3</literal>. Use <literal>copy_data</literal> specified as
+    <literal>force</literal> so that the existing table data is copied during
+    initial sync:
+<programlisting>
+node1=# CREATE SUBSCRIPTION sub_node1_node3
+node1-# CONNECTION 'dbname=foo host=node3 user=repuser'
+node1-# PUBLICATION pub_node3
+node1-# WITH (copy_data = force, local_only = on);
+CREATE SUBSCRIPTION
+</programlisting></para>
+
+   <para>
+    Create a subscription in <literal>node2</literal> to subscribe to
+    <literal>node3</literal>. Use <literal>copy_data</literal> specified as
+    <literal>force</literal> so that the existing table data is copied during
+    initial sync:
+<programlisting>
+node2=# CREATE SUBSCRIPTION sub_node2_node3
+node2-# CONNECTION 'dbname=foo host=node3 user=repuser'
+node2-# PUBLICATION pub_node3
+node2-# WITH (copy_data = force, local_only = on);
+CREATE SUBSCRIPTION
+</programlisting></para>

Why do we need to use "copy_data = force" here? AFAIU, unless, we
create any subscription on node-3, we don't need the 'force' option.

4. We should have a generic section to explain how users can add a new
node using the new options to the existing set of nodes in all cases.
For example, say when the existing set of nodes has some data and the
new node also has some pre-existing data. I think the basic steps are
something like: a. create a required publication(s) on the new node.
(b) create subscriptions on existing nodes pointing to publication on
the new node with the local_only option as true and copy_data = on.
(c) wait for data to be copied from the new node to existing nodes.
(d) Truncate the data on the new node. (e) create subscriptions
corresponding to each of the existing publisher nodes on the new node
with local_only as true and copy_data = force for one of the nodes.
(f) One needs to ensure that there is no new activity on required
tables/database (either by locking or in some other way) in all the
nodes for which copy_data option is kept as false while creating
subscriptions in the previous step to avoid any data loss.

We also need to mention in Notes that as all operations are not
transactional, user is advised to take backup of existing data to
avoid any inconsistency.

5.
 * It is quite possible that subscriber has not yet pulled data to
+ * the tables, but in ideal cases the table data will be subscribed.
+ * To keep the code simple it is not checked if the subscriber table
+ * has pulled the data or not.
+ */
+ if (copydata == COPY_DATA_ON && local_only && !slot_attisnull(slot, 3))

Sorry, but I don't understand what you intend to say by the above
comment. Can you please explain?

6. I feel we can explain the case with only two nodes in the commit
message, why do we need to use a three-node case?

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Antonin Houska
Дата:
Сообщение: Re: Privileges on PUBLICATION
Следующее
От: Ranier Vilela
Дата:
Сообщение: Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)