Re: Handle infinite recursion in logical replication setup

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: Handle infinite recursion in logical replication setup
Дата
Msg-id CAHut+PuOAYuP2RoObntYp_wsB+xsP13heDaQyN_-dCVUx1GeEA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Ответы Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
Below are some review comments for the patch v18-0003

1. Commit message

This patch does a couple of things:
change 1) Checks and throws an error if the publication tables were also
subscribing data in the publisher from other publishers when copy_data
parameter is specified as 'ON' and origin parameter is specified as
'local'.
change 2) Adds force value for copy_data parameter.

SUGGESTION
This patch does a couple of things:
change 1) Checks and throws an error if 'copy_data = on' and 'origin =
local' but the publication tables were also subscribing data in the
publisher from other publishers.
change 2) Adds 'force' value for copy_data parameter.

~~~

2. Commit message - about the example

All my following review comments for the commit message are assuming
that the example steps are as they are written in the patch, but
actually I felt that the example might be more complex than it needed
to be: e.g
- You don’t really need the node2 to have data
- Therefore you don’t need all the added TRUNCATE complications

E.g. I think you only need node1 (with data) and node2 (no data).

Then node1 subscribes node2 with (origin=local, copy_data=off).
Then node2 subscribes node1 with (origin=local, copy_data=on).
- Demonstrates exception happens because node1 already had a subscription
- Demonstrates need for the copy_data=force to override that exception

So please consider using a simpler example for this commit message

~~~

3. Commit message

The below help us understand how the first change will be useful:

If copy_data parameter was used with 'on' in step 5, then an error
will be thrown
to alert the user to prevent inconsistent data being populated:
CREATE SUBSCRIPTION sub_node2_node1 CONNECTION '<node1 details>'
PUBLICATION pub_node1 WITH (copy_data = on, origin = local);
ERROR:  CREATE/ALTER SUBSCRIPTION with origin and copy_data as true is not
allowed when the publisher might have replicated data

SUGGESTION
The steps below help to demonstrate how the new exception is useful:

The initial copy phase has no way to know the origin of the row data,
so if 'copy_data = on' in the step 5 below, then an error will be
thrown to prevent any potentially non-local data from being copied:
<blank line>
e.g
CREATE SUBSCRIPTION ...

~~~

4. Commit message

The following will help us understand how the second change will be useful:
Let's take a simple case where user is trying to setup bidirectional logical
replication between node1 and node2 where the two nodes have some pre-existing
data like below:
node1: Table t1 (c1 int) has data 11, 12, 13, 14
node2: Table t1 (c1 int) has data 21, 22, 23, 24

SUGGESTION
The following steps help to demonstrate how the 'copy_data = force'
change will be useful:
<blank line>
Let's take a scenario where the user wants to set up bidirectional
logical replication between node1 and node2 where the same table on
each node has pre-existing data. e.g.
node1: Table t1 (c1 int) has data 11, 12, 13, 14
node2: Table t1 (c1 int) has data 21, 22, 23, 24

~~~

5. Commit message

step 4:
node2=# CREATE SUBSCRIPTION sub_node2_node1 Connection '<node1 details>'
node2-# PUBLICATION pub_node1;
CREATE SUBSCRIPTION

"Connection" => "CONNECTION"

~~~

6. Commit message

If table t1 has a unique key, it will cause a unique key
violation and replication won't proceed.

SUGGESTION
In case, table t1 has a unique key, it will lead to a unique key
violation and replication won't proceed.

~~~

7. Commit message

step 3: Create a subscription in node1 to subscribe to node2. Use
copy_data specified as on so that the existing table data is copied during
initial sync:

SUGGESTION
step 3: Create a subscription in node1 to subscribe to node2. Use
'copy_data = on' so that the existing table data is copied during
initial sync:

~~~

8. Commit message

step 4: Adjust the publication publish settings so that truncate is not
published to the subscribers and truncate the table data in node2:

SUGGESTION (only added a comma)
step 4: Adjust the publication publish settings so that truncate is
not published to the subscribers, and truncate the table data in
node2:

~~~

9. Commit message

step 5: Create a subscription in node2 to subscribe to node1. Use copy_data
specified as force when creating a subscription to node1 so that the existing
table data is copied during initial sync:

SUGGESTION
step 5: Create a subscription in node2 to subscribe to node1. Use
'copy_data = force' when creating a subscription to node1 so that the
existing table data is copied during initial sync:

======

10. doc/src/sgml/ref/create_subscription.sgml

@@ -383,6 +398,15 @@ CREATE SUBSCRIPTION <replaceable
class="parameter">subscription_name</replaceabl
    can have non-existent publications.
   </para>

+  <para>
+   If subscription is created with <literal>origin = local</literal> and
+   <literal>copy_data = on</literal>, it will check if the publisher tables are
+   being subscribed to any other publisher and throw an error to prevent
+   inconsistent data in the subscription. The user can continue with the copy
+   operation without throwing any error in this case by specifying
+   <literal>copy_data = force</literal>.
+  </para>

SUGGESTION (minor rewording)
If the subscription is created with <literal>origin = local</literal>
and <literal>copy_data = on</literal>, it will check if the publisher
tables are being subscribed to any other publisher and, if so, then
throw an error to prevent possible non-local data from being copied.
The user can override this check and continue with the copy operation
by specifying <literal>copy_data = force</literal>.

======

11. src/backend/commands/subscriptioncmds.c - parse_subscription_options

From [1]:
>> What about also allowing copy_data = 2, and making it equivalent to "force"?
> Vignesh: I felt the existing looks ok, no need to support 2. It might confuse the user.

I don't think it would be confusing, but I also don't feel strongly
enough to debate it. Anyway, I could not find a similar precedent, so
your decision is fine.

~~~

12. src/backend/commands/subscriptioncmds.c - parse_subscription_options

@@ -339,17 +406,16 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
  errmsg("%s and %s are mutually exclusive options",
  "connect = false", "create_slot = true")));

- if (opts->copy_data &&
- IsSet(opts->specified_opts, SUBOPT_COPY_DATA))
+ if (opts->copy_data && IsSet(opts->specified_opts, SUBOPT_COPY_DATA))
  ereport(ERROR,

This is just a formatting change. Is it needed for this patch? patch.

~~~

13. src/backend/commands/subscriptioncmds.c - AlterSubscription_refresh

@@ -730,7 +798,8 @@ CreateSubscription(ParseState *pstate,
CreateSubscriptionStmt *stmt,
  * PENDING, to allow ALTER SUBSCRIPTION ... REFRESH
  * PUBLICATION to work.
  */
- if (opts.twophase && !opts.copy_data && tables != NIL)
+ if (opts.twophase && opts.copy_data == COPY_DATA_OFF &&
+ tables != NIL)
  twophase_enabled = true;

Why is this change needed? I thought the original code is OK now since
COPY_DATA_OFF = 0

~~~

14. src/backend/commands/subscriptioncmds.c - AlterSubscription

@@ -1265,7 +1337,8 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
  *
  * For more details see comments atop worker.c.
  */
- if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED && opts.copy_data)
+ if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
+ opts.copy_data)

This is just a formatting change. Is it needed for this patch?

~~~

15. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed

+
+ if (!origin || (strcmp(origin, "local") != 0) || copydata != COPY_DATA_ON)
+ return;
+

This condition could be rearranged to put the strcmp last so it is not
called unless absolutely necessary.

~~~

16. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed

+ appendStringInfoString(&cmd,
+    "SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename,
PS.srrelid as replicated\n"
+    "FROM pg_publication P,\n"

The line is too long; needs wrapping.

~~~

17. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed

+ if (!slot_attisnull(slot, 3))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("table:%s.%s might have replicated data in the publisher",
+    nspname, relname),
+ errdetail("CREATE/ALTER SUBSCRIPTION with origin as local and
copy_data as true is not allowed when the publisher might have
replicated data."),
+ errhint("Use CREATE/ALTER SUBSCRIPTION with copy_data = off or force."));

I felt the errmsg may be easier to read using "=" instead of "as".
Anyway, it would be more consistent with the errhint. Also, change the
"true" to "on" to be consistent with the errhint.

SUGGESTION
errdetail("CREATE/ALTER SUBSCRIPTION with origin = local and copy_data
= on is not allowed when the publisher might have replicated data."),


------
[1] https://www.postgresql.org/message-id/CALDaNm3iLLxP4OV%2ByQHs-c71P6zQ9W8D30DGsve1SQs_1pFsSQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: Collation version tracking for macOS
Следующее
От: David Rowley
Дата:
Сообщение: Re: Allow foreign keys to reference a superset of unique columns