Re: Handle infinite recursion in logical replication setup

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: Handle infinite recursion in logical replication setup
Дата
Msg-id CAHut+PvJHJ+D065iPgW9mqeJjjoToc=OYqBujfRhR6NC_KNsAA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Ответы Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
Here are my review comments for the v22* patch set.

========
v22-0001
========

No comments. LGTM

========
V22-0002
========

2.1 doc/src/sgml/catalogs.sgml

+       Possible origin values are <literal>local</literal> or
+       <literal>any</literal>. The default is <literal>any</literal>.

IMO the word "Possible" here is giving a sense of vagueness.

SUGGESTION
The origin value must be either <literal>local</literal> or
<literal>any</literal>.

~~~

2.2 src/backend/commands/subscriptioncmds.c

@@ -265,6 +271,29 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
  opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
  opts->disableonerr = defGetBoolean(defel);
  }
+ else if (IsSet(supported_opts, SUBOPT_ORIGIN) &&
+ strcmp(defel->defname, "origin") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_ORIGIN))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_ORIGIN;
+
+ /*
+ * Even though "origin" parameter allows only "local" and "any"
+ * values, the "origin" parameter type is implemented as string
+ * type instead of boolean to extend the "origin" parameter to
+ * support filtering of origin name specified by the user in the
+ * later versions.
+ */
+ opts->origin = defGetString(defel);
+
+ if ((strcmp(opts->origin, LOGICALREP_ORIGIN_LOCAL) != 0) &&
+ (strcmp(opts->origin, LOGICALREP_ORIGIN_ANY) != 0))
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized origin value: \"%s\"", opts->origin));
+ }

I was wondering if it might be wise now to do a pfree(opts->origin)
here before setting the new option value which overwrites the
strdup-ed default "any". OTOH maybe it is overkill to worry about the
tiny leak? I am not sure what is the convention for this.

~~~

2.3 src/backend/replication/pgoutput/pgoutput.c

@@ -1698,12 +1719,16 @@ pgoutput_message(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
 }

 /*
- * Currently we always forward.
+ * Return true if the data source (origin) is remote and user has requested
+ * only local data, false otherwise.
  */
 static bool
 pgoutput_origin_filter(LogicalDecodingContext *ctx,
     RepOriginId origin_id)

"user" -> "the user"

~~~

2.4 src/include/catalog/pg_subscription.h

+/*
+ * The subscription will request the publisher to send any changes regardless
+ * of their origin
+ */
+#define LOGICALREP_ORIGIN_ANY "any"

SUGGESTION (remove 'any'; add period)
The subscription will request the publisher to send changes regardless
of their origin.

========
v22-0003
========

3.1 Commit message

change 1) Checks and throws an error if 'copy_data = on' and 'origin =
local' but the publication tables were also subscribing from other publishers.

"also subscribing from other publishers." -> "also replicating from
other publishers."

~~~

3.2 doc/src/sgml/ref/alter_subscription.sgml

+         <para>
+          There is some interaction between the <literal>origin</literal>
+          parameter and <literal>copy_data</literal> parameter.

"and copy_data parameter." -> "and the copy_data parameter."

~~~

3.3 doc/src/sgml/ref/create_subscription.sgml

+         <para>
+          There is some interaction between the <literal>origin</literal>
+          parameter and <literal>copy_data</literal> parameter.

"and copy_data parameter." -> "and the copy_data parameter."

~~~

3.4 doc/src/sgml/ref/create_subscription.sgml

+         <para>
+          There is some interaction between the <literal>origin</literal>
+          parameter and <literal>copy_data</literal> parameter. Refer to the
+          <xref linkend="sql-createsubscription-notes" /> for details.
+         </para>

"and copy_data parameter." -> "and the copy_data parameter."

~~~

3.5 doc/src/sgml/ref/create_subscription.sgml

@@ -383,6 +398,15 @@ CREATE SUBSCRIPTION <replaceable
class="parameter">subscription_name</replaceabl
    can have non-existent publications.
   </para>

+  <para>
+   If the subscription is created with <literal>origin = local</literal> and
+   <literal>copy_data = true</literal>, it will check if the
publisher tables are
+   being subscribed to any other publisher and, if so, then throw an error to
+   prevent possible non-local data from being copied. The user can override
+   this check and continue with the copy operation by specifying
+   <literal>copy_data = force</literal>.
+  </para>
+

For the part "it will check if the publisher tables are being
subscribed to any other publisher...", the "being subscribed to"
sounds a bit strange. Maybe it is better to reword this like you
already worded some of the code comments.

SUGGESTION
-> "... it will check if the publisher has subscribed to the same
table from other publishers and ..."

~~~

3.6 src/backend/commands/subscriptioncmds.c

+ /*
+ * Throw an error if the publisher has subscribed to the same table
+ * from some other publisher. We cannot differentiate between the
+ * local and non-local data that is present in the HEAP during the
+ * initial sync. Identification of local data can be done only from
+ * the WAL by using the origin id. XXX: For simplicity, we don't check
+ * whether the table has any data or not. If the table doesn't have
+ * any data then we don't need to distinguish between local and
+ * non-local data so we can avoid throwing error in that case.
+ */

I think the XXX part should be after a blank like so it is more
prominent, instead of being buried in the other text.

e.g.

+ /*
+ * Throw an error if the publisher has subscribed to the same table
+ * from some other publisher. We cannot differentiate between the
+ * local and non-local data that is present in the HEAP during the
+ * initial sync. Identification of local data can be done only from
+ * the WAL by using the origin id.
+ *
+ * XXX: For simplicity, we don't check
+ * whether the table has any data or not. If the table doesn't have
+ * any data then we don't need to distinguish between local and
+ * non-local data so we can avoid throwing error in that case.
+ */

~~~

3.7 src/test/regress/sql/subscription.sql

Elsewhere, there was code in this patch that apparently accepts
'copy_data' parameter but with no parameter value specified. But this
combination has no test case.

========
v22-0004
========

4.1 doc/src/sgml/logical-replication.sgml

+  <sect2 id="add-node-data-present-on-new-node">
+   <title>Adding a new node when data is present on the new node</title>

"when data is present" -> "when table data is present"

~~~

4.2 doc/src/sgml/logical-replication.sgml

+   <note>
+    <para>
+     Adding a new node when data is present on the new node tables is not
+     supported.
+    </para>
+   </note>

I think the note text should match the title text (see #4.1)

SUGGESTION
Adding a new node when table data is present on the new node is not supported.

~~~

4.3

+   <para>
+    Step-2: Lock the required tables of the new node in
+    <literal>EXCLUSIVE</literal> mode untilthe setup is complete. (This lock is
+    necessary to prevent any modifications from happening on the new node
+    because if data modifications occurred after Step-3, there is a chance that
+    the modifications will be published to the first node and then synchronized
+    back to the new node while creating the subscription in Step-5. This would
+    result in inconsistent data).
+   </para>
+   <para>

4.3.a
typo: "untilthe" -> "until the"

4.3.b
SUGGESTION (just for the 2nd sentence)
This lock is necessary to prevent any modifications from happening on
the new node. If data modifications occurred after Step-3, there is a
chance they could be published to the first node and then synchronized
back to the new node while creating the subscription in Step-5.


------
Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: Missing reference to pgstat_replslot.c in pgstat.c
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: pg_page_repair: a tool/extension to repair corrupted pages in postgres with streaming/physical replication