Re: Handle infinite recursion in logical replication setup

Поиск
Список
Период
Сортировка
От vignesh C
Тема Re: Handle infinite recursion in logical replication setup
Дата
Msg-id CALDaNm1-ZrG=haAoiB2yFKYc+ckcd1NLaU8QB3SWs32wPsph4w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Handle infinite recursion in logical replication setup  (Peter Smith <smithpb2250@gmail.com>)
Ответы Re: Handle infinite recursion in logical replication setup  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
On Wed, Jun 22, 2022 at 12:16 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are my review comments for the v22* patch set.
>
> ========
> v22-0001
> ========
>
> No comments. LGTM
>
> ========
> V22-0002
> ========
>
> 2.1 doc/src/sgml/catalogs.sgml
>
> +       Possible origin values are <literal>local</literal> or
> +       <literal>any</literal>. The default is <literal>any</literal>.
>
> IMO the word "Possible" here is giving a sense of vagueness.
>
> SUGGESTION
> The origin value must be either <literal>local</literal> or
> <literal>any</literal>.

Modified

> ~~~
>
> 2.2 src/backend/commands/subscriptioncmds.c
>
> @@ -265,6 +271,29 @@ parse_subscription_options(ParseState *pstate,
> List *stmt_options,
>   opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
>   opts->disableonerr = defGetBoolean(defel);
>   }
> + else if (IsSet(supported_opts, SUBOPT_ORIGIN) &&
> + strcmp(defel->defname, "origin") == 0)
> + {
> + if (IsSet(opts->specified_opts, SUBOPT_ORIGIN))
> + errorConflictingDefElem(defel, pstate);
> +
> + opts->specified_opts |= SUBOPT_ORIGIN;
> +
> + /*
> + * Even though "origin" parameter allows only "local" and "any"
> + * values, the "origin" parameter type is implemented as string
> + * type instead of boolean to extend the "origin" parameter to
> + * support filtering of origin name specified by the user in the
> + * later versions.
> + */
> + opts->origin = defGetString(defel);
> +
> + if ((strcmp(opts->origin, LOGICALREP_ORIGIN_LOCAL) != 0) &&
> + (strcmp(opts->origin, LOGICALREP_ORIGIN_ANY) != 0))
> + ereport(ERROR,
> + errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("unrecognized origin value: \"%s\"", opts->origin));
> + }
>
> I was wondering if it might be wise now to do a pfree(opts->origin)
> here before setting the new option value which overwrites the
> strdup-ed default "any". OTOH maybe it is overkill to worry about the
> tiny leak? I am not sure what is the convention for this.

I have freed the memory to avoid the leak,  I felt it is better to
clean the memory as it is a positive flow.

> ~~~
>
> 2.3 src/backend/replication/pgoutput/pgoutput.c
>
> @@ -1698,12 +1719,16 @@ pgoutput_message(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>  }
>
>  /*
> - * Currently we always forward.
> + * Return true if the data source (origin) is remote and user has requested
> + * only local data, false otherwise.
>   */
>  static bool
>  pgoutput_origin_filter(LogicalDecodingContext *ctx,
>      RepOriginId origin_id)
>
> "user" -> "the user"

Modified

> ~~~
>
> 2.4 src/include/catalog/pg_subscription.h
>
> +/*
> + * The subscription will request the publisher to send any changes regardless
> + * of their origin
> + */
> +#define LOGICALREP_ORIGIN_ANY "any"
>
> SUGGESTION (remove 'any'; add period)
> The subscription will request the publisher to send changes regardless
> of their origin.

Modified

> ========
> v22-0003
> ========
>
> 3.1 Commit message
>
> change 1) Checks and throws an error if 'copy_data = on' and 'origin =
> local' but the publication tables were also subscribing from other publishers.
>
> "also subscribing from other publishers." -> "also replicating from
> other publishers."

Modified

> ~~~
>
> 3.2 doc/src/sgml/ref/alter_subscription.sgml
>
> +         <para>
> +          There is some interaction between the <literal>origin</literal>
> +          parameter and <literal>copy_data</literal> parameter.
>
> "and copy_data parameter." -> "and the copy_data parameter."

Modified

> ~~~
>
> 3.3 doc/src/sgml/ref/create_subscription.sgml
>
> +         <para>
> +          There is some interaction between the <literal>origin</literal>
> +          parameter and <literal>copy_data</literal> parameter.
>
> "and copy_data parameter." -> "and the copy_data parameter."

Modified

> ~~~
>
> 3.4 doc/src/sgml/ref/create_subscription.sgml
>
> +         <para>
> +          There is some interaction between the <literal>origin</literal>
> +          parameter and <literal>copy_data</literal> parameter. Refer to the
> +          <xref linkend="sql-createsubscription-notes" /> for details.
> +         </para>
>
> "and copy_data parameter." -> "and the copy_data parameter."

Modified

> ~~~
>
> 3.5 doc/src/sgml/ref/create_subscription.sgml
>
> @@ -383,6 +398,15 @@ CREATE SUBSCRIPTION <replaceable
> class="parameter">subscription_name</replaceabl
>     can have non-existent publications.
>    </para>
>
> +  <para>
> +   If the subscription is created with <literal>origin = local</literal> and
> +   <literal>copy_data = true</literal>, it will check if the
> publisher tables are
> +   being subscribed to any other publisher and, if so, then throw an error to
> +   prevent possible non-local data from being copied. The user can override
> +   this check and continue with the copy operation by specifying
> +   <literal>copy_data = force</literal>.
> +  </para>
> +
>
> For the part "it will check if the publisher tables are being
> subscribed to any other publisher...", the "being subscribed to"
> sounds a bit strange. Maybe it is better to reword this like you
> already worded some of the code comments.
>
> SUGGESTION
> -> "... it will check if the publisher has subscribed to the same
> table from other publishers and ..."

Modified

> ~~~
>
> 3.6 src/backend/commands/subscriptioncmds.c
>
> + /*
> + * Throw an error if the publisher has subscribed to the same table
> + * from some other publisher. We cannot differentiate between the
> + * local and non-local data that is present in the HEAP during the
> + * initial sync. Identification of local data can be done only from
> + * the WAL by using the origin id. XXX: For simplicity, we don't check
> + * whether the table has any data or not. If the table doesn't have
> + * any data then we don't need to distinguish between local and
> + * non-local data so we can avoid throwing error in that case.
> + */
>
> I think the XXX part should be after a blank like so it is more
> prominent, instead of being buried in the other text.
>
> e.g.
>
> + /*
> + * Throw an error if the publisher has subscribed to the same table
> + * from some other publisher. We cannot differentiate between the
> + * local and non-local data that is present in the HEAP during the
> + * initial sync. Identification of local data can be done only from
> + * the WAL by using the origin id.
> + *
> + * XXX: For simplicity, we don't check
> + * whether the table has any data or not. If the table doesn't have
> + * any data then we don't need to distinguish between local and
> + * non-local data so we can avoid throwing error in that case.
> + */

Modified

> ~~~
>
> 3.7 src/test/regress/sql/subscription.sql
>
> Elsewhere, there was code in this patch that apparently accepts
> 'copy_data' parameter but with no parameter value specified. But this
> combination has no test case.

I have added a test for the same

> ========
> v22-0004
> ========
>
> 4.1 doc/src/sgml/logical-replication.sgml
>
> +  <sect2 id="add-node-data-present-on-new-node">
> +   <title>Adding a new node when data is present on the new node</title>
>
> "when data is present" -> "when table data is present"

Modified

> ~~~
>
> 4.2 doc/src/sgml/logical-replication.sgml
>
> +   <note>
> +    <para>
> +     Adding a new node when data is present on the new node tables is not
> +     supported.
> +    </para>
> +   </note>
>
> I think the note text should match the title text (see #4.1)
>
> SUGGESTION
> Adding a new node when table data is present on the new node is not supported.

Modified

> ~~~
>
> 4.3
>
> +   <para>
> +    Step-2: Lock the required tables of the new node in
> +    <literal>EXCLUSIVE</literal> mode untilthe setup is complete. (This lock is
> +    necessary to prevent any modifications from happening on the new node
> +    because if data modifications occurred after Step-3, there is a chance that
> +    the modifications will be published to the first node and then synchronized
> +    back to the new node while creating the subscription in Step-5. This would
> +    result in inconsistent data).
> +   </para>
> +   <para>
>
> 4.3.a
> typo: "untilthe" -> "until the"

Modified

> 4.3.b
> SUGGESTION (just for the 2nd sentence)
> This lock is necessary to prevent any modifications from happening on
> the new node. If data modifications occurred after Step-3, there is a
> chance they could be published to the first node and then synchronized
> back to the new node while creating the subscription in Step-5.

Modified

Thanks for the comments, the attached v23 patch has the changes for the same.

Regards,
Vignesh

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Support logical replication of DDLs
Следующее
От: vignesh C
Дата:
Сообщение: Re: Handle infinite recursion in logical replication setup