Re: Pgoutput not capturing the generated columns

Поиск
Список
Период
Сортировка
От Shubham Khanna
Тема Re: Pgoutput not capturing the generated columns
Дата
Msg-id CAHv8Rj+JSyiXYrv6f8O+aqc9Ew6wrK9yNzjnwmqy11Zgubs7Ow@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Pgoutput not capturing the generated columns  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
On Tue, Jun 4, 2024 at 8:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi,
>
> Here are some review comments for patch v5-0001.
>
> ======
> GENERAL G.1
>
> The patch changes HEAD behaviour for PUBLICATION col-lists right? e.g.
> maybe before they were always ignored, but now they are not?
>
> OTOH, when 'include_generated_columns' is false then the PUBLICATION
> col-list will ignore any generated cols even when they are present in
> a PUBLICATION col-list, right?
>
> These kinds of points should be noted in the commit message and in the
> (col-list?) documentation.

Fixed.

> ======
> Commit message
>
> General 1a.
> IMO the commit message needs some background to say something like:
> "Currently generated column values are not replicated because it is
> assumed that the corresponding subscriber-side table will generate its
> own values for those columns."
>
> ~
>
> General 1b.
> Somewhere in this commit message, you need to give all the other
> special rules --- e.g. the docs says "If the subscriber-side column is
> also a generated column then this option has no effect"
>
> ~~~

Fixed.

> 2.
> This commit enables support for the 'include_generated_columns' option
> in logical replication, allowing the transmission of generated column
> information and data alongside regular table changes. This option is
> particularly useful for scenarios where applications require access to
> generated column values for downstream processing or synchronization.
>
> ~
>
> I don't think the sentence "This option is particularly useful..." is
> helpful. It seems like just saying "This commit supports option XXX.
> This is particularly useful if you want XXX".
>

Fixed.

>
> 3.
> CREATE SUBSCRIPTION test1 connection 'dbname=postgres host=localhost port=9999
> 'publication pub1;
>
> ~
>
> What is this CREATE SUBSCRIPTION for? Shouldn't it have an example of
> the new parameter being used in it?
>

Added the description for this in the Patch.

>
> 4.
> Currently copy_data option with include_generated_columns option is
> not supported. A future patch will remove this limitation.
>
> ~
>
> Suggest to single-quote those parameter names for better readability.
>

Fixed.

>
> 5.
> This commit aims to enhance the flexibility and utility of logical
> replication by allowing users to include generated column information
> in replication streams, paving the way for more robust data
> synchronization and processing workflows.
>
> ~
>
> IMO this paragraph can be omitted.

Fixed.

> ======
> .../test_decoding/sql/decoding_into_rel.sql
>
> 6.
> +-- check include-generated-columns option with generated column
> +CREATE TABLE gencoltable (a int PRIMARY KEY, b int GENERATED ALWAYS
> AS (a * 2) STORED);
> +INSERT INTO gencoltable (a) VALUES (1), (2), (3);
> +SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
> 'include-generated-columns', '1');
> +INSERT INTO gencoltable (a) VALUES (4), (5), (6);
> +SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1',
> 'include-generated-columns', '0');
> +DROP TABLE gencoltable;
> +
>
> 6a.
> I felt some additional explicit comments might help the readabilty of
> the output file.
>
> e.g.1
> -- When 'include-generated=columns' = '1' the generated column 'b'
> values will be replicated
> SELECT data FROM pg_logical_slot_get_changes...
>
> e.g.2
> -- When 'include-generated=columns' = '0' the generated column 'b'
> values will not be replicated
> SELECT data FROM pg_logical_slot_get_changes...

Added the required description for this.

> 6b.
> Suggest adding one more test case (where 'include-generated=columns'
> is not set) to confirm/demonstrate the default behaviour for
> replicated generated cols.

Added the required Test case.

> ======
> doc/src/sgml/protocol.sgml
>
> 7.
> +    <varlistentry>
> +     <term><replaceable
> class="parameter">include-generated-columns</replaceable></term>
> +      <listitem>
> +       <para>
> +        Boolean option to enable generated columns.
> +        The include-generated-columns option controls whether generated
> +        columns should be included in the string representation of tuples
> +        during logical decoding in PostgreSQL. This allows users to
> +        customize the output format based on whether they want to include
> +        these columns or not. The default is false.
> +       </para>
> +      </listitem>
> +    </varlistentry>
>
> 7a.
> It doesn't render properly. e.g. Should not be bold italic (probably
> the class is wrong?), because none of the nearby parameters look this
> way.
>
> ~
>
> 7b.
> The name here should NOT have hyphens. It needs underscores same as
> all other nearby protocol parameters.
>
> ~
>
> 7c.
> The description seems overly verbose.
>
> SUGGESTION
> Boolean option to enable generated columns. This option controls
> whether generated columns should be included in the string
> representation of tuples during logical decoding in PostgreSQL. The
> default is false.

Fixed.

> ======
> doc/src/sgml/ref/create_subscription.sgml
>
> 8.
> +
> +       <varlistentry
> id="sql-createsubscription-params-with-include-generated-column">
> +        <term><literal>include_generated_column</literal>
> (<type>boolean</type>)</term>
> +        <listitem>
> +         <para>
> +          Specifies whether the generated columns present in the tables
> +          associated with the subscription should be replicated. The default is
> +          <literal>false</literal>.
> +         </para>
>
> The parameter name should be plural (include_generated_columns).

Fixed.

> ======
> src/backend/commands/subscriptioncmds.c
>
> 9.
>  #define SUBOPT_ORIGIN 0x00008000
> +#define SUBOPT_INCLUDE_GENERATED_COLUMN 0x00010000
>
> Should be plural COLUMNS
>
Fixed.

> 10.
> + else if (IsSet(supported_opts, SUBOPT_INCLUDE_GENERATED_COLUMN) &&
> + strcmp(defel->defname, "include_generated_column") == 0)
>
> The new subscription parameter should be plural ("include_generated_columns").

Fixed.

> 11.
> +
> + /*
> + * Do additional checking for disallowed combination when copy_data and
> + * include_generated_column are true. COPY of generated columns is
> not supported
> + * yet.
> + */
> + if (opts->copy_data && opts->include_generated_column)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + /*- translator: both %s are strings of the form "option = value" */
> + errmsg("%s and %s are mutually exclusive options",
> + "copy_data = true", "include_generated_column = true")));
> + }
>
> /combination/combinations/
>
> The parameter name should be plural in the comment and also in the
> error message.

Fixed.

> ======
> src/bin/psql/tab-complete.c
>
> 12.
>   COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
>     "disable_on_error", "enabled", "failover", "origin",
>     "password_required", "run_as_owner", "slot_name",
> -   "streaming", "synchronous_commit", "two_phase");
> +   "streaming", "synchronous_commit", "two_phase","include_generated_columns");
>
> The new param should be added in alphabetical order same as all the others.

Fixed.

> ======
> src/include/catalog/pg_subscription.h
>
> 13.
> + bool subincludegeneratedcolumn; /* True if generated columns must be
> published */
> +
>
> The field name should be plural.

Fixed.

>
> 14.
> + bool includegeneratedcolumn; /* publish generated column data */
>  } Subscription;
>
> The field name should be plural.

Fixed.

> ======
> src/include/replication/walreceiver.h
>
> 15.
>   * prepare time */
>   char    *origin; /* Only publish data originating from the
>   * specified origin */
> + bool include_generated_column; /* publish generated columns */
>   } logical;
>   } proto;
>  } WalRcvStreamOptions;
>
> ~
>
> This new field name should be plural.

Fixed.

> ======
> src/test/subscription/t/011_generated.pl
>
> 16.
> +my ($cmdret, $stdout, $stderr) = $node_subscriber->psql('postgres', qq(
> + CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION
> pub2 WITH (include_generated_column = true)
> +));
> +ok( $stderr =~
> +   qr/copy_data = true and include_generated_column = true are
> mutually exclusive options/,
> + 'cannot use both include_generated_column and copy_data as true');
>
> Isn't this mutual exclusiveness of options something that could have
> been tested in the regress test suite instead of TAP tests? e.g. AFAIK
> you won't require a connection to test this case.


> 17. Missing test?
>
> IIUC there is a missing test scenario. You can add another subscriber
> table TAB3 which *already* has generated cols (e.g. generating
> different values to the publisher) so then you can verify they are NOT
> overwritten, even when the 'include_generated_cols' is true.
>
> ======

Moved this test case to the Regression test.

Patch v6-0001 contains all the changes required. See [1] for the changes added.

[1] https://www.postgresql.org/message-id/CAHv8RjJn6EiyAitJbbvkvVV2d45fV3Wjr2VmWFugm3RsbaU%2BRg%40mail.gmail.com

Thanks and Regards,
Shubham Khanna.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Conflict Detection and Resolution
Следующее
От: "Winter Loo"
Дата:
Сообщение: Re:Re: may be a buffer overflow problem