Обсуждение: Data is copied twice when specifying both child and parent table in publication

Поиск
Список
Период
Сортировка

Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
Hi,

In another logical replication related thread[1], my colleague Greg found that
if publish_via_partition_root is true, then the child table's data will be
copied twice when adding both child and parent table to the publication.

Example:

-----
Pub:
create table tbl1 (a int) partition by range (a);
create table tbl1_part1 partition of tbl1 for values from (1) to (10);
create table tbl1_part2 partition of tbl1 for values from (10) to (20);
create publication pub for table tbl1, tbl1_part1 with (publish_via_partition_root=on);

insert into tbl1_part1 values(1);

Sub:
create table tbl1 (a int) partition by range (a);
create table tbl1_part1 partition of tbl1 for values from (1) to (10);
create table tbl1_part2 partition of tbl1 for values from (10) to (20);
create subscription sub CONNECTION 'dbname=postgres port=10000' publication pub;

-- data is copied twice
select * from tbl1_part1;
 a
---
 1
 1
-----

The reason is that the subscriber will fetch the table list from publisher
using the following sql[2] and the subscriber will execute table
synchronization for each table in the query results in this case. But
tbl1_part1 is a partition of tbl1, so the data of tbl1_part1 was copied twice.

[2]
select * from pg_publication_tables;
 pubname | schemaname | tablename
---------+------------+------------
 pub     | public     | tbl1
 pub     | public     | tbl1_part1

IMO, it looks like a bug and it's more natural to only execute the table
synchronization for the parent table in the above case. Because as the document
said: if publish_via_partition_root is true, "changes in a partitioned table
(or on its partitions) contained in the publication will be published using the
identity and schema of the partitioned table rather than that of the individual
partitions that are actually changed;"

To fix it, I think we should fix function GetPublicationRelations which
generate data for the view pg_publication_tables and make it only show the
parent table if publish_via_partition_root is true. And for other future
feature like schema level publication, we can also follow this to exclude
partitions if their parent is specified by FOR TABLE in the same publication.

Attach a patch to fix it.
Thoughts ?

[1] https://www.postgresql.org/message-id/CAJcOf-eBhDUT2J5zs8Z0qEMiZUdhinX%2BbuGX3GN4V83fPnZV3Q%40mail.gmail.com

Best regards,
Hou zhijie


Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
> Attach a patch to fix it.
Attach a new version patch which refactor the fix code in a cleaner way.

Best regards,
Hou zj

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
> > Attach a patch to fix it.
> Attach a new version patch which refactor the fix code in a cleaner way.
>

I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.

The steps are below:


CREATE SCHEMA sch;
CREATE SCHEMA sch1;
CREATE TABLE sch.sale (sale_date date not null, country_code text,
product_sku text, units integer) PARTITION BY RANGE (sale_date);
CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
('2019-01-01') TO ('2019-02-01');
CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
('2019-02-01') TO ('2019-03-01');

(1) PUB:  CREATE PUBLICATION pub FOR TABLE sch1.sale_201901,
sch1.sale_201902 WITH (publish_via_partition_root=true);
(2) SUB:  CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres
host=localhost port=5432' PUBLICATION pub;
(3) PUB:  INSERT INTO sch.sale VALUES('2019-01-01', 'AU', 'cpu', 5),
('2019-01-02', 'AU', 'disk', 8);
(4) SUB:  SELECT * FROM sch.sale;
(5) PUB:  ALTER PUBLICATION pub ADD TABLE sch.sale;
(6) SUB:  ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
(7) SUB:  SELECT * FROM sch.sale;

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Oct 18, 2021 at 8:27 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
> > > Attach a patch to fix it.
> > Attach a new version patch which refactor the fix code in a cleaner way.
> >
>
> I have not debugged it yet to find out why, but with the patch
> applied, the original double-publish problem that I reported
> (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> occurs.
>

Yeah, I think this is a variant of the problem being fixed by
Hou-San's patch. I think one possible idea to investigate is that on
the subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa. We might want to consider the case
where a user has toggled the "publish_via_partition_root" parameter.

It seems both these behaviours/problems exist since commit 17b9e7f9
(Support adding partitioned tables to publication). Adding Amit L and
Peter E (people involved in this work) to know their opinion?

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Saturday, October 16, 2021 2:31 PM houzj.fnst@fujitsu.com wrote:
> On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
> > Attach a patch to fix it.
> Attach a new version patch which refactor the fix code in a cleaner way.

Although the discussion about the partition behavior[1] is going on,
attach a refactored fix patch which make the pg_publication_tables view be
consistent for FOR TABLE and FOR ALL TABLES here in case someone want
to have a look.

[1] https://www.postgresql.org/message-id/CAA4eK1JC5sy5M_UVoGdgubHN2--peYqApOJkT%3DFLCq%2BVUxqerQ%40mail.gmail.com

Best regards,
Hou zj

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Langote
Дата:
On Mon, Oct 18, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Oct 18, 2021 at 8:27 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
> > > > Attach a patch to fix it.
> > > Attach a new version patch which refactor the fix code in a cleaner way.
> > >
> >
> > I have not debugged it yet to find out why, but with the patch
> > applied, the original double-publish problem that I reported
> > (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> > occurs.
> >
>
> Yeah, I think this is a variant of the problem being fixed by
> Hou-San's patch. I think one possible idea to investigate is that on
> the subscriber-side, after fetching tables, we check the already
> subscribed tables and if the child tables already exist then we ignore
> the parent table and vice versa. We might want to consider the case
> where a user has toggled the "publish_via_partition_root" parameter.
>
> It seems both these behaviours/problems exist since commit 17b9e7f9
> (Support adding partitioned tables to publication). Adding Amit L and
> Peter E (people involved in this work) to know their opinion?

I can imagine that the behavior seen here may look surprising, but not
sure if I would call it a bug as such.  I do remember thinking about
this case and the current behavior is how I may have coded it to be.

Looking at this command in Hou-san's email:

  create publication pub for table tbl1, tbl1_part1 with
(publish_via_partition_root=on);

It's adding both the root partitioned table and the leaf partition
*explicitly*, and it's not clear to me if the latter's inclusion in
the publication should be assumed because the former is found to have
been added to the publication, that is, as far as the latter's
visibility to the subscriber is concerned.  It's not a stretch to
imagine that a user may write the command this way to account for a
subscriber node on which tbl1 and tbl1_part1 are unrelated tables.

I don't think we assume anything on the publisher side regarding the
state/configuration of tables on the subscriber side, at least with
publication commands where tables are added to a publication
explicitly, so it is up to the user to make sure that the tables are
not added duplicatively.  One may however argue that the way we've
decided to handle FOR ALL TABLES does assume something about
partitions where it skips advertising them to subscribers when
publish_via_partition_root flag is set to true, but that is exactly to
avoid the duplication of data that goes to a subscriber.

-- 
Amit Langote
EDB: http://www.enterprisedb.com



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Oct 18, 2021 at 2:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Mon, Oct 18, 2021 at 8:27 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > >
> > > On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
> > > > > Attach a patch to fix it.
> > > > Attach a new version patch which refactor the fix code in a cleaner way.
> > > >
> > >
> > > I have not debugged it yet to find out why, but with the patch
> > > applied, the original double-publish problem that I reported
> > > (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> > > occurs.
> > >
> >
> > Yeah, I think this is a variant of the problem being fixed by
> > Hou-San's patch. I think one possible idea to investigate is that on
> > the subscriber-side, after fetching tables, we check the already
> > subscribed tables and if the child tables already exist then we ignore
> > the parent table and vice versa. We might want to consider the case
> > where a user has toggled the "publish_via_partition_root" parameter.
> >
> > It seems both these behaviours/problems exist since commit 17b9e7f9
> > (Support adding partitioned tables to publication). Adding Amit L and
> > Peter E (people involved in this work) to know their opinion?
>
> I can imagine that the behavior seen here may look surprising, but not
> sure if I would call it a bug as such.  I do remember thinking about
> this case and the current behavior is how I may have coded it to be.
>
> Looking at this command in Hou-san's email:
>
>   create publication pub for table tbl1, tbl1_part1 with
> (publish_via_partition_root=on);
>
> It's adding both the root partitioned table and the leaf partition
> *explicitly*, and it's not clear to me if the latter's inclusion in
> the publication should be assumed because the former is found to have
> been added to the publication, that is, as far as the latter's
> visibility to the subscriber is concerned.  It's not a stretch to
> imagine that a user may write the command this way to account for a
> subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
>
> I don't think we assume anything on the publisher side regarding the
> state/configuration of tables on the subscriber side, at least with
> publication commands where tables are added to a publication
> explicitly, so it is up to the user to make sure that the tables are
> not added duplicatively.  One may however argue that the way we've
> decided to handle FOR ALL TABLES does assume something about
> partitions where it skips advertising them to subscribers when
> publish_via_partition_root flag is set to true, but that is exactly to
> avoid the duplication of data that goes to a subscriber.
>

I think the same confusion will then apply to the new feature (For All
Tables In Schema) being discussed [1] (that is a bit long thread so
shared the email where the latest patch version is posted). There
also, the partitioned table and partition can be in a different
schema. We either need to follow "For All Tables" or "For Table"
behavior. Then, there is also an argument that such behavior is not
documented, and by reading "publish_via_partition_root", it is not
clear why would the user expect the current behavior?

Also, what about Greg's case [2], where I think it is clear that the
subscriber also has partitions?

[1] -
https://www.postgresql.org/message-id/OS0PR01MB5716B523961FE338EB9B3F9A94BC9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/CAJcOf-eQR_%3Dq0f4ZVHd342QdLvBd_995peSr4xCU05hrS3TeTg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Monday, October 18, 2021 5:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
> I can imagine that the behavior seen here may look surprising, but not
> sure if I would call it a bug as such.  I do remember thinking about
> this case and the current behavior is how I may have coded it to be.
> 
> Looking at this command in Hou-san's email:
> 
>   create publication pub for table tbl1, tbl1_part1 with
> (publish_via_partition_root=on);
> 
> It's adding both the root partitioned table and the leaf partition
> *explicitly*, and it's not clear to me if the latter's inclusion in
> the publication should be assumed because the former is found to have
> been added to the publication, that is, as far as the latter's
> visibility to the subscriber is concerned.  It's not a stretch to
> imagine that a user may write the command this way to account for a
> subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
> 
> I don't think we assume anything on the publisher side regarding the
> state/configuration of tables on the subscriber side, at least with
> publication commands where tables are added to a publication
> explicitly, so it is up to the user to make sure that the tables are
> not added duplicatively.  One may however argue that the way we've
> decided to handle FOR ALL TABLES does assume something about
> partitions where it skips advertising them to subscribers when
> publish_via_partition_root flag is set to true, but that is exactly to
> avoid the duplication of data that goes to a subscriber.

Hi,

Thanks for the explanation.

I think one reason that I consider this behavior a bug is that: If we add
both the root partitioned table and the leaf partition explicitly to the
publication (and set publish_via_partition_root = on), the behavior of the
apply worker is inconsistent with the behavior of table sync worker.

In this case, all changes in the leaf the partition will be applied using the
identity and schema of the partitioned(root) table. But for the table sync, it
will execute table sync for both the leaf and the root table which cause
duplication of data.

Wouldn't it be better to make the behavior consistent here ?

Best regards,
Hou zj



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Oct 18, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 2:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > Looking at this command in Hou-san's email:
> >
> >   create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);
> >
> > It's adding both the root partitioned table and the leaf partition
> > *explicitly*, and it's not clear to me if the latter's inclusion in
> > the publication should be assumed because the former is found to have
> > been added to the publication, that is, as far as the latter's
> > visibility to the subscriber is concerned.  It's not a stretch to
> > imagine that a user may write the command this way to account for a
> > subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
> >
> > I don't think we assume anything on the publisher side regarding the
> > state/configuration of tables on the subscriber side, at least with
> > publication commands where tables are added to a publication
> > explicitly, so it is up to the user to make sure that the tables are
> > not added duplicatively.  One may however argue that the way we've
> > decided to handle FOR ALL TABLES does assume something about
> > partitions where it skips advertising them to subscribers when
> > publish_via_partition_root flag is set to true, but that is exactly to
> > avoid the duplication of data that goes to a subscriber.
> >
>
> I think the same confusion will then apply to the new feature (For All
> Tables In Schema) being discussed [1] (that is a bit long thread so
> shared the email where the latest patch version is posted). There
> also, the partitioned table and partition can be in a different
> schema.
>

Sorry, I wanted to say that table partition and partitioned table can
be in the same schema. Now, if the user publishes all tables in a
schema, if we want to follow the "For All Tables" behavior then we
should skip the leaf table and publish only the parent table, OTOH, if
want to follow "For Table" behavior, we need to publish both
partitioned table and partition table. I feel it is better to be
consistent here in all three cases ("For Table", "For All Tables", and
"For All Tables In Schema") as it will be easier to explain and
document it.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Tue, Oct 19, 2021 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Oct 18, 2021 at 2:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > >
> > > Looking at this command in Hou-san's email:
> > >
> > >   create publication pub for table tbl1, tbl1_part1 with
> > > (publish_via_partition_root=on);
> > >
> > > It's adding both the root partitioned table and the leaf partition
> > > *explicitly*, and it's not clear to me if the latter's inclusion in
> > > the publication should be assumed because the former is found to have
> > > been added to the publication, that is, as far as the latter's
> > > visibility to the subscriber is concerned.  It's not a stretch to
> > > imagine that a user may write the command this way to account for a
> > > subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
> > >
> > > I don't think we assume anything on the publisher side regarding the
> > > state/configuration of tables on the subscriber side, at least with
> > > publication commands where tables are added to a publication
> > > explicitly, so it is up to the user to make sure that the tables are
> > > not added duplicatively.  One may however argue that the way we've
> > > decided to handle FOR ALL TABLES does assume something about
> > > partitions where it skips advertising them to subscribers when
> > > publish_via_partition_root flag is set to true, but that is exactly to
> > > avoid the duplication of data that goes to a subscriber.
> > >
> >
> > I think the same confusion will then apply to the new feature (For All
> > Tables In Schema) being discussed [1] (that is a bit long thread so
> > shared the email where the latest patch version is posted). There
> > also, the partitioned table and partition can be in a different
> > schema.
> >
>
> Sorry, I wanted to say that table partition and partitioned table can
> be in the same schema. Now, if the user publishes all tables in a
> schema, if we want to follow the "For All Tables" behavior then we
> should skip the leaf table and publish only the parent table, OTOH, if
> want to follow "For Table" behavior, we need to publish both
> partitioned table and partition table. I feel it is better to be
> consistent here in all three cases ("For Table", "For All Tables", and
> "For All Tables In Schema") as it will be easier to explain and
> document it.
>

Thinking some more about it, I think we also have a problem when the
partitioned and partition tables are in different schemas especially
when the user created a publication having a combination of "For
Table" and "For All Tables In Schema", see below:

create schema sch1;
create schema sch2;

create table sch1.tbl1 (a int) partition by range ( a );
create table sch2.tbl1_part1 partition of sch1.tbl1 for values from
(1) to (101);
create table sch1.tbl1_part2 partition of sch1.tbl1 for values from
(101) to (200);

create publication mypub for table sch2.tbl1_part1, all tables in
schema sch1 WITH (publish_via_partition_root = true);

Now, here if we follow the rules of "For Table", then we should get
both partitioned and partition tables which will be different from the
case when both are in the same schema considering we follow "For All
Tables" behavior in "For All Tables In Schema" case.

The point is that as we extend the current feature, I think the
complications will increase if we don't have a consistent behavior for
all cases and it will also be difficult to explain it to users.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Dilip Kumar
Дата:
On Tue, Oct 19, 2021 at 8:17 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

> Thanks for the explanation.
>
> I think one reason that I consider this behavior a bug is that: If we add
> both the root partitioned table and the leaf partition explicitly to the
> publication (and set publish_via_partition_root = on), the behavior of the
> apply worker is inconsistent with the behavior of table sync worker.
>
> In this case, all changes in the leaf the partition will be applied using the
> identity and schema of the partitioned(root) table. But for the table sync, it
> will execute table sync for both the leaf and the root table which cause
> duplication of data.
>
> Wouldn't it be better to make the behavior consistent here ?

I agree with the point, whether we are doing the initial sync or we
are doing transaction streaming the behavior should be the same.  I
think the right behavior should be that even if user has given both
parent table and the child table in the published table list, it
should sync it only once,  because consider the case where we add a
same table twice e.g (CREATE PUBLICATION mypub FOR TABLE t1,t1;) but
in that case also we consider this table only once and there will be
no duplicate data.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > I have not debugged it yet to find out why, but with the patch
> > applied, the original double-publish problem that I reported
> > (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> > occurs.
> >
>
> Yeah, I think this is a variant of the problem being fixed by
> Hou-San's patch. I think one possible idea to investigate is that on
> the subscriber-side, after fetching tables, we check the already
> subscribed tables and if the child tables already exist then we ignore
> the parent table and vice versa. We might want to consider the case
> where a user has toggled the "publish_via_partition_root" parameter.
>
> It seems both these behaviours/problems exist since commit 17b9e7f9
> (Support adding partitioned tables to publication). Adding Amit L and
> Peter E (people involved in this work) to know their opinion?
>

Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published though the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But then adding the partitioned table to the publication and
refreshing the publication in the subscriber, the data is then
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
in the subscriber may well be a non-partitioned table (or the
partitions arranged differently) so the data does need to be
replicated again.

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Dilip Kumar
Дата:
On Wed, Oct 20, 2021 at 12:44 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > I have not debugged it yet to find out why, but with the patch
> > > applied, the original double-publish problem that I reported
> > > (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> > > occurs.
> > >
> >
> > Yeah, I think this is a variant of the problem being fixed by
> > Hou-San's patch. I think one possible idea to investigate is that on
> > the subscriber-side, after fetching tables, we check the already
> > subscribed tables and if the child tables already exist then we ignore
> > the parent table and vice versa. We might want to consider the case
> > where a user has toggled the "publish_via_partition_root" parameter.
> >
> > It seems both these behaviours/problems exist since commit 17b9e7f9
> > (Support adding partitioned tables to publication). Adding Amit L and
> > Peter E (people involved in this work) to know their opinion?
> >
>
> Actually, at least with the scenario I gave steps for, after looking
> at it again and debugging, I think that the behavior is understandable
> and not a bug.
> The reason is that the INSERTed data is first published though the
> partitions, since initially there is no partitioned table in the
> publication (so publish_via_partition_root=true doesn't have any
> effect). But then adding the partitioned table to the publication and
> refreshing the publication in the subscriber, the data is then
> published "using the identity and schema of the partitioned table" due
> to publish_via_partition_root=true. Note that the corresponding table
> in the subscriber may well be a non-partitioned table (or the
> partitions arranged differently) so the data does need to be
> replicated again.

I don't think this behavior is consistent, I mean for the initial sync
we will replicate the duplicate data, whereas for later streaming we
will only replicate it once.  From the user POW, this behavior doesn't
look correct.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Oct 20, 2021 at 1:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Oct 20, 2021 at 12:44 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > > I have not debugged it yet to find out why, but with the patch
> > > > applied, the original double-publish problem that I reported
> > > > (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> > > > occurs.
> > > >
> > >
> > > Yeah, I think this is a variant of the problem being fixed by
> > > Hou-San's patch. I think one possible idea to investigate is that on
> > > the subscriber-side, after fetching tables, we check the already
> > > subscribed tables and if the child tables already exist then we ignore
> > > the parent table and vice versa. We might want to consider the case
> > > where a user has toggled the "publish_via_partition_root" parameter.
> > >
> > > It seems both these behaviours/problems exist since commit 17b9e7f9
> > > (Support adding partitioned tables to publication). Adding Amit L and
> > > Peter E (people involved in this work) to know their opinion?
> > >
> >
> > Actually, at least with the scenario I gave steps for, after looking
> > at it again and debugging, I think that the behavior is understandable
> > and not a bug.
> > The reason is that the INSERTed data is first published though the
> > partitions, since initially there is no partitioned table in the
> > publication (so publish_via_partition_root=true doesn't have any
> > effect). But then adding the partitioned table to the publication and
> > refreshing the publication in the subscriber, the data is then
> > published "using the identity and schema of the partitioned table" due
> > to publish_via_partition_root=true. Note that the corresponding table
> > in the subscriber may well be a non-partitioned table (or the
> > partitions arranged differently) so the data does need to be
> > replicated again.
>

Even if the partitions are arranged differently why would the user
expect the same data to be replicated twice?

> I don't think this behavior is consistent, I mean for the initial sync
> we will replicate the duplicate data, whereas for later streaming we
> will only replicate it once.  From the user POW, this behavior doesn't
> look correct.
>

+1.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Wed, Oct 20, 2021 at 7:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > Actually, at least with the scenario I gave steps for, after looking
> > at it again and debugging, I think that the behavior is understandable
> > and not a bug.
> > The reason is that the INSERTed data is first published though the
> > partitions, since initially there is no partitioned table in the
> > publication (so publish_via_partition_root=true doesn't have any
> > effect). But then adding the partitioned table to the publication and
> > refreshing the publication in the subscriber, the data is then
> > published "using the identity and schema of the partitioned table" due
> > to publish_via_partition_root=true. Note that the corresponding table
> > in the subscriber may well be a non-partitioned table (or the
> > partitions arranged differently) so the data does need to be
> > replicated again.
>
> I don't think this behavior is consistent, I mean for the initial sync
> we will replicate the duplicate data, whereas for later streaming we
> will only replicate it once.  From the user POW, this behavior doesn't
> look correct.
>

The scenario I gave steps for didn't have any table data when the
subscription was made, so the initial sync did not replicate any data.
I was referring to the double-publish that occurs when
publish_via_partition_root=true and then the partitioned table is
added to the publication and the subscriber does ALTER SUBSCRIPTION
... REFRESH PUBLICATION.
If I modify my example to include both the partitioned table and
(explicitly) its child partitions in the publication, and insert some
data on the publisher side prior to the subscription, then I am seeing
duplicate data on the initial sync on the subscriber side, and I would
agree that this doesn't seem correct.

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Wed, Oct 20, 2021 at 7:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > > Actually, at least with the scenario I gave steps for, after looking
> > > at it again and debugging, I think that the behavior is understandable
> > > and not a bug.
> > > The reason is that the INSERTed data is first published though the
> > > partitions, since initially there is no partitioned table in the
> > > publication (so publish_via_partition_root=true doesn't have any
> > > effect). But then adding the partitioned table to the publication and
> > > refreshing the publication in the subscriber, the data is then
> > > published "using the identity and schema of the partitioned table" due
> > > to publish_via_partition_root=true. Note that the corresponding table
> > > in the subscriber may well be a non-partitioned table (or the
> > > partitions arranged differently) so the data does need to be
> > > replicated again.
> >
>
> Even if the partitions are arranged differently why would the user
> expect the same data to be replicated twice?
>

It's the same data, but published in different ways because of changes
the user made to the publication.
I am not talking in general, I am specifically referring to the
scenario I gave steps for.
In the example scenario I gave, initially when the subscription was
made, the publication just explicitly included the partitions, but
publish_via_partition_root was true. So in this case it publishes
through the individual partitions (as no partitioned table is present
in the publication). Then on the publisher side, the partitioned table
was then added to the publication and then ALTER SUBSCRIPTION ...
REFRESH PUBLICATION done on the subscriber side. Now that the
partitioned table is present in the publication and
publish_via_partition_root is true, it is "published using the
identity and schema of the partitioned table rather than that of the
individual partitions that are actually changed". So the data is
replicated again.
This scenario didn't use initial table data, so initial table sync
didn't come into play (although as I previously posted, I can see a
double-publish issue on initial sync if data is put in the table prior
to subscription and partitions have been explicitly added to the
publication).

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Oct 20, 2021 at 3:03 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Oct 20, 2021 at 7:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > > Actually, at least with the scenario I gave steps for, after looking
> > > > at it again and debugging, I think that the behavior is understandable
> > > > and not a bug.
> > > > The reason is that the INSERTed data is first published though the
> > > > partitions, since initially there is no partitioned table in the
> > > > publication (so publish_via_partition_root=true doesn't have any
> > > > effect). But then adding the partitioned table to the publication and
> > > > refreshing the publication in the subscriber, the data is then
> > > > published "using the identity and schema of the partitioned table" due
> > > > to publish_via_partition_root=true. Note that the corresponding table
> > > > in the subscriber may well be a non-partitioned table (or the
> > > > partitions arranged differently) so the data does need to be
> > > > replicated again.
> > >
> >
> > Even if the partitions are arranged differently why would the user
> > expect the same data to be replicated twice?
> >
>
> It's the same data, but published in different ways because of changes
> the user made to the publication.
> I am not talking in general, I am specifically referring to the
> scenario I gave steps for.
> In the example scenario I gave, initially when the subscription was
> made, the publication just explicitly included the partitions, but
> publish_via_partition_root was true. So in this case it publishes
> through the individual partitions (as no partitioned table is present
> in the publication). Then on the publisher side, the partitioned table
> was then added to the publication and then ALTER SUBSCRIPTION ...
> REFRESH PUBLICATION done on the subscriber side. Now that the
> partitioned table is present in the publication and
> publish_via_partition_root is true, it is "published using the
> identity and schema of the partitioned table rather than that of the
> individual partitions that are actually changed". So the data is
> replicated again.
>

I don't see why data need to be replicated again even in that case.
Can you see any such duplicate data replicated for non-partitioned
tables?

> This scenario didn't use initial table data, so initial table sync
> didn't come into play
>

It will be equivalent to initial sync because the tablesync worker
would copy the entire data again in this case unless during refresh we
pass copy_data as false.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Wed, Oct 20, 2021 at 9:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I don't see why data need to be replicated again even in that case.
> Can you see any such duplicate data replicated for non-partitioned
> tables?
>

If my example is slightly modified to use the same-named tables on the
subscriber side, but without partitioning, i.e.:

PUB:

CREATE SCHEMA sch;
CREATE SCHEMA sch1;
CREATE TABLE sch.sale (sale_date date not null, country_code text,
product_sku text, units integer) PARTITION BY RANGE (sale_date);
CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
('2019-01-01') TO ('2019-02-01');
CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
('2019-02-01') TO ('2019-03-01');


SUB:

CREATE SCHEMA sch;
CREATE SCHEMA sch1;
CREATE TABLE sch.sale (sale_date date not null, country_code text,
product_sku text, units integer);
CREATE TABLE sch1.sale_201901 (sale_date date not null, country_code
text, product_sku text, units integer);
CREATE TABLE sch1.sale_201902 (sale_date date not null, country_code
text, product_sku text, units integer);

then the INSERTed data on the publisher side gets replicated to the
subscriber's "sch1.sale_201901" and "sch1.sale_201902" tables (only),
depending on the date values.
Now if the partitioned table is then added to the publication and
ALTER SUBSCRIPTION ... REFRESH PUBLICATION done by the subscriber,
then the current functionality is that the existing sch.sale data is
replicated (only) to the subscriber's "sch.sale" table (even though
data had been replicated previously to the "sch1.sale_201901" and
"sch1.sale_201902" tables, only).
So, just to be clear, you think that this current functionality isn't
correct (i.e. no data should be replicated on the REFRESH in this
case)?
I think it's debatable because here copy_data=true and sch.sale was
not a previously-subscribed table (so pre-existing data in that table
should be copied, in accordance with the current documentation).

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Hi,

I just wanted to bring to your attention an earlier thread [1] in
which I had already encountered/reported behaviour that is either
exactly the same or is closely related to what is being discussed in
this current thread. If it is different please take that into account
also.

------
[1] https://www.postgresql.org/message-id/CAHut%2BPvJMRB-ZyC80we2kiUFv4cVjmA6jxXpEMhm1rmz%3D1ryeA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Thu, Oct 21, 2021 at 2:56 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> I just wanted to bring to your attention an earlier thread [1] in
> which I had already encountered/reported behaviour that is either
> exactly the same or is closely related to what is being discussed in
> this current thread. If it is different please take that into account
> also.
>
> ------
> [1] https://www.postgresql.org/message-id/CAHut%2BPvJMRB-ZyC80we2kiUFv4cVjmA6jxXpEMhm1rmz%3D1ryeA%40mail.gmail.com
>

Thanks, I was able to reproduce that behavior, which is similar (but
in that case the publish_via_partition_root flag is toggled with the
partitioned table present in the publication, whereas in the case
being discussed the presence of the partitioned table in the
publication is toggled with publish_via_partition_root always true).

What seems to happen internally when a partitioned table is published
is that when publish_via_partition_root=true the subscriber to that
publication is effectively subscribed to the parent partitioned table
but not the child partitions. If publish_via_partition_root is changed
to false and the subscriber refreshes, the partitioned table is
unsubscribed and it subscribes to the partitions. This explains why
data gets "re-copied" when this happens, because then it is
subscribing to a "new" table and copy_data=true by default.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Oct 20, 2021 at 7:11 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Oct 20, 2021 at 9:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I don't see why data need to be replicated again even in that case.
> > Can you see any such duplicate data replicated for non-partitioned
> > tables?
> >
>
> If my example is slightly modified to use the same-named tables on the
> subscriber side, but without partitioning, i.e.:
>
> PUB:
>
> CREATE SCHEMA sch;
> CREATE SCHEMA sch1;
> CREATE TABLE sch.sale (sale_date date not null, country_code text,
> product_sku text, units integer) PARTITION BY RANGE (sale_date);
> CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
> ('2019-01-01') TO ('2019-02-01');
> CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
> ('2019-02-01') TO ('2019-03-01');
>
>
> SUB:
>
> CREATE SCHEMA sch;
> CREATE SCHEMA sch1;
> CREATE TABLE sch.sale (sale_date date not null, country_code text,
> product_sku text, units integer);
> CREATE TABLE sch1.sale_201901 (sale_date date not null, country_code
> text, product_sku text, units integer);
> CREATE TABLE sch1.sale_201902 (sale_date date not null, country_code
> text, product_sku text, units integer);
>
> then the INSERTed data on the publisher side gets replicated to the
> subscriber's "sch1.sale_201901" and "sch1.sale_201902" tables (only),
> depending on the date values.
> Now if the partitioned table is then added to the publication and
> ALTER SUBSCRIPTION ... REFRESH PUBLICATION done by the subscriber,
> then the current functionality is that the existing sch.sale data is
> replicated (only) to the subscriber's "sch.sale" table (even though
> data had been replicated previously to the "sch1.sale_201901" and
> "sch1.sale_201902" tables, only).
> So, just to be clear, you think that this current functionality isn't
> correct (i.e. no data should be replicated on the REFRESH in this
> case)?
>

Right, I don't think it is correct because it will behave differently
when the tables on the subscriber are partitioned. Also, the idea I
speculated in one of my above emails should be able to deal with this
case.

> I think it's debatable because here copy_data=true and sch.sale was
> not a previously-subscribed table (so pre-existing data in that table
> should be copied, in accordance with the current documentation).
>

What about the partition (child) table? In this case, the same data
will be present in two tables sch.sale and sch1.sale_201901 after you
have refreshed the publication, and then any future insertions will
only be inserted into parent table sch.sale in this case which doesn't
sound consistent. The bigger problem is that it will lead to duplicate
data when tables are partitioned. I think if the user really wants to
do in a way you are describing, there is no need to keep sub-tables
(*_201901 and *__201902). I understand that it depends on the use case
but we should also behave sanely when tables/partitions are created in
the same way in both publisher and subscriber which I guess will most
likely be the case.

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"shiy.fnst@fujitsu.com"
Дата:
On Tuesday, October 19, 2021 10:47 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> 
> On Monday, October 18, 2021 5:03 PM Amit Langote
> <amitlangote09@gmail.com> wrote:
> > I can imagine that the behavior seen here may look surprising, but not
> > sure if I would call it a bug as such.  I do remember thinking about
> > this case and the current behavior is how I may have coded it to be.
> >
> > Looking at this command in Hou-san's email:
> >
> >   create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);
> >
> > It's adding both the root partitioned table and the leaf partition
> > *explicitly*, and it's not clear to me if the latter's inclusion in
> > the publication should be assumed because the former is found to have
> > been added to the publication, that is, as far as the latter's
> > visibility to the subscriber is concerned.  It's not a stretch to
> > imagine that a user may write the command this way to account for a
> > subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
> >
> > I don't think we assume anything on the publisher side regarding the
> > state/configuration of tables on the subscriber side, at least with
> > publication commands where tables are added to a publication
> > explicitly, so it is up to the user to make sure that the tables are
> > not added duplicatively.  One may however argue that the way we've
> > decided to handle FOR ALL TABLES does assume something about
> > partitions where it skips advertising them to subscribers when
> > publish_via_partition_root flag is set to true, but that is exactly to
> > avoid the duplication of data that goes to a subscriber.
> 
> Hi,
> 
> Thanks for the explanation.
> 
> I think one reason that I consider this behavior a bug is that: If we add
> both the root partitioned table and the leaf partition explicitly to the
> publication (and set publish_via_partition_root = on), the behavior of the
> apply worker is inconsistent with the behavior of table sync worker.
> 
> In this case, all changes in the leaf the partition will be applied using the
> identity and schema of the partitioned(root) table. But for the table sync, it
> will execute table sync for both the leaf and the root table which cause
> duplication of data.
> 
> Wouldn't it be better to make the behavior consistent here ?
> 

I agree with this point. 

About this case,

> >   create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);

As a user, although partitioned table includes the partition, publishing partitioned
table and its partition is allowed. So, I think we should take this case into
consideration. Initial data is copied once via the parent table seems reasonable.

Regards
Shi yu

RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
Hi,

As there are basically two separate issues mentioned in the thread, I tried to
summarize the discussion so far which might be helpful to others.

* The first issue[1]:

If we include both the partitioned table and (explicitly) its child partitions
in the publication when set publish_via_partition_root=true, like:
---
CREATE PUBLICATION pub FOR TABLE parent_table, child_table with (publish_via_partition_root=on);
---
It could execute initial sync for both the partitioned(parent_table) table and
(explicitly) its child partitions(child_table) which cause duplication of
data in partition(child_table) in subscriber side.

The reasons I considered this behavior a bug are:

a) In this case, the behavior of initial sync is inconsistent with the behavior
of transaction streaming. All changes in the leaf the partition will be applied
using the identity and schema of the partitioned(root) table. But for the
initial sync, it will execute sync for both the partitioned(root) table and
(explicitly) its child partitions which cause duplication of data.

b) The behavior of FOR TABLE is inconsistent with the behavior of FOR ALL TABLE.
If user create a FOR ALL TABLE publication and set publish_via_partition_root=true,
then only the top most partitioned(root) table will execute initial sync.

IIRC, most people in this thread agreed that the current behavior is not
expected. So, maybe it's time to try to fix it.

Attach my fix patch here. The patch try to fix this issue by making the
pg_publication_tables view only show partitioned table when
publish_via_partition_root is true.


* The second issue[2]:
-----
CREATE TABLE sale (sale_date date not null,country_code text, product_sku text,
units integer) PARTITION BY RANGE (sale_date);
CREATE TABLE sale_201901 PARTITION OF sale FOR VALUES FROM ('2019-01-01') TO
('2019-02-01');
CREATE TABLE sale_201902 PARTITION OF sale FOR VALUES FROM ('2019-02-01') TO
('2019-03-01');

(1) PUB:  CREATE PUBLICATION pub FOR TABLE sale_201901,
sale_201902 WITH (publish_via_partition_root=true);
(2) SUB:  CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost port=5432' PUBLICATION pub;
(3) PUB:  INSERT INTO sale VALUES('2019-01-01', 'AU', 'cpu', 5), ('2019-01-02', 'AU', 'disk', 8);
(4) SUB:  SELECT * FROM sale;
(5) PUB:  ALTER PUBLICATION pub ADD TABLE sale;
(6) SUB:  ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
(7) SUB:  SELECT * FROM sale;
-----

In step (7), we can see duplication of data.

The reason is that the INSERTed data is first published though the partitions,
since initially there is no partitioned table in the publication (so
publish_via_partition_root=true doesn't have any effect). But then adding the
partitioned table to the publication and refreshing the publication in the
subscriber, the data is then published "using the identity and schema of the
partitioned table" due to publish_via_partition_root=true.
(Copied from Greg's analysis).

Whether this behavior is correct is still under debate.


Overall, I think the second issue still needs further discussion while the
first issue seems clear that most people think it's unexpected. So, I think it
might be better to fix the first issue.

[1]
https://www.postgresql.org/message-id/OS0PR01MB57167F45D481F78CDC5986F794B99%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]
https://www.postgresql.org/message-id/flat/CAJcOf-d8SWk3z3fJaLW9yuVux%3D2ESTsXOSdKzCq1O3AWBpgnMQ%40mail.gmail.com#fc96a42158b5e98ace26d077a6f7eac5

Best regards,
Hou zj

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Langote
Дата:
On Thu, Oct 28, 2021 at 4:35 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
> As there are basically two separate issues mentioned in the thread, I tried to
> summarize the discussion so far which might be helpful to others.
>
> * The first issue[1]:
>
> If we include both the partitioned table and (explicitly) its child partitions
> in the publication when set publish_via_partition_root=true, like:
> ---
> CREATE PUBLICATION pub FOR TABLE parent_table, child_table with (publish_via_partition_root=on);
> ---
> It could execute initial sync for both the partitioned(parent_table) table and
> (explicitly) its child partitions(child_table) which cause duplication of
> data in partition(child_table) in subscriber side.
>
> The reasons I considered this behavior a bug are:
>
> a) In this case, the behavior of initial sync is inconsistent with the behavior
> of transaction streaming. All changes in the leaf the partition will be applied
> using the identity and schema of the partitioned(root) table. But for the
> initial sync, it will execute sync for both the partitioned(root) table and
> (explicitly) its child partitions which cause duplication of data.
>
> b) The behavior of FOR TABLE is inconsistent with the behavior of FOR ALL TABLE.
> If user create a FOR ALL TABLE publication and set publish_via_partition_root=true,
> then only the top most partitioned(root) table will execute initial sync.
>
> IIRC, most people in this thread agreed that the current behavior is not
> expected. So, maybe it's time to try to fix it.
>
> Attach my fix patch here. The patch try to fix this issue by making the
> pg_publication_tables view only show partitioned table when
> publish_via_partition_root is true.
>
>
> * The second issue[2]:
> -----
> CREATE TABLE sale (sale_date date not null,country_code text, product_sku text,
> units integer) PARTITION BY RANGE (sale_date);
> CREATE TABLE sale_201901 PARTITION OF sale FOR VALUES FROM ('2019-01-01') TO
> ('2019-02-01');
> CREATE TABLE sale_201902 PARTITION OF sale FOR VALUES FROM ('2019-02-01') TO
> ('2019-03-01');
>
> (1) PUB:  CREATE PUBLICATION pub FOR TABLE sale_201901,
> sale_201902 WITH (publish_via_partition_root=true);
> (2) SUB:  CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost port=5432' PUBLICATION pub;
> (3) PUB:  INSERT INTO sale VALUES('2019-01-01', 'AU', 'cpu', 5), ('2019-01-02', 'AU', 'disk', 8);
> (4) SUB:  SELECT * FROM sale;
> (5) PUB:  ALTER PUBLICATION pub ADD TABLE sale;
> (6) SUB:  ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
> (7) SUB:  SELECT * FROM sale;
> -----
>
> In step (7), we can see duplication of data.
>
> The reason is that the INSERTed data is first published though the partitions,
> since initially there is no partitioned table in the publication (so
> publish_via_partition_root=true doesn't have any effect). But then adding the
> partitioned table to the publication and refreshing the publication in the
> subscriber, the data is then published "using the identity and schema of the
> partitioned table" due to publish_via_partition_root=true.
> (Copied from Greg's analysis).
>
> Whether this behavior is correct is still under debate.
>
>
> Overall, I think the second issue still needs further discussion while the
> first issue seems clear that most people think it's unexpected. So, I think it
> might be better to fix the first issue.

Thanks for the summary, Hou-san, and sorry about my late reply.

I had thought about this some last week and I am coming around to
recognizing the confusing user experience of the current behavior.
Though, I am not sure if removing partitions from the result of
pg_publication_tables view for pubviaroot publications is acceptable
as a fix, because that prevents replicating into a subscriber node
where tables that are partition root and a partition respectively on
the publisher are independent tables on the subscriber.  ISTM that
what Amit K mentioned in his first reply may be closer to what we may
ultimately need to do, which is this:

"I think one possible idea to investigate is that on the
subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa."

I had also thought about a way to implement that a bit and part of
that is to make pg_publication_tables always expose leaf partitions,
that is, even if pubviaroot is true for a given publication.  So, when
puviaroot is true we include the partition root actually mentioned in
the publication as we do currently (changes streamed after initial
sync will always use its schema so the subscriber better know about
it), and also leaf partitions which we currently don't.  The latter
too so that a subscriber can determine which leaf partitions to sync
and which ones to not based on whether a given leaf partitions is
already known to the subscriber, which means their pg_subscription_rel
entries would need to be made which we don't currently.  Leaf
partition state entries will not be referred to after all of the known
ones are synced, because the subsequent changes will be streamed with
the root's schema.  The initial sync worker would need to be taught to
skip processing any partitioned tables that it sees, because now we'd
be handling the initial sync part via individual leaf partitions.

Though the thing that makes this a bit tricky to implement is that the
pg_publication_tables view exposes way less information to be able to
determine how the tables listed are related to each other.  So I think
we'd need to fix the view to return leaf partitions' owning root table
(if they're published through a pubviaroot publication) or rewrite the
query that fetch_table_list() uses to access the underlying catalogs
directly in the back-ported version.

That is roughly how I'd go about this.  Thoughts?

-- 
Amit Langote
EDB: http://www.enterprisedb.com



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Thu, Oct 28, 2021 at 1:55 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Thu, Oct 28, 2021 at 4:35 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > As there are basically two separate issues mentioned in the thread, I tried to
> > summarize the discussion so far which might be helpful to others.
> >
> > * The first issue[1]:
> >
> > If we include both the partitioned table and (explicitly) its child partitions
> > in the publication when set publish_via_partition_root=true, like:
> > ---
> > CREATE PUBLICATION pub FOR TABLE parent_table, child_table with (publish_via_partition_root=on);
> > ---
> > It could execute initial sync for both the partitioned(parent_table) table and
> > (explicitly) its child partitions(child_table) which cause duplication of
> > data in partition(child_table) in subscriber side.
> >
> > The reasons I considered this behavior a bug are:
> >
> > a) In this case, the behavior of initial sync is inconsistent with the behavior
> > of transaction streaming. All changes in the leaf the partition will be applied
> > using the identity and schema of the partitioned(root) table. But for the
> > initial sync, it will execute sync for both the partitioned(root) table and
> > (explicitly) its child partitions which cause duplication of data.
> >
> > b) The behavior of FOR TABLE is inconsistent with the behavior of FOR ALL TABLE.
> > If user create a FOR ALL TABLE publication and set publish_via_partition_root=true,
> > then only the top most partitioned(root) table will execute initial sync.
> >
> > IIRC, most people in this thread agreed that the current behavior is not
> > expected. So, maybe it's time to try to fix it.
> >
> > Attach my fix patch here. The patch try to fix this issue by making the
> > pg_publication_tables view only show partitioned table when
> > publish_via_partition_root is true.
> >
> >
> > * The second issue[2]:
> > -----
> > CREATE TABLE sale (sale_date date not null,country_code text, product_sku text,
> > units integer) PARTITION BY RANGE (sale_date);
> > CREATE TABLE sale_201901 PARTITION OF sale FOR VALUES FROM ('2019-01-01') TO
> > ('2019-02-01');
> > CREATE TABLE sale_201902 PARTITION OF sale FOR VALUES FROM ('2019-02-01') TO
> > ('2019-03-01');
> >
> > (1) PUB:  CREATE PUBLICATION pub FOR TABLE sale_201901,
> > sale_201902 WITH (publish_via_partition_root=true);
> > (2) SUB:  CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost port=5432' PUBLICATION pub;
> > (3) PUB:  INSERT INTO sale VALUES('2019-01-01', 'AU', 'cpu', 5), ('2019-01-02', 'AU', 'disk', 8);
> > (4) SUB:  SELECT * FROM sale;
> > (5) PUB:  ALTER PUBLICATION pub ADD TABLE sale;
> > (6) SUB:  ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
> > (7) SUB:  SELECT * FROM sale;
> > -----
> >
> > In step (7), we can see duplication of data.
> >
> > The reason is that the INSERTed data is first published though the partitions,
> > since initially there is no partitioned table in the publication (so
> > publish_via_partition_root=true doesn't have any effect). But then adding the
> > partitioned table to the publication and refreshing the publication in the
> > subscriber, the data is then published "using the identity and schema of the
> > partitioned table" due to publish_via_partition_root=true.
> > (Copied from Greg's analysis).
> >
> > Whether this behavior is correct is still under debate.
> >
> >
> > Overall, I think the second issue still needs further discussion while the
> > first issue seems clear that most people think it's unexpected. So, I think it
> > might be better to fix the first issue.
>
> Thanks for the summary, Hou-san, and sorry about my late reply.
>
> I had thought about this some last week and I am coming around to
> recognizing the confusing user experience of the current behavior.
> Though, I am not sure if removing partitions from the result of
> pg_publication_tables view for pubviaroot publications is acceptable
> as a fix, because that prevents replicating into a subscriber node
> where tables that are partition root and a partition respectively on
> the publisher are independent tables on the subscriber.
>

But we already do that way when the publication is "For All Tables".
Anyway, for the purpose of initial sync, it will just replicate the
same data in two different tables if the corresponding tables on the
subscriber-side are non-partitioned which I am not sure is what the
user will be expecting.

>  ISTM that
> what Amit K mentioned in his first reply may be closer to what we may
> ultimately need to do, which is this:
>
> "I think one possible idea to investigate is that on the
> subscriber-side, after fetching tables, we check the already
> subscribed tables and if the child tables already exist then we ignore
> the parent table and vice versa."
>
> I had also thought about a way to implement that a bit and part of
> that is to make pg_publication_tables always expose leaf partitions,
> that is, even if pubviaroot is true for a given publication.  So, when
> puviaroot is true we include the partition root actually mentioned in
> the publication as we do currently (changes streamed after initial
> sync will always use its schema so the subscriber better know about
> it), and also leaf partitions which we currently don't.  The latter
> too so that a subscriber can determine which leaf partitions to sync
> and which ones to not based on whether a given leaf partitions is
> already known to the subscriber, which means their pg_subscription_rel
> entries would need to be made which we don't currently.  Leaf
> partition state entries will not be referred to after all of the known
> ones are synced, because the subsequent changes will be streamed with
> the root's schema.  The initial sync worker would need to be taught to
> skip processing any partitioned tables that it sees, because now we'd
> be handling the initial sync part via individual leaf partitions.
>

Yeah, we can do something like this as well but my guess is that this
will be a bit complicated. OTOH, if we adopt Hou-san's patch for the
first problem as described in his previous email then I think the
solution for the second problem could be simpler than what you
described above.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Nov 3, 2021 at 3:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 28, 2021 at 1:55 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> >
> > Thanks for the summary, Hou-san, and sorry about my late reply.
> >
> > I had thought about this some last week and I am coming around to
> > recognizing the confusing user experience of the current behavior.
> > Though, I am not sure if removing partitions from the result of
> > pg_publication_tables view for pubviaroot publications is acceptable
> > as a fix, because that prevents replicating into a subscriber node
> > where tables that are partition root and a partition respectively on
> > the publisher are independent tables on the subscriber.
> >
>
> But we already do that way when the publication is "For All Tables".
> Anyway, for the purpose of initial sync, it will just replicate the
> same data in two different tables if the corresponding tables on the
> subscriber-side are non-partitioned which I am not sure is what the
> user will be expecting.
>

On further thinking about this, I think we should define the behavior
of replication among partitioned (on the publisher) and
non-partitioned (on the subscriber) tables a bit more clearly.

- If the "publish_via_partition_root" is set for a publication then we
can always replicate to the table with the same name as the root table
in publisher.
- If the "publish_via_partition_root" is *not* set for a publication
then we can always replicate to the tables with the same name as the
non-root tables in publisher.

Thoughts?

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Thu, Nov 4, 2021 at 3:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On further thinking about this, I think we should define the behavior
> of replication among partitioned (on the publisher) and
> non-partitioned (on the subscriber) tables a bit more clearly.
>
> - If the "publish_via_partition_root" is set for a publication then we
> can always replicate to the table with the same name as the root table
> in publisher.
> - If the "publish_via_partition_root" is *not* set for a publication
> then we can always replicate to the tables with the same name as the
> non-root tables in publisher.
>
> Thoughts?
>

I'd adjust that wording slightly, because "we can always replicate to
..." sounds a bit vague, and saying that an option is set or not set
could be misinterpreted, as the option could be "set" to false.

How about:

- If "publish_via_partition_root" is true for a publication, then data
is replicated to the table with the same name as the root (i.e.
partitioned) table in the publisher.
- If "publish_via_partition_root" is false (the default) for a
publication, then data is replicated to tables with the same name as
the non-root (i.e. partition) tables in the publisher.

?

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Thu, Nov 4, 2021 at 12:23 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Nov 4, 2021 at 3:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On further thinking about this, I think we should define the behavior
> > of replication among partitioned (on the publisher) and
> > non-partitioned (on the subscriber) tables a bit more clearly.
> >
> > - If the "publish_via_partition_root" is set for a publication then we
> > can always replicate to the table with the same name as the root table
> > in publisher.
> > - If the "publish_via_partition_root" is *not* set for a publication
> > then we can always replicate to the tables with the same name as the
> > non-root tables in publisher.
> >
> > Thoughts?
> >
>
> I'd adjust that wording slightly, because "we can always replicate to
> ..." sounds a bit vague, and saying that an option is set or not set
> could be misinterpreted, as the option could be "set" to false.
>
> How about:
>
> - If "publish_via_partition_root" is true for a publication, then data
> is replicated to the table with the same name as the root (i.e.
> partitioned) table in the publisher.
> - If "publish_via_partition_root" is false (the default) for a
> publication, then data is replicated to tables with the same name as
> the non-root (i.e. partition) tables in the publisher.
>

Sounds good to me. If we follow this then I think the patch by Hou-San
is good to solve the first problem as described in his last email [1]?

[1] -
https://www.postgresql.org/message-id/OS0PR01MB5716C756312959F293A822C794869%40OS0PR01MB5716.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:

On Thu, Nov 4, 2021 at 7:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 4, 2021 at 12:23 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Thu, Nov 4, 2021 at 3:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On further thinking about this, I think we should define the behavior
> > > of replication among partitioned (on the publisher) and
> > > non-partitioned (on the subscriber) tables a bit more clearly.
> > >
> > > - If the "publish_via_partition_root" is set for a publication then we
> > > can always replicate to the table with the same name as the root table
> > > in publisher.
> > > - If the "publish_via_partition_root" is *not* set for a publication
> > > then we can always replicate to the tables with the same name as the
> > > non-root tables in publisher.
> > >
> > > Thoughts?
> > >
> >
> > I'd adjust that wording slightly, because "we can always replicate to
> > ..." sounds a bit vague, and saying that an option is set or not set
> > could be misinterpreted, as the option could be "set" to false.
> >
> > How about:
> >
> > - If "publish_via_partition_root" is true for a publication, then data
> > is replicated to the table with the same name as the root (i.e.
> > partitioned) table in the publisher.
> > - If "publish_via_partition_root" is false (the default) for a
> > publication, then data is replicated to tables with the same name as
> > the non-root (i.e. partition) tables in the publisher.
> >
>
> Sounds good to me. If we follow this then I think the patch by Hou-San
> is good to solve the first problem as described in his last email [1]?
>
> [1] - https://www.postgresql.org/message-id/OS0PR01MB5716C756312959F293A822C794869%40OS0PR01MB5716.jpnprd01.prod.outlook.com
>

Almost.
The patch does seem to solve that first problem (double publish on tablesync).
I used the following test (taken from [2]), and variations of it:  

--- Setup
create schema sch1;
create schema sch2;
create table sch1.tbl1 (a int) partition by range (a);
create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
create table sch2.tbl1_part2 partition of sch1.tbl1 for values from
(10) to (20);
create schema sch3;
create table sch3.t1(c1 int);

--- Publication
create publication pub1 for all tables in schema sch3, table
sch1.tbl1, table sch2.tbl1_part1 with ( publish_via_partition_root=on);
insert into sch1.tbl1 values(1);
insert into sch1.tbl1 values(11);
insert into sch3.t1 values(1);

---- Subscription
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=5432' PUBLICATION pub1;


[2] - https://postgr.es/m/CALDaNm3vxjPMMSrVDNK0f8UWP+EQ5ry14xfEukmXsVg_UcwZNA@mail.gmail.com


However, there did still seem to be a problem, if publish_via_partition_root is then set to false; it seems that can result in duplicate partition entries in the pg_publication_tables view, see below (this follows on from the test scenario given above):

postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename
---------+------------+-----------
 pub1    | sch1       | tbl1
 pub1    | sch3       | t1
(2 rows)

postgres=#  alter publication pub1 set (publish_via_partition_root=false);
ALTER PUBLICATION
postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename  
---------+------------+------------
 pub1    | sch2       | tbl1_part1
 pub1    | sch2       | tbl1_part2
 pub1    | sch2       | tbl1_part1
 pub1    | sch3       | t1
(4 rows)

So I think the patch would need to be updated to prevent that.

Regards,
Greg Nancarrow
Fujitsu Australia

RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Friday, November 5, 2021 11:20 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>On Thu, Nov 4, 2021 at 7:10 PM Amit Kapila <mailto:amit.kapila16@gmail.com> wrote:
>>
>> On Thu, Nov 4, 2021 at 12:23 PM Greg Nancarrow <mailto:gregn4422@gmail.com> wrote:
>> >
>> > On Thu, Nov 4, 2021 at 3:13 PM Amit Kapila <mailto:amit.kapila16@gmail.com> wrote:
>> > >
>> > > On further thinking about this, I think we should define the behavior
>> > > of replication among partitioned (on the publisher) and
>> > > non-partitioned (on the subscriber) tables a bit more clearly.
>> > >
>> > > - If the "publish_via_partition_root" is set for a publication then we
>> > > can always replicate to the table with the same name as the root table
>> > > in publisher.
>> > > - If the "publish_via_partition_root" is *not* set for a publication
>> > > then we can always replicate to the tables with the same name as the
>> > > non-root tables in publisher.
>> > >
>> > > Thoughts?
>> > >
>> >
>> > I'd adjust that wording slightly, because "we can always replicate to
>> > ..." sounds a bit vague, and saying that an option is set or not set
>> > could be misinterpreted, as the option could be "set" to false.
>> >
>> > How about:
>> >
>> > - If "publish_via_partition_root" is true for a publication, then data
>> > is replicated to the table with the same name as the root (i.e.
>> > partitioned) table in the publisher.
>> > - If "publish_via_partition_root" is false (the default) for a
>> > publication, then data is replicated to tables with the same name as
>> > the non-root (i.e. partition) tables in the publisher.
>> >
>>
>> Sounds good to me. If we follow this then I think the patch by Hou-San
>> is good to solve the first problem as described in his last email [1]?
>>
>> [1] -
https://www.postgresql.org/message-id/OS0PR01MB5716C756312959F293A822C794869%40OS0PR01MB5716.jpnprd01.prod.outlook.com
>>
>
>Almost.
>The patch does seem to solve that first problem (double publish on tablesync).
>I used the following test (taken from [2]), and variations of it:  
>
>However, there did still seem to be a problem, if publish_via_partition_root is then set to false; it seems that can
resultin 
 
>duplicate partition entries in the pg_publication_tables view, see below (this follows on from the test scenario given
above):
>
>postgres=# select * from pg_publication_tables;
> pubname | schemaname | tablename 
>---------+------------+-----------
> pub1    | sch1       | tbl1
> pub1    | sch3       | t1
>(2 rows)
>
>postgres=#  alter publication pub1 set (publish_via_partition_root=false);
>ALTER PUBLICATION
>postgres=# select * from pg_publication_tables;
> pubname | schemaname | tablename  
>---------+------------+------------
> pub1    | sch2       | tbl1_part1
> pub1    | sch2       | tbl1_part2
> pub1    | sch2       | tbl1_part1
> pub1    | sch3       | t1
>(4 rows)
>
>So I think the patch would need to be updated to prevent that.

Thanks for testing the patch.

The reason of the duplicate output is that:
The existing function GetPublicationRelations doesn't de-duplicate the output
oid list. So, when adding both child and parent table to the
publication(pubviaroot = false), the pg_publication_tables view will output
duplicate partition.

Attach the fix patch.
0001 fix data double publish(first issue in this thread)
0002 fix duplicate partition in view pg_publication_tables(reported by greg when testing the 0001 patch)

About the fix for second issue in this thread.
> "I think one possible idea to investigate is that on the 
> subscriber-side, after fetching tables, we check the already 
> subscribed tables and if the child tables already exist then we ignore 
> the parent table and vice versa."

When looking into how to fix the second issue, I have a question:

After changing publish_via_partition_root from false to true, the
subcriber will fetch the partitioned table from publisher when refreshing.

In subsriber side, If all the child tables of the partitioned table already
subscribed, then we can just skip the table sync for the partitioned table. But
if only some of the child tables(not all child tables) were already subscribed,
should we skip the partitioned table's table sync ? I am not sure about the
appropriate behavior here.

What do you think ?

Best regards,
Hou zj





Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Thu, Nov 11, 2021 at 5:52 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> When looking into how to fix the second issue, I have a question:
>
> After changing publish_via_partition_root from false to true, the
> subcriber will fetch the partitioned table from publisher when refreshing.
>
> In subsriber side, If all the child tables of the partitioned table already
> subscribed, then we can just skip the table sync for the partitioned table. But
> if only some of the child tables(not all child tables) were already subscribed,
> should we skip the partitioned table's table sync ? I am not sure about the
> appropriate behavior here.
>
> What do you think ?
>

I'm not sure you can skip the partitioned table's table sync as you
are suggesting, because on the subscriber side, the tables are mapped
by name, so what is a partitioned table on the publisher side might
not be a partitioned table on the subscriber side (e.g. might be an
ordinary table; and similarly for the partitions) or it might be
partitioned differently to that on the publisher side. (I might be
wrong here, and I don't have a good solution, but I can see the
potential for inconsistent data resulting in this case, unless say,
the subscriber "child tables" are first truncated on the refresh, if
they are in fact partitions of the root, and then the table sync
publishes the existing data via the root)

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Thu, Nov 11, 2021 at 1:44 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Nov 11, 2021 at 5:52 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > When looking into how to fix the second issue, I have a question:
> >
> > After changing publish_via_partition_root from false to true, the
> > subcriber will fetch the partitioned table from publisher when refreshing.
> >
> > In subsriber side, If all the child tables of the partitioned table already
> > subscribed, then we can just skip the table sync for the partitioned table. But
> > if only some of the child tables(not all child tables) were already subscribed,
> > should we skip the partitioned table's table sync ? I am not sure about the
> > appropriate behavior here.
> >
> > What do you think ?
> >
>
> I'm not sure you can skip the partitioned table's table sync as you
> are suggesting, because on the subscriber side, the tables are mapped
> by name, so what is a partitioned table on the publisher side might
> not be a partitioned table on the subscriber side (e.g. might be an
> ordinary table; and similarly for the partitions) or it might be
> partitioned differently to that on the publisher side.
>

Sure, we don't know about that, or at least there is no such mapping
that is recorded. So, I think we should skip it even if any one of the
child table is present.

> (I might be
> wrong here, and I don't have a good solution, but I can see the
> potential for inconsistent data resulting in this case, unless say,
> the subscriber "child tables" are first truncated on the refresh, if
> they are in fact partitions of the root, and then the table sync
> publishes the existing data via the root)
>

Do you want to say current behavior for this case where data is copied
twice is okay? I think we need to find a way to handle this and
document it to set the expectations right.


-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Thu, Nov 11, 2021 at 12:22 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, November 5, 2021 11:20 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >On Thu, Nov 4, 2021 at 7:10 PM Amit Kapila <mailto:amit.kapila16@gmail.com> wrote:
> >
> >Almost.
> >The patch does seem to solve that first problem (double publish on tablesync).
> >I used the following test (taken from [2]), and variations of it:
> >
> >However, there did still seem to be a problem, if publish_via_partition_root is then set to false; it seems that can
resultin
 
> >duplicate partition entries in the pg_publication_tables view, see below (this follows on from the test scenario
givenabove):
 
> >
> >postgres=# select * from pg_publication_tables;
> > pubname | schemaname | tablename
> >---------+------------+-----------
> > pub1    | sch1       | tbl1
> > pub1    | sch3       | t1
> >(2 rows)
> >
> >postgres=#  alter publication pub1 set (publish_via_partition_root=false);
> >ALTER PUBLICATION
> >postgres=# select * from pg_publication_tables;
> > pubname | schemaname | tablename
> >---------+------------+------------
> > pub1    | sch2       | tbl1_part1
> > pub1    | sch2       | tbl1_part2
> > pub1    | sch2       | tbl1_part1
> > pub1    | sch3       | t1
> >(4 rows)
> >
> >So I think the patch would need to be updated to prevent that.
>
> Thanks for testing the patch.
>
> The reason of the duplicate output is that:
> The existing function GetPublicationRelations doesn't de-duplicate the output
> oid list. So, when adding both child and parent table to the
> publication(pubviaroot = false), the pg_publication_tables view will output
> duplicate partition.
>
> Attach the fix patch.
> 0001 fix data double publish(first issue in this thread)
> 0002 fix duplicate partition in view pg_publication_tables(reported by greg when testing the 0001 patch)
>

Can we start a separate thread to discuss the 0002 patch as that
doesn't seem directly to duplicate data issues being discussed here?
Please specify the exact test in the email as that would make it
easier to understand the problem.

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Friday, November 12, 2021 12:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Thu, Nov 11, 2021 at 12:22 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Friday, November 5, 2021 11:20 AM Greg Nancarrow
> <gregn4422@gmail.com> wrote:
> > >On Thu, Nov 4, 2021 at 7:10 PM Amit Kapila
> <mailto:amit.kapila16@gmail.com> wrote:
> > >
> > >Almost.
> > >The patch does seem to solve that first problem (double publish on
> tablesync).
> > >I used the following test (taken from [2]), and variations of it:
> > >
> > >However, there did still seem to be a problem, if
> > >publish_via_partition_root is then set to false; it seems that can result in
> duplicate partition entries in the pg_publication_tables view, see below (this
> follows on from the test scenario given above):
> > >
> > >postgres=# select * from pg_publication_tables;  pubname | schemaname
> > >| tablename
> > >---------+------------+-----------
> > > pub1    | sch1       | tbl1
> > > pub1    | sch3       | t1
> > >(2 rows)
> > >
> > >postgres=#  alter publication pub1 set
> > >(publish_via_partition_root=false);
> > >ALTER PUBLICATION
> > >postgres=# select * from pg_publication_tables;  pubname | schemaname
> > >| tablename
> > >---------+------------+------------
> > > pub1    | sch2       | tbl1_part1
> > > pub1    | sch2       | tbl1_part2
> > > pub1    | sch2       | tbl1_part1
> > > pub1    | sch3       | t1
> > >(4 rows)
> > >
> > >So I think the patch would need to be updated to prevent that.
> >
> > Thanks for testing the patch.
> >
> > The reason of the duplicate output is that:
> > The existing function GetPublicationRelations doesn't de-duplicate the
> > output oid list. So, when adding both child and parent table to the
> > publication(pubviaroot = false), the pg_publication_tables view will
> > output duplicate partition.
> >
> > Attach the fix patch.
> > 0001 fix data double publish(first issue in this thread)
> > 0002 fix duplicate partition in view pg_publication_tables(reported by
> > greg when testing the 0001 patch)
> >
> 
> Can we start a separate thread to discuss the 0002 patch as that doesn't seem
> directly to duplicate data issues being discussed here?
> Please specify the exact test in the email as that would make it easier to
> understand the problem.

Thanks for the suggestion.
I have started a new thread about this issue[1].

[1]
https://www.postgresql.org/message-id/OS0PR01MB5716E97F00732B52DC2BBC2594989%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best regards,
Hou zj

RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Thursday, November 11, 2021 2:53 PM houzj.fnst@fujitsu.com wrote:
> Attach the fix patch.
> 0001 fix data double publish(first issue in this thread)

In another thread[1], Amit L suggested that it'd be nice to add a testcase in
src/test/subscription/. So, attach a new version patch which add a testcase in
t/013_partition.pl.

[1] https://www.postgresql.org/message-id/CA%2BHiwqEjV%3D7iEW8hxnr73pWsDQuonDPLgsxXTYDQzDA7W9vrmw%40mail.gmail.com

Best regards,
Hou zj


Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Nov 29, 2021 at 2:21 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, November 11, 2021 2:53 PM houzj.fnst@fujitsu.com wrote:
> > Attach the fix patch.
> > 0001 fix data double publish(first issue in this thread)
>
> In another thread[1], Amit L suggested that it'd be nice to add a testcase in
> src/test/subscription/. So, attach a new version patch which add a testcase in
> t/013_partition.pl.
>

Thanks, your patch looks good to me. I have slightly changed the
comments and commit message in the attached.

I think we should back-patch this but I am slightly worried that if
someone is dependent on the view pg_publication_tables to return both
parent and child tables for publications that have both of those
tables and published with publish_via_partition_root as true then this
might break his usage. But OTOH, I don't see why someone would do like
that and she might face some problems like what we are trying to solve
here.

Thoughts?

-- 
With Regards,
Amit Kapila.

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Wed, Dec 1, 2021 at 10:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Thanks, your patch looks good to me. I have slightly changed the
> comments and commit message in the attached.
>

I'd suggest tidying the patch comment a bit:

"We publish the child table's data twice for a publication that has both
child and parent tables and is published with publish_via_partition_root
as true. This happens because subscribers will initiate synchronization
using both parent and child tables, since it gets both as separate tables
in the initial table list."

Also, perhaps the following additional comment (or similar) could be
added to the pg_publication_tables documentation in catalogs.sgml:

For publications of partitioned tables with publish_via_partition_root
set to true, the partitioned table itself (rather than the individual
partitions) is included in the view.

> I think we should back-patch this but I am slightly worried ...

I'd be in favor of back-patching this.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Thu, Dec 2, 2021 at 4:51 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Dec 1, 2021 at 10:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> Also, perhaps the following additional comment (or similar) could be
> added to the pg_publication_tables documentation in catalogs.sgml:
>
> For publications of partitioned tables with publish_via_partition_root
> set to true, the partitioned table itself (rather than the individual
> partitions) is included in the view.
>

Okay, but I think it is better to add the behavior both when
publish_via_partition_root is set to true and false. As in the case of
false, it won't include the partitioned table itself.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Thu, Dec 2, 2021 at 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > For publications of partitioned tables with publish_via_partition_root
> > set to true, the partitioned table itself (rather than the individual
> > partitions) is included in the view.
> >
>
> Okay, but I think it is better to add the behavior both when
> publish_via_partition_root is set to true and false. As in the case of
> false, it won't include the partitioned table itself.
>

If you updated my original description to say "(instead of just the
individual partitions)", it would imply the same I think.
But I don't mind if you want to explicitly state both cases to make it clear.

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Langote
Дата:
On Wed, Dec 1, 2021 at 8:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Nov 29, 2021 at 2:21 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Thursday, November 11, 2021 2:53 PM houzj.fnst@fujitsu.com wrote:
> > > Attach the fix patch.
> > > 0001 fix data double publish(first issue in this thread)
> >
> > In another thread[1], Amit L suggested that it'd be nice to add a testcase in
> > src/test/subscription/. So, attach a new version patch which add a testcase in
> > t/013_partition.pl.
> >
>
> Thanks, your patch looks good to me. I have slightly changed the
> comments and commit message in the attached.

Patch looks good to me too.  I confirmed that the newly added
subscription test fails with HEAD.

> I think we should back-patch this but I am slightly worried that if
> someone is dependent on the view pg_publication_tables to return both
> parent and child tables for publications that have both of those
> tables and published with publish_via_partition_root as true then this
> might break his usage. But OTOH, I don't see why someone would do like
> that and she might face some problems like what we are trying to solve
> here.

Yeah, back-patching may not be such a bad idea.

Thank you.

-- 
Amit Langote
EDB: http://www.enterprisedb.com



Re: Data is copied twice when specifying both child and parent table in publication

От
Greg Nancarrow
Дата:
On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> If you updated my original description to say "(instead of just the
> individual partitions)", it would imply the same I think.
> But I don't mind if you want to explicitly state both cases to make it clear.
>

For example, something like:

For publications of partitioned tables with publish_via_partition_root
set to true, only the partitioned table (and not its partitions) is
included in the view, whereas if publish_via_partition_root is set to
false, only the individual partitions are included in the view.


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Thu, Dec 2, 2021 at 9:41 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > If you updated my original description to say "(instead of just the
> > individual partitions)", it would imply the same I think.
> > But I don't mind if you want to explicitly state both cases to make it clear.
> >
>
> For example, something like:
>
> For publications of partitioned tables with publish_via_partition_root
> set to true, only the partitioned table (and not its partitions) is
> included in the view, whereas if publish_via_partition_root is set to
> false, only the individual partitions are included in the view.
>

Yeah, that sounds good to me.

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Thursday, December 2, 2021 12:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Dec 2, 2021 at 9:41 AM Greg Nancarrow <gregn4422@gmail.com>
> wrote:
> >
> > On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow <gregn4422@gmail.com>
> wrote:
> > >
> > > If you updated my original description to say "(instead of just the
> > > individual partitions)", it would imply the same I think.
> > > But I don't mind if you want to explicitly state both cases to make it clear.
> > >
> >
> > For example, something like:
> >
> > For publications of partitioned tables with publish_via_partition_root
> > set to true, only the partitioned table (and not its partitions) is
> > included in the view, whereas if publish_via_partition_root is set to
> > false, only the individual partitions are included in the view.
> >
> 
> Yeah, that sounds good to me.

It looks good to me as well.
Attach the patches for (HEAD~13) which merge the suggested doc change. I
prepared the code patch and test patch separately to make it easier for committer 
to confirm.

Best regards,
Hou zj



Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Thursday, December 2, 2021 4:54 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> On Thursday, December 2, 2021 12:50 PM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> > On Thu, Dec 2, 2021 at 9:41 AM Greg Nancarrow <gregn4422@gmail.com>
> > wrote:
> > >
> > > On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow <gregn4422@gmail.com>
> > wrote:
> > > >
> > > > If you updated my original description to say "(instead of just
> > > > the individual partitions)", it would imply the same I think.
> > > > But I don't mind if you want to explicitly state both cases to make it clear.
> > > >
> > >
> > > For example, something like:
> > >
> > > For publications of partitioned tables with
> > > publish_via_partition_root set to true, only the partitioned table
> > > (and not its partitions) is included in the view, whereas if
> > > publish_via_partition_root is set to false, only the individual partitions are
> included in the view.
> > >
> >
> > Yeah, that sounds good to me.
> 
> It looks good to me as well.
> Attach the patches for (HEAD~13) which merge the suggested doc change. I
> prepared the code patch and test patch separately to make it easier for
> committer to confirm.

It seems we might not need to backpatch the doc change, so
attach another version which remove the doc changes from backpatch patches.

Best regards,
Hou zj


Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
vignesh C
Дата:
On Fri, Dec 3, 2021 at 11:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, December 2, 2021 4:54 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > On Thursday, December 2, 2021 12:50 PM Amit Kapila
> > <amit.kapila16@gmail.com> wrote:
> > > On Thu, Dec 2, 2021 at 9:41 AM Greg Nancarrow <gregn4422@gmail.com>
> > > wrote:
> > > >
> > > > On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow <gregn4422@gmail.com>
> > > wrote:
> > > > >
> > > > > If you updated my original description to say "(instead of just
> > > > > the individual partitions)", it would imply the same I think.
> > > > > But I don't mind if you want to explicitly state both cases to make it clear.
> > > > >
> > > >
> > > > For example, something like:
> > > >
> > > > For publications of partitioned tables with
> > > > publish_via_partition_root set to true, only the partitioned table
> > > > (and not its partitions) is included in the view, whereas if
> > > > publish_via_partition_root is set to false, only the individual partitions are
> > included in the view.
> > > >
> > >
> > > Yeah, that sounds good to me.
> >
> > It looks good to me as well.
> > Attach the patches for (HEAD~13) which merge the suggested doc change. I
> > prepared the code patch and test patch separately to make it easier for
> > committer to confirm.
>
> It seems we might not need to backpatch the doc change, so
> attach another version which remove the doc changes from backpatch patches.

Thanks for the patches, the patch applies and the test passes in head
and the back branches. one minor suggestion:
1) Shall we change:
+  <para>
+   For publications of partitioned tables with
+   <literal>publish_via_partition_root</literal> set to
+   <literal>true</literal>, only the partitioned table (and not its partitions)
+   is included in the view, whereas if
+   <literal>publish_via_partition_root</literal> is set to
+   <literal>false</literal>, only the individual partitions are included in the
+   view.
+  </para>
To:
+  <para>
+   For publications of partitioned tables with
+   <literal>publish_via_partition_root</literal> set to
+   <literal>true</literal>, only the partitioned table (and not its partitions)
+   is included in the view, whereas if
+   <literal>publish_via_partition_root</literal> is set to
+   <literal>false</literal>, only the individual partitions (and not the
+   partitioned table) are included in the
+   view.
+  </para>

2) Any particular reason why the code and tests are backbranched but
not the document changes?

Regards,
Vignesh



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Tue, Dec 7, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Dec 3, 2021 at 11:24 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
>
> 2) Any particular reason why the code and tests are backbranched but
> not the document changes?
>

I am not sure whether we need the doc change or not as this is not a
new feature and even if we need it as an improvement to docs, shall we
consider backpatching it? I felt that code changes are required to fix
a known issue so the case of backpatching it is clear.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
vignesh C
Дата:
On Wed, Dec 8, 2021 at 11:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 7, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Dec 3, 2021 at 11:24 AM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> >
> > 2) Any particular reason why the code and tests are backbranched but
> > not the document changes?
> >
>
> I am not sure whether we need the doc change or not as this is not a
> new feature and even if we need it as an improvement to docs, shall we
> consider backpatching it? I felt that code changes are required to fix
> a known issue so the case of backpatching it is clear.

Thanks for the clarification, I got your point. I'm fine either way
regarding the documentation change. The rest of the patch looks good
to me.

Regards,
Vignesh



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Dec 8, 2021 at 11:30 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 11:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 7, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Fri, Dec 3, 2021 at 11:24 AM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > >
> > > 2) Any particular reason why the code and tests are backbranched but
> > > not the document changes?
> > >
> >
> > I am not sure whether we need the doc change or not as this is not a
> > new feature and even if we need it as an improvement to docs, shall we
> > consider backpatching it? I felt that code changes are required to fix
> > a known issue so the case of backpatching it is clear.
>
> Thanks for the clarification, I got your point. I'm fine either way
> regarding the documentation change. The rest of the patch looks good
> to me.
>

Okay, I have also verified the code and test changes for all branches.
I'll wait for a day to see if anybody else has any comments and then
commit this.

-- 
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Dec 8, 2021 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 11:30 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Dec 8, 2021 at 11:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Dec 7, 2021 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Fri, Dec 3, 2021 at 11:24 AM houzj.fnst@fujitsu.com
> > > > <houzj.fnst@fujitsu.com> wrote:
> > > > >
> > > >
> > > > 2) Any particular reason why the code and tests are backbranched but
> > > > not the document changes?
> > > >
> > >
> > > I am not sure whether we need the doc change or not as this is not a
> > > new feature and even if we need it as an improvement to docs, shall we
> > > consider backpatching it? I felt that code changes are required to fix
> > > a known issue so the case of backpatching it is clear.
> >
> > Thanks for the clarification, I got your point. I'm fine either way
> > regarding the documentation change. The rest of the patch looks good
> > to me.
> >
>
> Okay, I have also verified the code and test changes for all branches.
> I'll wait for a day to see if anybody else has any comments and then
> commit this.
>

Pushed.

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
Hi,

When reviewing some logical replication related features. I noticed another
possible problem if the subscriber subscribes multiple publications which
publish parent and child table.

For example:

----pub
create table t (a int, b int, c int) partition by range (a);
create table t_1 partition of t for values from (1) to (10);

create publication pub1 for table t
  with (PUBLISH_VIA_PARTITION_ROOT);
create publication pub2 for table t_1
  with (PUBLISH_VIA_PARTITION_ROOT);

----sub
---- prepare table t and t_1
CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub1, pub2;

select * from pg_subscription_rel ;
 srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
   16391 |   16385(t) | r          | 0/150D100
   16391 |   16388(t_1) | r          | 0/150D138

If subscribe two publications one of them publish parent table with
(pubviaroot=true) and another publish child table. Both the parent table and
child table will exist in pg_subscription_rel which also means we will do
initial copy for both tables.

But after initial copy, we only publish change with the schema of the parent
table(t). It looks a bit inconsistent.

Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think the
expected behavior could be we only store the top most parent(table t) in
pg_subscription_rel and do initial copy for it if pubviaroot is on. I haven't
thought about how to fix this and will investigate this later.

Best regards,
Hou zj

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thur, Mar 10, 2021 at 10:08 AM houzj.fnst@fujitsu.com wrote:
> Hi,
> 
> When reviewing some logical replication related features. I noticed another
> possible problem if the subscriber subscribes multiple publications which
> publish parent and child table.
> 
> For example:
> 
> ----pub
> create table t (a int, b int, c int) partition by range (a);
> create table t_1 partition of t for values from (1) to (10);
> 
> create publication pub1 for table t
>   with (PUBLISH_VIA_PARTITION_ROOT);
> create publication pub2 for table t_1
>   with (PUBLISH_VIA_PARTITION_ROOT);
> 
> ----sub
> ---- prepare table t and t_1
> CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres'
> PUBLICATION pub1, pub2;
> 
> select * from pg_subscription_rel ;
>  srsubid | srrelid | srsubstate | srsublsn
> ---------+---------+------------+-----------
>    16391 |   16385(t) | r          | 0/150D100
>    16391 |   16388(t_1) | r          | 0/150D138
> 
> If subscribe two publications one of them publish parent table with
> (pubviaroot=true) and another publish child table. Both the parent table and
> child table will exist in pg_subscription_rel which also means we will do
> initial copy for both tables.
> 
> But after initial copy, we only publish change with the schema of the parent
> table(t). It looks a bit inconsistent.
> 
> Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think the
> expected behavior could be we only store the top most parent(table t) in
> pg_subscription_rel and do initial copy for it if pubviaroot is on. I haven't
> thought about how to fix this and will investigate this later.
Hi,
I try to fix this bug. Attach the patch.

The current HEAD get table list for one publication by invoking function
pg_get_publication_tables. If multiple publications are subscribed, then this
function is invoked multiple times. So option PUBLISH_VIA_PARTITION_ROOT works
independently on every publication, I think it does not work correctly on
different publications of the same subscription.

So I fix this bug by the following two steps:
First step,
I get oids of subscribed tables by publication list. Then for tables with the
same topmost root table, I filter them base on the option
PUBLISH_VIA_PARTITION_ROOT(see new function filter_partitions_oids).
After filtering, I get the final oid list.
Second step,
I get the required informations(nspname and relname) base on the oid list of
first step.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:

> -----Original Message-----
> From: Wang, Wei/王 威 <wangw.fnst@fujitsu.com>
On Thursday, April 7, 2022 11:08 AM
> 
> On Thur, Mar 10, 2021 at 10:08 AM houzj.fnst@fujitsu.com wrote:
> > Hi,
> >
> > When reviewing some logical replication related features. I noticed another
> > possible problem if the subscriber subscribes multiple publications which
> > publish parent and child table.
> >
> > For example:
> >
> > ----pub
> > create table t (a int, b int, c int) partition by range (a);
> > create table t_1 partition of t for values from (1) to (10);
> >
> > create publication pub1 for table t
> >   with (PUBLISH_VIA_PARTITION_ROOT);
> > create publication pub2 for table t_1
> >   with (PUBLISH_VIA_PARTITION_ROOT);
> >
> > ----sub
> > ---- prepare table t and t_1
> > CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres'
> > PUBLICATION pub1, pub2;
> >
> > select * from pg_subscription_rel ;
> >  srsubid | srrelid | srsubstate | srsublsn
> > ---------+---------+------------+-----------
> >    16391 |   16385(t) | r          | 0/150D100
> >    16391 |   16388(t_1) | r          | 0/150D138
> >
> > If subscribe two publications one of them publish parent table with
> > (pubviaroot=true) and another publish child table. Both the parent table and
> > child table will exist in pg_subscription_rel which also means we will do
> > initial copy for both tables.
> >
> > But after initial copy, we only publish change with the schema of the parent
> > table(t). It looks a bit inconsistent.
> >
> > Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think
> the
> > expected behavior could be we only store the top most parent(table t) in
> > pg_subscription_rel and do initial copy for it if pubviaroot is on. I haven't
> > thought about how to fix this and will investigate this later.
> Hi,
> I try to fix this bug. Attach the patch.
> 
> The current HEAD get table list for one publication by invoking function
> pg_get_publication_tables. If multiple publications are subscribed, then this
> function is invoked multiple times. So option PUBLISH_VIA_PARTITION_ROOT
> works
> independently on every publication, I think it does not work correctly on
> different publications of the same subscription.
> 
> So I fix this bug by the following two steps:
> First step,
> I get oids of subscribed tables by publication list. Then for tables with the
> same topmost root table, I filter them base on the option
> PUBLISH_VIA_PARTITION_ROOT(see new function filter_partitions_oids).
> After filtering, I get the final oid list.
> Second step,
> I get the required informations(nspname and relname) base on the oid list of
> first step.

Thanks for updating the patch.
I confirmed that the bug is fixed by this patch.

One suggestion is that can we simplify the code by moving the logic of checking
the ancestor into the SQL ?. For example, we could filter the outpout of
pg_publication_tables by adding A WHERE clause which checks whether the table
is a partition and if its ancestor is also in the output. I think we can also
filter the needless partition in this approach.

Best regards,
Hou zj

RE: Data is copied twice when specifying both child and parent table in publication

От
"shiy.fnst@fujitsu.com"
Дата:
On Tue, Apr 19, 2022 3:05 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
>
> > -----Original Message-----
> > From: Wang, Wei/王 威 <wangw.fnst@fujitsu.com>
> On Thursday, April 7, 2022 11:08 AM
> >
> > On Thur, Mar 10, 2021 at 10:08 AM houzj.fnst@fujitsu.com wrote:
> > > Hi,
> > >
> > > When reviewing some logical replication related features. I noticed another
> > > possible problem if the subscriber subscribes multiple publications which
> > > publish parent and child table.
> > >
> > > For example:
> > >
> > > ----pub
> > > create table t (a int, b int, c int) partition by range (a);
> > > create table t_1 partition of t for values from (1) to (10);
> > >
> > > create publication pub1 for table t
> > >   with (PUBLISH_VIA_PARTITION_ROOT);
> > > create publication pub2 for table t_1
> > >   with (PUBLISH_VIA_PARTITION_ROOT);
> > >
> > > ----sub
> > > ---- prepare table t and t_1
> > > CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres'
> > > PUBLICATION pub1, pub2;
> > >
> > > select * from pg_subscription_rel ;
> > >  srsubid | srrelid | srsubstate | srsublsn
> > > ---------+---------+------------+-----------
> > >    16391 |   16385(t) | r          | 0/150D100
> > >    16391 |   16388(t_1) | r          | 0/150D138
> > >
> > > If subscribe two publications one of them publish parent table with
> > > (pubviaroot=true) and another publish child table. Both the parent table and
> > > child table will exist in pg_subscription_rel which also means we will do
> > > initial copy for both tables.
> > >
> > > But after initial copy, we only publish change with the schema of the parent
> > > table(t). It looks a bit inconsistent.
> > >
> > > Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think
> > the
> > > expected behavior could be we only store the top most parent(table t) in
> > > pg_subscription_rel and do initial copy for it if pubviaroot is on. I haven't
> > > thought about how to fix this and will investigate this later.
> > Hi,
> > I try to fix this bug. Attach the patch.
> >
> > The current HEAD get table list for one publication by invoking function
> > pg_get_publication_tables. If multiple publications are subscribed, then this
> > function is invoked multiple times. So option PUBLISH_VIA_PARTITION_ROOT
> > works
> > independently on every publication, I think it does not work correctly on
> > different publications of the same subscription.
> >
> > So I fix this bug by the following two steps:
> > First step,
> > I get oids of subscribed tables by publication list. Then for tables with the
> > same topmost root table, I filter them base on the option
> > PUBLISH_VIA_PARTITION_ROOT(see new function filter_partitions_oids).
> > After filtering, I get the final oid list.
> > Second step,
> > I get the required informations(nspname and relname) base on the oid list of
> > first step.
> 
> Thanks for updating the patch.
> I confirmed that the bug is fixed by this patch.
> 
> One suggestion is that can we simplify the code by moving the logic of checking
> the ancestor into the SQL ?. For example, we could filter the outpout of
> pg_publication_tables by adding A WHERE clause which checks whether the table
> is a partition and if its ancestor is also in the output. I think we can also
> filter the needless partition in this approach.
> 

I agreed with you and I tried to fix this problem in a simpler way. What we want
is to exclude the partitioned table whose ancestor is also need to be
replicated, so how about implementing that by using the following SQL when
getting the table list from publisher?

SELECT DISTINCT ns.nspname, c.relname
FROM pg_catalog.pg_publication_tables t
JOIN pg_catalog.pg_namespace ns ON ns.nspname = t.schemaname
JOIN pg_catalog.pg_class c ON c.relname = t.tablename AND c.relnamespace = ns.oid
WHERE t.pubname IN ('p0','p2')
AND (c.relispartition IS FALSE OR NOT EXISTS (SELECT 1 FROM pg_partition_ancestors(c.oid)
WHERE relid IN ( SELECT DISTINCT (schemaname||'.'||tablename)::regclass::oid
FROM pg_catalog.pg_publication_tables t
WHERE t.pubname IN ('p0','p2') ) AND relid != c.oid));

Please find the attached patch which used this approach, I also merged the test
in Wang's patch into it.

Regards,
Shi yu

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tue, Apr 19, 2022 4:53 PM Shi, Yu/侍 雨 <shiy.fnst@cn.fujitsu.com> wrote:
> On Tue, Apr 19, 2022 3:05 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com>
> wrote:
> >
> > > -----Original Message-----
> > > From: Wang, Wei/王 威 <wangw.fnst@fujitsu.com>
> > On Thursday, April 7, 2022 11:08 AM
> > >
> > > On Thur, Mar 10, 2021 at 10:08 AM houzj.fnst@fujitsu.com wrote:
> > > > Hi,
> > > >
> > > > When reviewing some logical replication related features. I noticed
> another
> > > > possible problem if the subscriber subscribes multiple publications which
> > > > publish parent and child table.
> > > >
> > > > For example:
> > > >
> > > > ----pub
> > > > create table t (a int, b int, c int) partition by range (a);
> > > > create table t_1 partition of t for values from (1) to (10);
> > > >
> > > > create publication pub1 for table t
> > > >   with (PUBLISH_VIA_PARTITION_ROOT);
> > > > create publication pub2 for table t_1
> > > >   with (PUBLISH_VIA_PARTITION_ROOT);
> > > >
> > > > ----sub
> > > > ---- prepare table t and t_1
> > > > CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres'
> > > > PUBLICATION pub1, pub2;
> > > >
> > > > select * from pg_subscription_rel ;
> > > >  srsubid | srrelid | srsubstate | srsublsn
> > > > ---------+---------+------------+-----------
> > > >    16391 |   16385(t) | r          | 0/150D100
> > > >    16391 |   16388(t_1) | r          | 0/150D138
> > > >
> > > > If subscribe two publications one of them publish parent table with
> > > > (pubviaroot=true) and another publish child table. Both the parent table
> and
> > > > child table will exist in pg_subscription_rel which also means we will do
> > > > initial copy for both tables.
> > > >
> > > > But after initial copy, we only publish change with the schema of the
> parent
> > > > table(t). It looks a bit inconsistent.
> > > >
> > > > Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think
> > > the
> > > > expected behavior could be we only store the top most parent(table t) in
> > > > pg_subscription_rel and do initial copy for it if pubviaroot is on. I haven't
> > > > thought about how to fix this and will investigate this later.
> > > Hi,
> > > I try to fix this bug. Attach the patch.
> > >
> > > The current HEAD get table list for one publication by invoking function
> > > pg_get_publication_tables. If multiple publications are subscribed, then this
> > > function is invoked multiple times. So option
> PUBLISH_VIA_PARTITION_ROOT
> > > works
> > > independently on every publication, I think it does not work correctly on
> > > different publications of the same subscription.
> > >
> > > So I fix this bug by the following two steps:
> > > First step,
> > > I get oids of subscribed tables by publication list. Then for tables with the
> > > same topmost root table, I filter them base on the option
> > > PUBLISH_VIA_PARTITION_ROOT(see new function filter_partitions_oids).
> > > After filtering, I get the final oid list.
> > > Second step,
> > > I get the required informations(nspname and relname) base on the oid list
> of
> > > first step.
> >
> > Thanks for updating the patch.
> > I confirmed that the bug is fixed by this patch.
> >
> > One suggestion is that can we simplify the code by moving the logic of
> checking
> > the ancestor into the SQL ?. For example, we could filter the outpout of
> > pg_publication_tables by adding A WHERE clause which checks whether the
> table
> > is a partition and if its ancestor is also in the output. I think we can also
> > filter the needless partition in this approach.
> >
> 
> I agreed with you and I tried to fix this problem in a simpler way. What we want
> is to exclude the partitioned table whose ancestor is also need to be
> replicated, so how about implementing that by using the following SQL when
> getting the table list from publisher?
> 
> SELECT DISTINCT ns.nspname, c.relname
> FROM pg_catalog.pg_publication_tables t
> JOIN pg_catalog.pg_namespace ns ON ns.nspname = t.schemaname
> JOIN pg_catalog.pg_class c ON c.relname = t.tablename AND c.relnamespace =
> ns.oid
> WHERE t.pubname IN ('p0','p2')
> AND (c.relispartition IS FALSE OR NOT EXISTS (SELECT 1 FROM
> pg_partition_ancestors(c.oid)
> WHERE relid IN ( SELECT DISTINCT (schemaname||'.'||tablename)::regclass::oid
> FROM pg_catalog.pg_publication_tables t
> WHERE t.pubname IN ('p0','p2') ) AND relid != c.oid));
> 
> Please find the attached patch which used this approach, I also merged the test
> in Wang's patch into it.
Thanks for your review and patch.

I think the approach of v2 is better than v1. It does not increase the query.
Only move the test cases from 100_bugs.pl to 013_partition.pl and simplify it.
Attach the new patch.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Tue, Apr 19, 2022 at 2:23 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Tue, Apr 19, 2022 3:05 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > One suggestion is that can we simplify the code by moving the logic of checking
> > the ancestor into the SQL ?. For example, we could filter the outpout of
> > pg_publication_tables by adding A WHERE clause which checks whether the table
> > is a partition and if its ancestor is also in the output. I think we can also
> > filter the needless partition in this approach.
> >
>
> I agreed with you and I tried to fix this problem in a simpler way. What we want
> is to exclude the partitioned table whose ancestor is also need to be
> replicated, so how about implementing that by using the following SQL when
> getting the table list from publisher?
>
> SELECT DISTINCT ns.nspname, c.relname
> FROM pg_catalog.pg_publication_tables t
> JOIN pg_catalog.pg_namespace ns ON ns.nspname = t.schemaname
> JOIN pg_catalog.pg_class c ON c.relname = t.tablename AND c.relnamespace = ns.oid
> WHERE t.pubname IN ('p0','p2')
> AND (c.relispartition IS FALSE OR NOT EXISTS (SELECT 1 FROM pg_partition_ancestors(c.oid)
> WHERE relid IN ( SELECT DISTINCT (schemaname||'.'||tablename)::regclass::oid
> FROM pg_catalog.pg_publication_tables t
> WHERE t.pubname IN ('p0','p2') ) AND relid != c.oid));
>
> Please find the attached patch which used this approach, I also merged the test
> in Wang's patch into it.
>

I think this will work but do we need "... relid != c.oid" at the end
of the query? If so, why? Please use an alias for
pg_partition_ancestors to make the statement understandable.

Now, this solution will work but I find this query a bit complex and
will add some overhead as we are calling pg_publication_tables
multiple times. So, I was wondering if we can have a new function
pg_get_publication_tables which takes multiple publications as input
and return the list of qualified tables? I think for back branches we
need something on the lines of what you have proposed but for HEAD we
can have a better solution.

IIRC, the column list and row filter also have some issues exactly due
to this reason, so, I would like those cases to be also mentioned here
and probably include the tests for them in the patch for HEAD.

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thur, Apr 21, 2022 at 5:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
Thanks for your comments.

> On Tue, Apr 19, 2022 at 2:23 PM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com>
> wrote:
> >
> > On Tue, Apr 19, 2022 3:05 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > >
> > > One suggestion is that can we simplify the code by moving the logic
> > > of checking the ancestor into the SQL ?. For example, we could
> > > filter the outpout of pg_publication_tables by adding A WHERE clause
> > > which checks whether the table is a partition and if its ancestor is
> > > also in the output. I think we can also filter the needless partition in this
> approach.
> > >
> >
> > I agreed with you and I tried to fix this problem in a simpler way.
> > What we want is to exclude the partitioned table whose ancestor is
> > also need to be replicated, so how about implementing that by using
> > the following SQL when getting the table list from publisher?
> >
> > SELECT DISTINCT ns.nspname, c.relname
> > FROM pg_catalog.pg_publication_tables t JOIN pg_catalog.pg_namespace
> > ns ON ns.nspname = t.schemaname JOIN pg_catalog.pg_class c ON
> > c.relname = t.tablename AND c.relnamespace = ns.oid WHERE t.pubname IN
> > ('p0','p2') AND (c.relispartition IS FALSE OR NOT EXISTS (SELECT 1
> > FROM pg_partition_ancestors(c.oid) WHERE relid IN ( SELECT DISTINCT
> > (schemaname||'.'||tablename)::regclass::oid
> > FROM pg_catalog.pg_publication_tables t WHERE t.pubname IN ('p0','p2')
> > ) AND relid != c.oid));
> >
> > Please find the attached patch which used this approach, I also merged
> > the test in Wang's patch into it.
> >
> 
> I think this will work but do we need "... relid != c.oid" at the end of the query? If
> so, why? Please use an alias for pg_partition_ancestors to make the statement
> understandable.
I think we need this (relid != c.oid). Because when we use function
pg_partition_ancestors(c.oid), its return value not only has ancestors, but
also the input table. That is to say, when we use the table (c.oid) of the
outer query to filter in the sub-query, the table of the outer query will also
appear in the result of the sub-query.
So, I think we need this condition to prevent filtering out itself.

> Now, this solution will work but I find this query a bit complex and will add some
> overhead as we are calling pg_publication_tables multiple times. So, I was
> wondering if we can have a new function pg_get_publication_tables which
> takes multiple publications as input and return the list of qualified tables? I think
> for back branches we need something on the lines of what you have proposed
> but for HEAD we can have a better solution.
Yes, it sounds reasonable to me. Now, to fix this bug:
In the patch for HEAD, add a new function pg_get_publications_tables to get
tables info from a publications array.
In the patch for back-branch (now just share the patch for REL14), modify the
SQL to get tables info.

> IIRC, the column list and row filter also have some issues exactly due to this
> reason, so, I would like those cases to be also mentioned here and probably
> include the tests for them in the patch for HEAD.
Improve the test case about the column list and row filter to cover this bug.

Attach the new patches.[suggestions by Amit-San]
The patch for HEAD:
1. Add a new function to get tables info by a publications array.
The patch for REL14:
1. Use an alias to make the statement understandable. BTW, I adjusted the alignment.
2. Improve the test cast about the column list and row filter to cover this bug.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Sun, Apr 24, 2022 at 2:16 PM I wrote:
> On Thur, Apr 21, 2022 at 5:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > IIRC, the column list and row filter also have some issues exactly due to this
> > reason, so, I would like those cases to be also mentioned here and probably
> > include the tests for them in the patch for HEAD.
> Improve the test case about the column list and row filter to cover this bug.
Sorry, I forgot to explain why I modify the tests for row filter and column
filter. If we specify different filters on the parent and child table
respectively, this bug will make us use the wrong filter.

Like the following cases:
[row filter]
- environment in publisher-side.
create table t (a int) partition by range (a);
create table t_1 partition of t default;
create publication pub1 for table t where (a<=10) with (PUBLISH_VIA_PARTITION_ROOT=true);
create publication pub2 for table t_1 where (a>10) with (PUBLISH_VIA_PARTITION_ROOT=true);
insert into t values (9),(11);

- environment in subscriber-side.
create table t (a int) partition by range (a);
create table t_1 partition of t default;
create subscription sub connection 'dbname=postgres user=postgres' publication pub1,pub2;

When we execute the following SQL in subscriber-side, what we expect should be:
select * from t;
 a
---
 9
(1 row)

but the HEAD is:
 a
----
  9
 11
(2 rows)

[column filter]
- environment in publisher-side.
create table t (a int primary key, b int, c int) partition by range (a);
create table t_1 partition of t default;
create publication pub1 for table t(a, b) with (PUBLISH_VIA_PARTITION_ROOT=true);
create publication pub2 for table t_1(a, c) with (PUBLISH_VIA_PARTITION_ROOT=true);
insert into t values (1,1,1);

- environment in subscriber-side.
create table t (a int, b int, c int) partition by range (a);
create table t_1 partition of t default;
create subscription sub connection 'dbname=postgres user=postgres' publication pub1,pub2;

When we execute the following SQL in subscriber-side, what we expect should be:
select * from t;
 a | b | c
---+---+---
 1 | 1 |
(1 row)

but the HEAD is:
 a | b | c
---+---+---
 1 | 1 |
 1 |   | 1
(2 rows)

Regards,
Wang wei

RE: Data is copied twice when specifying both child and parent table in publication

От
"shiy.fnst@fujitsu.com"
Дата:
On Sun, Apr 24, 2022 2:16 PM Wang, Wei/王 威 <wangw.fnst@fujitsu.com> wrote:
> 
> Attach the new patches.[suggestions by Amit-San]
> The patch for HEAD:
> 1. Add a new function to get tables info by a publications array.
> The patch for REL14:
> 1. Use an alias to make the statement understandable. BTW, I adjusted the
> alignment.
> 2. Improve the test cast about the column list and row filter to cover this bug.
> 

Thanks for your patches.

Here's a comment on the patch for REL14.

+    appendStringInfo(&cmd, "SELECT DISTINCT ns.nspname, c.relname\n"
+                     " FROM pg_catalog.pg_publication_tables t\n"
+                     "      JOIN pg_catalog.pg_namespace ns\n"
+                     "         ON ns.nspname = t.schemaname\n"
+                     "      JOIN pg_catalog.pg_class c\n"
+                     "         ON c.relname = t.tablename AND c.relnamespace = ns.oid\n"
+                     " WHERE t.pubname IN (%s)\n"
+                     " AND (c.relispartition IS FALSE\n"
+                     "      OR NOT EXISTS\n"
+                     "        ( SELECT 1 FROM pg_partition_ancestors(c.oid) as relid\n"
+                     "          WHERE relid IN\n"
+                     "            (SELECT DISTINCT (schemaname || '.' || tablename)::regclass::oid\n"
+                     "             FROM pg_catalog.pg_publication_tables t\n"
+                     "             WHERE t.pubname IN (%s))\n"
+                     "          AND relid != c.oid))\n",
+                     pub_names.data, pub_names.data);

I think we can use an alias like 'pa' for pg_partition_ancestors, and modify the SQL as follows. 

+    appendStringInfo(&cmd, "SELECT DISTINCT ns.nspname, c.relname\n"
+                     " FROM pg_catalog.pg_publication_tables t\n"
+                     "      JOIN pg_catalog.pg_namespace ns\n"
+                     "         ON ns.nspname = t.schemaname\n"
+                     "      JOIN pg_catalog.pg_class c\n"
+                     "         ON c.relname = t.tablename AND c.relnamespace = ns.oid\n"
+                     " WHERE t.pubname IN (%s)\n"
+                     " AND (c.relispartition IS FALSE\n"
+                     "      OR NOT EXISTS\n"
+                     "        ( SELECT 1 FROM pg_partition_ancestors(c.oid) pa\n"
+                     "          WHERE pa.relid IN\n"
+                     "            (SELECT DISTINCT (t.schemaname || '.' || t.tablename)::regclass::oid\n"
+                     "             FROM pg_catalog.pg_publication_tables t\n"
+                     "             WHERE t.pubname IN (%s))\n"
+                     "          AND pa.relid != c.oid))\n",
+                     pub_names.data, pub_names.data);

Regards,
Shi yu


RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tue, Apr 28, 2022 9:22 AM Shi, Yu/侍 雨 <shiy.fnst@cn.fujitsu.com> wrote:
> Thanks for your patches.
> 
> Here's a comment on the patch for REL14.
Thanks for your comments.

> +    appendStringInfo(&cmd, "SELECT DISTINCT ns.nspname, c.relname\n"
> +                     " FROM
> pg_catalog.pg_publication_tables t\n"
> +                     "      JOIN pg_catalog.pg_namespace
> ns\n"
> +                     "         ON ns.nspname =
> t.schemaname\n"
> +                     "      JOIN pg_catalog.pg_class c\n"
> +                     "         ON c.relname = t.tablename AND
> c.relnamespace = ns.oid\n"
> +                     " WHERE t.pubname IN (%s)\n"
> +                     " AND (c.relispartition IS FALSE\n"
> +                     "      OR NOT EXISTS\n"
> +                     "        ( SELECT 1 FROM
> pg_partition_ancestors(c.oid) as relid\n"
> +                     "          WHERE relid IN\n"
> +                     "            (SELECT DISTINCT (schemaname
> || '.' || tablename)::regclass::oid\n"
> +                     "             FROM
> pg_catalog.pg_publication_tables t\n"
> +                     "             WHERE t.pubname IN (%s))\n"
> +                     "          AND relid != c.oid))\n",
> +                     pub_names.data, pub_names.data);
> 
> I think we can use an alias like 'pa' for pg_partition_ancestors, and modify the
> SQL as follows.
> 
> +    appendStringInfo(&cmd, "SELECT DISTINCT ns.nspname, c.relname\n"
> +                     " FROM
> pg_catalog.pg_publication_tables t\n"
> +                     "      JOIN pg_catalog.pg_namespace
> ns\n"
> +                     "         ON ns.nspname =
> t.schemaname\n"
> +                     "      JOIN pg_catalog.pg_class c\n"
> +                     "         ON c.relname = t.tablename AND
> c.relnamespace = ns.oid\n"
> +                     " WHERE t.pubname IN (%s)\n"
> +                     " AND (c.relispartition IS FALSE\n"
> +                     "      OR NOT EXISTS\n"
> +                     "        ( SELECT 1 FROM
> pg_partition_ancestors(c.oid) pa\n"
> +                     "          WHERE pa.relid IN\n"
> +                     "            (SELECT DISTINCT
> (t.schemaname || '.' || t.tablename)::regclass::oid\n"
> +                     "             FROM
> pg_catalog.pg_publication_tables t\n"
> +                     "             WHERE t.pubname IN (%s))\n"
> +                     "          AND pa.relid != c.oid))\n",
> +                     pub_names.data, pub_names.data);
Fix it.

In addition, I try to modify the approach for the HEAD.
I enhance the API of function pg_get_publication_tables. Change the parameter
type from 'text' to 'any'. Then we can use this function to get tables from one
publication or an array of publications. Any thoughts on this approach?

Attach new patches.
The patch for HEAD:
1. Modify the approach. Enhance the API of function pg_get_publication_tables to
handle one publication or an array of publications.
The patch for REL14:
1. Improve the table sync SQL. [suggestions by Shi yu]

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"osumi.takamichi@fujitsu.com"
Дата:
On Monday, May 9, 2022 10:51 AM wangw.fnst@fujitsu.com <wangw.fnst@fujitsu.com> wrote:
> Attach new patches.
> The patch for HEAD:
> 1. Modify the approach. Enhance the API of function
> pg_get_publication_tables to handle one publication or an array of
> publications.
> The patch for REL14:
> 1. Improve the table sync SQL. [suggestions by Shi yu]
Hi, thank you for updating the patch !

Minor comments on your patch for HEAD v2.

(1) commit message sentence

I suggest below sentence.

Kindly change from
"... when subscribing to both publications using one subscription, the data is replicated
twice in inital copy"
to "subscribing to both publications from one subscription causes initial copy twice".

(2) unused variable

pg_publication.c: In function ‘pg_get_publication_tables’:
pg_publication.c:1091:11: warning: unused variable ‘pubname’ [-Wunused-variable]
  char    *pubname;

We can remove this.

(3) free of allocated memory

In the pg_get_publication_tables(),
we don't free 'elems'. Don't we need it ?

(4) some coding alignments

4-1.

+       List       *tables_viaroot = NIL,
...
+                          *current_table = NIL;

I suggest we can put some variables
into the condition for the first time call of this function,
like tables_viaroot and current_table.
When you agree, kindly change it.

4-2.

+                       /*
+                        * Publications support partitioned tables, although all changes
+                        * are replicated using leaf partition identity and schema, so we
+                        * only need those.
+                        */
+                       if (publication->alltables)
+                       {
+                               current_table = GetAllTablesPublicationRelations(publication->pubviaroot);
+                       }

This is not related to the change itself and now
we are inheriting the previous curly brackets, but
I think there's no harm in removing it, since it's only for one statement.


Best Regards,
    Takamichi Osumi


RE: Data is copied twice when specifying both child and parent table in publication

От
"osumi.takamichi@fujitsu.com"
Дата:
On Wednesday, May 11, 2022 11:33 AM I wrote:
> On Monday, May 9, 2022 10:51 AM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> > Attach new patches.
> > The patch for HEAD:
> > 1. Modify the approach. Enhance the API of function
> > pg_get_publication_tables to handle one publication or an array of
> > publications.
> > The patch for REL14:
> > 1. Improve the table sync SQL. [suggestions by Shi yu]
> Hi, thank you for updating the patch !
> 
> Minor comments on your patch for HEAD v2.
> 
> (1) commit message sentence
> 
> I suggest below sentence.
> 
> Kindly change from
> "... when subscribing to both publications using one subscription, the data is
> replicated twice in inital copy"
> to "subscribing to both publications from one subscription causes initial copy
> twice".
> 
> (2) unused variable
> 
> pg_publication.c: In function ‘pg_get_publication_tables’:
> pg_publication.c:1091:11: warning: unused variable ‘pubname’
> [-Wunused-variable]
>   char    *pubname;
> 
> We can remove this.
> 
> (3) free of allocated memory
> 
> In the pg_get_publication_tables(),
> we don't free 'elems'. Don't we need it ?
> 
> (4) some coding alignments
> 
> 4-1.
> 
> +       List       *tables_viaroot = NIL,
> ...
> +                          *current_table = NIL;
> 
> I suggest we can put some variables
> into the condition for the first time call of this function, like tables_viaroot and
> current_table.
> When you agree, kindly change it.
> 
> 4-2.
> 
> +                       /*
> +                        * Publications support partitioned tables, although
> all changes
> +                        * are replicated using leaf partition identity and
> schema, so we
> +                        * only need those.
> +                        */
> +                       if (publication->alltables)
> +                       {
> +                               current_table =
> GetAllTablesPublicationRelations(publication->pubviaroot);
> +                       }
> 
> This is not related to the change itself and now we are inheriting the previous
> curly brackets, but I think there's no harm in removing it, since it's only for one
> statement.
Hi, 

One more thing I'd like to add is that
we don't hit the below code by tests.
In the HEAD v2, we add a new filtering logic in pg_get_publication_tables.
Although I'm not sure if this is related to the bug fix itself,
when we want to include it in this patch, then
I feel it's better to add some simple test for this part,
to cover all the new main paths and check if
new logic works correctly.


+               /*
+                * If a partition table is published in a publication with viaroot,
+                * and its parent or child table is published in another publication
+                * without viaroot. Then we need to move the parent or child table
+                * from tables to tables_viaroot.
+                *
+                * If all publication(s)'s viaroot are the same, then skip this part.
+                */

....
                                       if (ancestor_viaroot == ancestor)
+                                       {
+                                               tables = foreach_delete_current(tables, lc2);
+                                               change_tables = list_append_unique_oid(change_tables,
+
  relid);
 
+                                       }


Best Regards,
    Takamichi Osumi


RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thur, May 12, 2022 9:48 AM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:
> On Wednesday, May 11, 2022 11:33 AM I wrote:
> > On Monday, May 9, 2022 10:51 AM wangw.fnst@fujitsu.com
> > <wangw.fnst@fujitsu.com> wrote:
> > > Attach new patches.
> > > The patch for HEAD:
> > > 1. Modify the approach. Enhance the API of function
> > > pg_get_publication_tables to handle one publication or an array of
> > > publications.
> > > The patch for REL14:
> > > 1. Improve the table sync SQL. [suggestions by Shi yu]
> > Hi, thank you for updating the patch !
> >
> > Minor comments on your patch for HEAD v2.
Thanks for your comments.

> > (1) commit message sentence
> >
> > I suggest below sentence.
> >
> > Kindly change from
> > "... when subscribing to both publications using one subscription, the data is
> > replicated twice in inital copy"
> > to "subscribing to both publications from one subscription causes initial copy
> > twice".
Improve it according to your suggestion.

> > (2) unused variable
> >
> > pg_publication.c: In function ‘pg_get_publication_tables’:
> > pg_publication.c:1091:11: warning: unused variable ‘pubname’
> > [-Wunused-variable]
> >   char    *pubname;
> >
> > We can remove this.
Fix it.

> > (3) free of allocated memory
> >
> > In the pg_get_publication_tables(),
> > we don't free 'elems'. Don't we need it ?
Improve it according to your suggestion. Free 'elems'.

> > (4) some coding alignments
> >
> > 4-1.
> >
> > +       List       *tables_viaroot = NIL,
> > ...
> > +                          *current_table = NIL;
> >
> > I suggest we can put some variables
> > into the condition for the first time call of this function, like tables_viaroot and
> > current_table.
> > When you agree, kindly change it.
Improve these according to your suggestions.
Also, I put the code for getting publication(s) into the condition for the
first time call of this function.

> > 4-2.
> >
> > +                       /*
> > +                        * Publications support partitioned tables, although
> > all changes
> > +                        * are replicated using leaf partition identity and
> > schema, so we
> > +                        * only need those.
> > +                        */
> > +                       if (publication->alltables)
> > +                       {
> > +                               current_table =
> > GetAllTablesPublicationRelations(publication->pubviaroot);
> > +                       }
> >
> > This is not related to the change itself and now we are inheriting the previous
> > curly brackets, but I think there's no harm in removing it, since it's only for one
> > statement.
Improve these according to your suggestions.

> Hi,
> 
> One more thing I'd like to add is that
> we don't hit the below code by tests.
> In the HEAD v2, we add a new filtering logic in pg_get_publication_tables.
> Although I'm not sure if this is related to the bug fix itself,
> when we want to include it in this patch, then
> I feel it's better to add some simple test for this part,
> to cover all the new main paths and check if
> new logic works correctly.
> 
> 
> +               /*
> +                * If a partition table is published in a publication with viaroot,
> +                * and its parent or child table is published in another publication
> +                * without viaroot. Then we need to move the parent or child table
> +                * from tables to tables_viaroot.
> +                *
> +                * If all publication(s)'s viaroot are the same, then skip this part.
> +                */
> 
> ....
>                                        if (ancestor_viaroot == ancestor)
> +                                       {
> +                                               tables = foreach_delete_current(tables, lc2);
> +                                               change_tables =
> list_append_unique_oid(change_tables,
> +
    relid);
 
> +                                       }
Yes, I agree.
But when I was adding the test, I found we could improve this part.
So, I removed this part of the code.

Also rebase it because the change in HEAD(23e7b38).

Attach the patches.(Only changed the patch for HEAD.).
1. Improve the commit message. [suggestions by Osumi-san]
2. Improve coding alignments and the usage for SRFs. [suggestions by Osumi-san and I]
3. Simplify the modifications in function pg_get_publication_tables.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Fri, May 13, 2022 at 7:32 AM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> Attach the patches.(Only changed the patch for HEAD.).
>

Few comments:
=============
1.
@@ -1135,6 +1172,15 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
  if (publication->pubviaroot)
  tables = filter_partitions(tables);
  }
+ pfree(elems);
+
+ /*
+ * We need an additional filter for this case : A partition table is
+ * published in a publication with viaroot, and its parent or child
+ * table is published in another publication without viaroot. In this
+ * case, we should publish only parent table.
+ */
+ tables = filter_partitions(tables);

Do we need to filter partitions twice? Can't we check if any of the
publications has 'pubviaroot' option set, if so, call
filter_partitions at the end?

2.  " FROM pg_class c JOIN pg_namespace n"
+ "   ON n.oid = c.relnamespace,"
+ " LATERAL pg_get_publication_tables(array[ %s ]) gst"

Here, it is better to have an alias name as gpt.

3.
  }
+ pfree(elems);
+

An extra line between these two lines makes it looks slightly better.

4. Not able to apply patch cleanly.
patching file src/test/subscription/t/013_partition.pl
Hunk #1 FAILED at 477.
Hunk #2 FAILED at 556.
Hunk #3 FAILED at 584.
3 out of 3 hunks FAILED -- saving rejects to file
src/test/subscription/t/013_partition.pl.rej
patching file src/test/subscription/t/028_row_filter.pl
Hunk #1 succeeded at 394 (offset 1 line).
Hunk #2 FAILED at 722.
1 out of 2 hunks FAILED -- saving rejects to file
src/test/subscription/t/028_row_filter.pl.rej
patching file src/test/subscription/t/031_column_list.pl
Hunk #1 succeeded at 948 (offset -92 lines).
Hunk #2 FAILED at 1050.
1 out of 2 hunks FAILED -- saving rejects to file
src/test/subscription/t/031_column_list.pl.rej

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, May 13, 2022 1:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, May 13, 2022 at 7:32 AM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > Attach the patches.(Only changed the patch for HEAD.).
> >
> 
> Few comments:
> =============
Thanks for your comments.

> 1.
> @@ -1135,6 +1172,15 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
>   if (publication->pubviaroot)
>   tables = filter_partitions(tables);
>   }
> + pfree(elems);
> +
> + /*
> + * We need an additional filter for this case : A partition table is
> + * published in a publication with viaroot, and its parent or child
> + * table is published in another publication without viaroot. In this
> + * case, we should publish only parent table.
> + */
> + tables = filter_partitions(tables);
> 
> Do we need to filter partitions twice? Can't we check if any of the publications
> has 'pubviaroot' option set, if so, call filter_partitions at the end?
Improve it according to your suggestion.

> 2.  " FROM pg_class c JOIN pg_namespace n"
> + "   ON n.oid = c.relnamespace,"
> + " LATERAL pg_get_publication_tables(array[ %s ]) gst"
> 
> Here, it is better to have an alias name as gpt.
Improve it according to your suggestion.

> 3.
>   }
> + pfree(elems);
> +
> 
> An extra line between these two lines makes it looks slightly better.
Improve it according to your suggestion.

> 4. Not able to apply patch cleanly.
> patching file src/test/subscription/t/013_partition.pl
> Hunk #1 FAILED at 477.
> Hunk #2 FAILED at 556.
> Hunk #3 FAILED at 584.
> 3 out of 3 hunks FAILED -- saving rejects to file
> src/test/subscription/t/013_partition.pl.rej
> patching file src/test/subscription/t/028_row_filter.pl
> Hunk #1 succeeded at 394 (offset 1 line).
> Hunk #2 FAILED at 722.
> 1 out of 2 hunks FAILED -- saving rejects to file
> src/test/subscription/t/028_row_filter.pl.rej
> patching file src/test/subscription/t/031_column_list.pl
> Hunk #1 succeeded at 948 (offset -92 lines).
> Hunk #2 FAILED at 1050.
> 1 out of 2 hunks FAILED -- saving rejects to file
> src/test/subscription/t/031_column_list.pl.rej
New patch could apply patch cleanly now.

Attach the patches.(Only changed the patch for HEAD.).
1. Optimize the code. Reduce calls to function filter_partitions. [suggestions by Amit-san]
2. Improve the alias name in SQL. [suggestions by Amit-san]
3. Improve coding alignments. [suggestions by Amit-san] 
4. Do some optimizations for list Concatenate.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"osumi.takamichi@fujitsu.com"
Дата:
On Friday, May 13, 2022 6:42 PM Wang, Wei/王 威 <wangw.fnst@fujitsu.com> wrote:
> Attach the patches.(Only changed the patch for HEAD.).
> 1. Optimize the code. Reduce calls to function filter_partitions. [suggestions by
> Amit-san] 2. Improve the alias name in SQL. [suggestions by Amit-san] 3.
> Improve coding alignments. [suggestions by Amit-san] 4. Do some
> optimizations for list Concatenate.
Hi, thank you for updating the patch.


I have one minor comment on fetch_table_list() in HEAD v4.


@@ -1759,17 +1759,22 @@ static List *
 fetch_table_list(WalReceiverConn *wrconn, List *publications)
 {
        WalRcvExecResult *res;
-       StringInfoData cmd;
+       StringInfoData cmd,
+                               pub_names;
        TupleTableSlot *slot;
        Oid                     tableRow[2] = {TEXTOID, TEXTOID};
        List       *tablelist = NIL;

+       initStringInfo(&pub_names);
+       get_publications_str(publications, &pub_names, true);
+

Kindly free the pub_names's data along with the cmd.data.


Best Regards,
    Takamichi Osumi


Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Fri, May 13, 2022 at 3:11 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> Attach the patches.(Only changed the patch for HEAD.).
>

 # publications
-{ oid => '6119', descr => 'get OIDs of tables in a publication',
+{ oid => '6119', descr => 'get OIDs of tables in one or more publications',
   proname => 'pg_get_publication_tables', prorows => '1000', proretset => 't',
-  provolatile => 's', prorettype => 'oid', proargtypes => 'text',
-  proallargtypes => '{text,oid}', proargmodes => '{i,o}',
+  provolatile => 's', prorettype => 'oid', proargtypes => 'any',
+  proallargtypes => '{any,oid}', proargmodes => '{i,o}',

Won't our use case (input one or more publication names) requires the
parameter type to be 'VARIADIC text[]' instead of 'any'? I might be
missing something here so please let me know your reason to change the
type to 'any' from 'text'?

-- 
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tues, May 17, 2022 9:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, May 13, 2022 at 3:11 PM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > Attach the patches.(Only changed the patch for HEAD.).
> >
Thanks for your comments.

>  # publications
> -{ oid => '6119', descr => 'get OIDs of tables in a publication',
> +{ oid => '6119', descr => 'get OIDs of tables in one or more publications',
>    proname => 'pg_get_publication_tables', prorows => '1000', proretset => 't',
> -  provolatile => 's', prorettype => 'oid', proargtypes => 'text',
> -  proallargtypes => '{text,oid}', proargmodes => '{i,o}',
> +  provolatile => 's', prorettype => 'oid', proargtypes => 'any',
> +  proallargtypes => '{any,oid}', proargmodes => '{i,o}',
> 
> Won't our use case (input one or more publication names) requires the
> parameter type to be 'VARIADIC text[]' instead of 'any'? I might be
> missing something here so please let me know your reason to change the
> type to 'any' from 'text'?
Yes, you are right. I improve the approach according to your suggestion.
I didn't notice the field "provariadic" in pg_proc before. And now I found we
could change the type of input from text to variadic text by specifying fields
"provariadic" and specifying 'v' in "proargmodes".
I also make corresponding changes to the processing of the input in function
pg_get_publication_tables.

BTW, in previous patch HEAD_v4, when invoke function GetPublicationByName in
function pg_get_publication_tables, I changed the second input from "false" to
"true". I changed this because when we invoke function
pg_get_publication_tables in the query in function fetch_table_list, if the
publication does not exist, it will error.
But this change will affect the compatibility of function
pg_get_publication_tables. So I revert this change in HEAD_v5, and filter the
publications that do not exist by the query in function fetch_table_list.

Attach the patches.(Only changed the patch for HEAD.)
1. Improve the approach to modify the input type of the function
   pg_get_publication_tables. [suggestions by Amit-san]
2. Free allocated memory in function fetch_table_list. [suggestions by Osumi-san]
3. Improve the approach of the handling of non-existing publications.

BTW, I rename the patch for REL14
from
"REL14_v4-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch"
to
"REL14_v5-0001-Fix-data-replicated-twice-when-specifying-PUBLISH_patch".
Just for the version doesn't mess up between two branches and for cfbot.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, May 13, 2022 10:57 PM Osumi, Takamichi/大墨 昂道 <osumi.takamichi@fujitsu.com> wrote:
> On Friday, May 13, 2022 6:42 PM Wang, Wei/王 威 <wangw.fnst@fujitsu.com>
> wrote:
> > Attach the patches.(Only changed the patch for HEAD.).
> > 1. Optimize the code. Reduce calls to function filter_partitions.
> > [suggestions by Amit-san] 2. Improve the alias name in SQL. [suggestions by
> Amit-san] 3.
> > Improve coding alignments. [suggestions by Amit-san] 4. Do some
> > optimizations for list Concatenate.
> Hi, thank you for updating the patch.
> 
> 
> I have one minor comment on fetch_table_list() in HEAD v4.
Thanks for your comments.

> @@ -1759,17 +1759,22 @@ static List *
>  fetch_table_list(WalReceiverConn *wrconn, List *publications)  {
>         WalRcvExecResult *res;
> -       StringInfoData cmd;
> +       StringInfoData cmd,
> +                               pub_names;
>         TupleTableSlot *slot;
>         Oid                     tableRow[2] = {TEXTOID, TEXTOID};
>         List       *tablelist = NIL;
> 
> +       initStringInfo(&pub_names);
> +       get_publications_str(publications, &pub_names, true);
> +
> 
> Kindly free the pub_names's data along with the cmd.data.
Improve it according to your suggestion. Free 'pub_names.data'.

I also made some other changes.
Kindly have a look at new patch shared in [1].

[1]
https://www.postgresql.org/message-id/OS3PR01MB6275B26B6BDF23651B8CE86D9ED19%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Wed, May 18, 2022 4:38 PM I wrote:
> Attach the patches.(Only changed the patch for HEAD.)
Sorry, I forgot to update commit message.

Attach the new patch.
1. Only update the commit message for HEAD_v5.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Wed, May 18, 2022 4:51 PM I wrote:
> Attach the new patch.

Since there are some new commits in HEAD (0ff20288, fd0b9dc and 52b5c53) that
improve the functions pg_get_publication_tables and fetch_table_list, we cannot
apply the patch cleanly. Therefore, I rebased the patch based on the changes in
HEAD.

I also rebased the patch for REL14 because the commit 52d5ea9 in branch
REL_14_STABLE. BTW, I made a slight adjustments in the function
fetch_table_list to the SQL used to get the publisher-side table information
for version 14.

Since we have REL_15_STABLE now, I also attach the patch for version 15.

Attach the patches.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some review comments for the v6 patch (HEAD only):

============
HEAD_v6-0001
============

1. Commit message

If there are two publications that publish the parent table and the child table
separately, and both specify the option PUBLISH_VIA_PARTITION_ROOT, subscribing
to both publications from one subscription causes initial copy twice. What we
expect is to be copied only once.

~

I don’t think the parameter even works in uppercase, so maybe better to say:
PUBLISH_VIA_PARTITION_ROOT -> 'publish_via_partition_root'

~~~

2.

What we expect is to be copied only once.

SUGGESTION
It should only be copied once.

~~~

3.

To fix this, we extend the API of the function pg_get_publication_tables.
Now, the function pg_get_publication_tables could receive the publication list.
And then, if we specify option viaroot, we could exclude the partitioned table
whose ancestor belongs to the publication list when getting the table
informations.

~

Don't you mean "partition table" instead of "partitioned table"?

SUGGESTION (also reworded)
To fix this, the API function pg_get_publication_tables has been
extended to take a publication list. Now, when getting the table
information, if the publish_via_partition_root is true, the function
can exclude a partition table whose ancestor is also published by the
same publication list.

======

4. src/backend/catalog/pg_publication.c - pg_get_publication_tables

- publication = GetPublicationByName(pubname, false);
+ arr = PG_GETARG_ARRAYTYPE_P(0);
+ deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT,
+   &elems, NULL, &nelems);

Maybe should have some comment to describe that this function
parameter is now an array of publications names.

~~~

5.

+ /* get Oids of tables from each publication */

Uppercase comment

~~~

6.

+ ArrayType  *arr;
+ Datum    *elems;
+ int nelems,
+ i;
+ Publication *publication;
+ bool viaroot = false;
+ List    *pub_infos = NIL;
+ ListCell   *lc1,
+    *lc2;

The 'publication' should be declared only in the loop that uses it.
It's also not good that this is shadowing the same variable name in a
later declaration.

~~~

7.

+ * Publications support partitioned tables, although all changes
+ * are replicated using leaf partition identity and schema, so we
+ * only need those.
  */
+ if (publication->alltables)
+ current_tables = GetAllTablesPublicationRelations(publication->pubviaroot);
+ else
+ {
+ List    *relids,
+    *schemarelids;
+
+ relids = GetPublicationRelations(publication->oid,
+ publication->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ schemarelids = GetAllSchemaPublicationRelations(publication->oid,
+ publication->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ current_tables = list_concat(relids, schemarelids);
+ }

Somehow I was confused by this comment because it says you only need
the LEAF tables but then the subsequent code is getting ROOT relations
anyway... Can you clarify the comment some more?

~~~

8.

+ bool viaroot = false;

I think that should have a comment something like:
/* At least one publication is using publish_via_partition_root */

~~~

9.

+ /*
+ * Record the published table and the corresponding publication so
+ * that we can get row filters and column list later.
+ */
+ foreach(lc1, tables)
+ {
+ Oid relid = lfirst_oid(lc1);
+
+ foreach(lc2, pub_infos)
+ {
+ pub_info   *pubinfo = (pub_info *) lfirst(lc2);
+
+ if (list_member_oid(pubinfo->table_list, relid))
+ {
+ Oid    *result = (Oid *) malloc(sizeof(Oid) * 2);
+
+ result[0] = relid;
+ result[1] = pubinfo->pubid;
+
+ results = lappend(results, result);
+ }
+ }
  }

I felt a bit uneasy about the double-looping here. I wonder if these
'results' could have been accumulated within the existing loop over
all publications. Then the results would need to be filtered to remove
the ones associated with removed partitions. Otherwise with 10000
tables and also many publications this (current) double-looping seems
like it might be quite expensive.

======

10. src/backend/commands/subscriptioncmds.c - fetch_table_list

+ if (check_columnlist && server_version >= 160000)

This condition does not make much sense to me. Isn’t it effectively
same as saying
if (server_version >= 150000 && server_version >= 160000)

???

~~~

11.

+ /*
+ * Get the list of tables from publisher, the partitioned table whose
+ * ancestor is also in this list should be ignored, otherwise the
+ * initial date in the partitioned table would be replicated twice.
+ */

11.a
Isn't this comment all backwards? I think you mean to say "partition"
or "partition table" (not partitioned table) because partitions have
ancestors but partition-ED tables don't.


11.b
"initial date" -> "initial data"

======

12. src/test/subscription/t/013_partition.pl

-# Note: We create two separate tables, not a partitioned one, so that we can
-# easily identity through which relation were the changes replicated.
+# Note: We only create one table for the partition table (tab4) here.
+# Because we specify option PUBLISH_VIA_PARTITION_ROOT (see pub_all and
+# pub_lower_level above), all data should be replicated to the partition table.
+# So we do not need to create table for the partitioned table.

12.a
AFAIK "tab4" is the *partitioned* table, not a partition. I think this
comment has all the "partitioned" and "partition" back-to-front.

12.b
Also please say “publish_via_partition_root" instead of
PUBLISH_VIA_PARTITION_ROOT

======

13. src/test/subscription/t/028_row_filter.pl

@@ -723,8 +727,11 @@ is($result, qq(t|1), 'check replicated rows to
tab_rowfilter_toast');
 # - INSERT (16)        YES, 16 > 15
 $result =
   $node_subscriber->safe_psql('postgres',
- "SELECT a FROM tab_rowfilter_viaroot_part");
-is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
+ "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
+is($result, qq(16
+17),
+ 'check replicated rows to tab_rowfilter_viaroot_part'
+);

There is a comment above that code like:
# tab_rowfilter_viaroot_part filter is: (a > 15)
# - INSERT (14)        NO, 14 < 15
# - INSERT (15)        NO, 15 = 15
# - INSERT (16)        YES, 16 > 15

I think should modify that comment to explain the new data this patch
inserts - e.g. NO for 13 and YES for 17...

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thur, Jul 14, 2022 at 12:46 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are some review comments for the v6 patch (HEAD only):

Thanks for your comments.

> 1. Commit message
> 
> If there are two publications that publish the parent table and the child table
> separately, and both specify the option PUBLISH_VIA_PARTITION_ROOT,
> subscribing
> to both publications from one subscription causes initial copy twice. What we
> expect is to be copied only once.
> 
> ~
> 
> I don’t think the parameter even works in uppercase, so maybe better to say:
> PUBLISH_VIA_PARTITION_ROOT -> 'publish_via_partition_root'

It seems that there are more places to use lowercase than uppercase, so
improved it as suggested.

> 2.
> 
> What we expect is to be copied only once.
> 
> SUGGESTION
> It should only be copied once.
> 
> ~~~
> 
> 3.
> 
> To fix this, we extend the API of the function pg_get_publication_tables.
> Now, the function pg_get_publication_tables could receive the publication list.
> And then, if we specify option viaroot, we could exclude the partitioned table
> whose ancestor belongs to the publication list when getting the table
> informations.
> 
> ~
> 
> Don't you mean "partition table" instead of "partitioned table"?
> 
> SUGGESTION (also reworded)
> To fix this, the API function pg_get_publication_tables has been
> extended to take a publication list. Now, when getting the table
> information, if the publish_via_partition_root is true, the function
> can exclude a partition table whose ancestor is also published by the
> same publication list.

Improved and fixed as suggested.

> 4. src/backend/catalog/pg_publication.c - pg_get_publication_tables
> 
> - publication = GetPublicationByName(pubname, false);
> + arr = PG_GETARG_ARRAYTYPE_P(0);
> + deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT,
> +   &elems, NULL, &nelems);
> 
> Maybe should have some comment to describe that this function
> parameter is now an array of publications names.

Add the following comment: `/* Deconstruct the parameter into elements. */`.
Also improved the comment above the function pg_get_publication_tables:
`Returns information of tables in one or more publications.`
-->
`Returns information of the tables in the given publication array.`

> 5.
> 
> + /* get Oids of tables from each publication */
> 
> Uppercase comment

Improved as suggested.

> 6.
> 
> + ArrayType  *arr;
> + Datum    *elems;
> + int nelems,
> + i;
> + Publication *publication;
> + bool viaroot = false;
> + List    *pub_infos = NIL;
> + ListCell   *lc1,
> +    *lc2;
> 
> The 'publication' should be declared only in the loop that uses it.
> It's also not good that this is shadowing the same variable name in a
> later declaration.

Reverted changes to variable "publication" declarations.

> 7.
> 
> + * Publications support partitioned tables, although all changes
> + * are replicated using leaf partition identity and schema, so we
> + * only need those.
>   */
> + if (publication->alltables)
> + current_tables = GetAllTablesPublicationRelations(publication->pubviaroot);
> + else
> + {
> + List    *relids,
> +    *schemarelids;
> +
> + relids = GetPublicationRelations(publication->oid,
> + publication->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + schemarelids = GetAllSchemaPublicationRelations(publication->oid,
> + publication->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + current_tables = list_concat(relids, schemarelids);
> + }
> 
> Somehow I was confused by this comment because it says you only need
> the LEAF tables but then the subsequent code is getting ROOT relations
> anyway... Can you clarify the comment some more?

I think this is a slight mistake when publication parameter
"publish_via_partition_root" was introduced before.
I improved the comment to the following:
```
Publications support partitioned tables. If
publish_via_partition_root is false, all changes are replicated
using leaf partition identity and schema, so we only need those.
Otherwise, If publish_via_partition_root is true, get the
partitioned table itself.
```

> 8.
> 
> + bool viaroot = false;
> 
> I think that should have a comment something like:
> /* At least one publication is using publish_via_partition_root */

Improved as suggested.

> 9.
> 
> + /*
> + * Record the published table and the corresponding publication so
> + * that we can get row filters and column list later.
> + */
> + foreach(lc1, tables)
> + {
> + Oid relid = lfirst_oid(lc1);
> +
> + foreach(lc2, pub_infos)
> + {
> + pub_info   *pubinfo = (pub_info *) lfirst(lc2);
> +
> + if (list_member_oid(pubinfo->table_list, relid))
> + {
> + Oid    *result = (Oid *) malloc(sizeof(Oid) * 2);
> +
> + result[0] = relid;
> + result[1] = pubinfo->pubid;
> +
> + results = lappend(results, result);
> + }
> + }
>   }
> 
> I felt a bit uneasy about the double-looping here. I wonder if these
> 'results' could have been accumulated within the existing loop over
> all publications. Then the results would need to be filtered to remove
> the ones associated with removed partitions. Otherwise with 10000
> tables and also many publications this (current) double-looping seems
> like it might be quite expensive.

Improved as suggested.

> 10. src/backend/commands/subscriptioncmds.c - fetch_table_list
> 
> + if (check_columnlist && server_version >= 160000)
> 
> This condition does not make much sense to me. Isn’t it effectively
> same as saying
> if (server_version >= 150000 && server_version >= 160000)
> 
> ???

Fixed as suggested.

> 11.
> 
> + /*
> + * Get the list of tables from publisher, the partitioned table whose
> + * ancestor is also in this list should be ignored, otherwise the
> + * initial date in the partitioned table would be replicated twice.
> + */
> 
> 11.a
> Isn't this comment all backwards? I think you mean to say "partition"
> or "partition table" (not partitioned table) because partitions have
> ancestors but partition-ED tables don't.
> 
> 
> 11.b
> "initial date" -> "initial data"

Fixed as suggested.

> 12. src/test/subscription/t/013_partition.pl
> 
> -# Note: We create two separate tables, not a partitioned one, so that we can
> -# easily identity through which relation were the changes replicated.
> +# Note: We only create one table for the partition table (tab4) here.
> +# Because we specify option PUBLISH_VIA_PARTITION_ROOT (see pub_all and
> +# pub_lower_level above), all data should be replicated to the partition table.
> +# So we do not need to create table for the partitioned table.
> 
> 12.a
> AFAIK "tab4" is the *partitioned* table, not a partition. I think this
> comment has all the "partitioned" and "partition" back-to-front.
> 
> 12.b
> Also please say “publish_via_partition_root" instead of
> PUBLISH_VIA_PARTITION_ROOT

Fixed as suggested.

> 13. src/test/subscription/t/028_row_filter.pl
> 
> @@ -723,8 +727,11 @@ is($result, qq(t|1), 'check replicated rows to
> tab_rowfilter_toast');
>  # - INSERT (16)        YES, 16 > 15
>  $result =
>    $node_subscriber->safe_psql('postgres',
> - "SELECT a FROM tab_rowfilter_viaroot_part");
> -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
> + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
> +is($result, qq(16
> +17),
> + 'check replicated rows to tab_rowfilter_viaroot_part'
> +);
> 
> There is a comment above that code like:
> # tab_rowfilter_viaroot_part filter is: (a > 15)
> # - INSERT (14)        NO, 14 < 15
> # - INSERT (15)        NO, 15 = 15
> # - INSERT (16)        YES, 16 > 15
> 
> I think should modify that comment to explain the new data this patch
> inserts - e.g. NO for 13 and YES for 17...

Improved as suggested.

I also improved the patches for back-branch according to some of Peter's
comments and added the back-branch patch for REL_13.
In addition, in the patch (REL15_v6) I attached for REL15 in [1], I forgot to
remove the modification to the function pg_get_publication_tables. I removed
related modifications now (REL15_v7).

Attach the new patches.

[1] -
https://www.postgresql.org/message-id/OS3PR01MB6275AFA91925615A4AA782D09EBD9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some review comments for the HEAD_v7-0001 patch:

======

1. <General>

I have a fundamental question about this patch.

IIUC the purpose of this patch is to ensure that (when
publish_via_root = true) the copy of the partition data will happen
only once (e.g. from one parent table on one of the publishers). But I
think there is no guarantee that those 2 publishers even had the same
data, right? Therefore it seems to me you could get different results
if the data were copied from pub1 or from pub2. (I have not tried it -
this is just my suspicion).

Am I correct or mistaken? If correct, then it means there is a big
(but subtle) difference related to the ordering of processing ...  A)
is this explicitly documented so the user knows what data to expect?
B) is the effect of different ordering tested anywhere? Actually, I
have no idea what exactly determines the order – is it the original
CREATE SUBSCRIPTION publication list order? Is it the logic of the
pg_get_publication_tables function? Is it the SQL in function
fetch_table_list? Or is it not deterministic at all? Please confirm
it.

======

2. Commit message.

2a.

If there are two publications that publish the parent table and the child table
separately, and both specify the option publish_via_partition_root, subscribing
to both publications from one subscription causes initial copy twice.

SUGGESTION
If there are two publications that publish the parent table and the child table
respectively, but both specify publish_via_partition_root = true, subscribing
to both publications from one subscription causes initial copy twice.

2b. <General>

Actually, isn't it more subtle than what that comment is describing?
Maybe nobody is explicitly publishing a parent table at all. Maybe
pub1 publishes partition1 and pub2 publishes partition2, but both
publications are using publish_via_partition_root = true. Is this
scenario even tested? Does the logic of pg_get_publication_tables
cover this scenario?

======

3. src/backend/catalog/pg_publication.c - pg_get_publication_tables

 pg_get_publication_tables(PG_FUNCTION_ARGS)
 {
 #define NUM_PUBLICATION_TABLES_ELEM 3
- FuncCallContext *funcctx;
- char    *pubname = text_to_cstring(PG_GETARG_TEXT_PP(0));
- Publication *publication;
- List    *tables;
+ FuncCallContext *funcctx;
+ Publication *publication;

Something seems strange about having a common Publication declaration.
Firstly it is used to represent every publication element in the array
loop. Later, it is overwritten to represent a single publication.

I think it might be easier if you declare these variables separately:

E.g.1
Publication *pub_elem; -- for the array element processing declared
within the for loop

E.g.2
Publication *pub; -- declared within if (funcctx->call_cntr <
list_length(results))

~~~

4.

+ /* Filter by final published table. */
+ foreach(lc, results)
+ {
+ Oid *table_info = (Oid *) lfirst(lc);
+
+ if (!list_member_oid(tables, table_info[0]))
+ results = foreach_delete_current(results, lc);
  }

The comment did not convey enough meaning. Can you make it more
descriptive to explain why/what the logic is doing here?

======

5. src/backend/commands/subscriptioncmds.c - fetch_table_list

  /* Get column lists for each relation if the publisher supports it */
- if (check_columnlist)
- appendStringInfoString(&cmd, ", t.attnames\n");
+ if (server_version >= 160000)
+ appendStringInfo(&cmd, "SELECT DISTINCT n.nspname, c.relname,\n"

That comment is exactly the same as it was before the patch. But it
doesn't seem quite appropriate anymore for this new condition and this
new query.

~~~

6.

+ /*
+ * Get the list of tables from publisher, the partition table whose
+ * ancestor is also in this list should be ignored, otherwise the
+ * initial data in the partition table would be replicated twice.
+ */

Why say "should be ignored" --  don’t you mean "will be" or "must be" or "is".

~~~

7.

  initStringInfo(&cmd);
- appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename \n");

  /* Get column lists for each relation if the publisher supports it */
- if (check_columnlist)
- appendStringInfoString(&cmd, ", t.attnames\n");
+ if (server_version >= 160000)
+ appendStringInfo(&cmd, "SELECT DISTINCT n.nspname, c.relname,\n"
+ "              ( CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)\n"
+ "                     THEN NULL ELSE gpt.attrs END\n"
+ "              ) AS attnames\n"
+ " FROM pg_class c\n"
+ "   JOIN pg_namespace n ON n.oid = c.relnamespace\n"
+ "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
array_agg(pubname::text))).*\n"
+ "          FROM pg_publication\n"
+ "          WHERE pubname IN ( %s )) as gpt\n"
+ "       ON gpt.relid = c.oid\n",
+ pub_names.data);
+ else
+ {
+ /*
+ * Get the list of tables from publisher, the partition table whose
+ * ancestor is also in this list should be ignored, otherwise the
+ * initial data in the partition table would be replicated twice.
+ */

- appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n"
-    " WHERE t.pubname IN (");
- get_publications_str(publications, &cmd, true);
- appendStringInfoChar(&cmd, ')');
+ appendStringInfoString(&cmd, "WITH pub_tabs AS(\n"
+    " SELECT DISTINCT N.nspname, C.oid, C.relname, C.relispartition\n");
+
+ /* Get column lists for each relation if the publisher supports it */
+ if (check_columnlist)
+ appendStringInfoString(&cmd, ",( CASE WHEN (array_length(gpt.attrs,
1) = c.relnatts)\n"
+    "              THEN NULL ELSE gpt.attrs END\n"
+    "       ) AS attnames\n");
+
+ appendStringInfo(&cmd, " FROM pg_publication P,\n"
+ "      LATERAL pg_get_publication_tables(P.pubname) GPT,\n"
+ "      pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
+ "  WHERE C.oid = GPT.relid AND P.pubname IN ( %s )\n"
+ ")\n"
+ "SELECT DISTINCT pub_tabs.nspname, pub_tabs.relname\n",
+ pub_names.data);
+
+ /* Get column lists for each relation if the publisher supports it */
+ if (check_columnlist)
+ appendStringInfoString(&cmd, ", pub_tabs.attnames\n");
+
+ appendStringInfoString(&cmd, "FROM pub_tabs\n"
+    " WHERE (pub_tabs.relispartition IS FALSE\n"
+    "  OR NOT EXISTS (SELECT 1 FROM
pg_partition_ancestors(pub_tabs.oid) as pa\n"
+    "                  WHERE pa.relid IN (SELECT pub_tabs.oid FROM pub_tabs)\n"
+    "                   AND pa.relid != pub_tabs.oid))\n");
+ }

Please use a consistent case for all the SQL aliases. E.g "gpt" versus
"GPT", "c" versus "C", etc.

======

8. src/test/subscription/t/013_partition.pl

+# Note: We only create one table for the partitioned table (tab4) here. Because
+# we specify option "publish_via_partition_root" (see pub_all and
+# pub_lower_level above), all data should be replicated to the partitioned
+# table. So we do not need to create table for the partition table.

"replicated to the partitioned table" ??

The entire comment seems a bit misleading because how can we call the
subscriber table a "partitioned" table when it has no partitions?!

SUGGESTION (maybe?)
Note: We only create one table (tab4) here. We specified
publish_via_partition_root = true (see pub_all and pub_lower_level
above), so all data will be replicated to that table.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Thursday, July 28, 2022 5:17 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are some review comments for the HEAD_v7-0001 patch:
> 
> ======
> 
> 1. <General>
> 
> I have a fundamental question about this patch.
> 
> IIUC the purpose of this patch is to ensure that (when
> publish_via_root = true) the copy of the partition data will happen
> only once (e.g. from one parent table on one of the publishers). But I
> think there is no guarantee that those 2 publishers even had the same
> data, right? Therefore it seems to me you could get different results
> if the data were copied from pub1 or from pub2. (I have not tried it -
> this is just my suspicion).
> 
> Am I correct or mistaken?

Since the subscribed publishers are combined with OR and are from the same database.
And we are trying to copy the data from the top most parent table, so I think the results
should be as expected.

Best regards,
Hou zj

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thur, Jul 28, 2022 at 17:17 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are some review comments for the HEAD_v7-0001 patch:

Thanks for your comments.

> 2. Commit message.
> 
> 2a.
> 
> If there are two publications that publish the parent table and the child table
> separately, and both specify the option publish_via_partition_root, subscribing
> to both publications from one subscription causes initial copy twice.
> 
> SUGGESTION
> If there are two publications that publish the parent table and the child table
> respectively, but both specify publish_via_partition_root = true, subscribing
> to both publications from one subscription causes initial copy twice.
> 
> 2b. <General>
> 
> Actually, isn't it more subtle than what that comment is describing?
> Maybe nobody is explicitly publishing a parent table at all. Maybe
> pub1 publishes partition1 and pub2 publishes partition2, but both
> publications are using publish_via_partition_root = true. Is this
> scenario even tested? Does the logic of pg_get_publication_tables
> cover this scenario?

=>2a.
Okay, changed it as suggested.

=>2b.
This is not the case we are trying to fix. The problematic scenario is when the
a parent table is published via root partitioned table and in this case we need
to ignore other partitions. And I try to improve the commit message to make it
clear.

> 4.
> 
> + /* Filter by final published table. */
> + foreach(lc, results)
> + {
> + Oid *table_info = (Oid *) lfirst(lc);
> +
> + if (!list_member_oid(tables, table_info[0]))
> + results = foreach_delete_current(results, lc);
>   }
> 
> The comment did not convey enough meaning. Can you make it more
> descriptive to explain why/what the logic is doing here?

I think the comments above `tables = filter_partitions(tables);` explain this.

> 5. src/backend/commands/subscriptioncmds.c - fetch_table_list
> 
>   /* Get column lists for each relation if the publisher supports it */
> - if (check_columnlist)
> - appendStringInfoString(&cmd, ", t.attnames\n");
> + if (server_version >= 160000)
> + appendStringInfo(&cmd, "SELECT DISTINCT n.nspname, c.relname,\n"
> 
> That comment is exactly the same as it was before the patch. But it
> doesn't seem quite appropriate anymore for this new condition and this
> new query.

Improved the comments as following:
```
Get information of the tables belonging to the specified publications
```

The rest of the comments are improved as suggested.
I also rebased the patch based on the commit (0c20dd3) on HEAD, and made some
changes to the back-branch patches based on some of Peter's comments.

Attach the new patches.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some review comment for the HEAD_v8 patch:

======

1. Commit message

If there are two publications, one of them publish a parent table with
(publish_via_partition_root = true) and another publish child table,
subscribing to both publications from one subscription results in two initial
replications. It should only be copied once.

SUGGESTION (Note**)
If there are two publications - one of them publishing a parent table
(using publish_via_partition_root = true) and the other is publishing
one of the parent's child tables - then subscribing to both
publications from one subscription results in the same initial child
data being copied twice. It should only be copied once.


Note** - I've only reworded the original commit message slightly but
otherwise left it saying the same thing. But I still have doubts if
this message actually covers all the cases the patch is trying to
address. e.g. The comment (see below) in the 'fetch_table_list'
function seemed to cover more cases than what this commit message is
describing.
/*
* Get the list of tables from publisher, the partition table whose
* ancestor is also in this list will be ignored, otherwise the initial
* data in the partition table would be replicated twice.
*/


======

2. src/backend/catalog/pg_publication.c - pg_get_publication_tables

2a.
 /*
- * Returns information of tables in a publication.
+ * Returns information of the tables in the given publication array.
  */

What does "information of the tables" actually mean? Is it just the
names of the tables; is it more than that? IMO a longer, more
explanatory comment will be better here instead of a brief ambiguous
one.


2b.
+ *results = NIL;

This variable name 'results' is too generic, so it is not helpful when
trying to understand the subsequent code logic. Please give this a
meaningful name/description.

2c.
/* Deconstruct the parameter into elements. */

SUGGESTION
Deconstruct the parameter into elements where each element is a
publication name.

2d.
+ List *current_tables = NIL;

I think this is the tables only on the current pub_elem, so I thought
'pub_elem_tables' might make it easier to understand this list's
meaning.

2e.
+ /* Now sort and de-duplicate the result list */
+ list_sort(tables, list_oid_cmp);
+ list_deduplicate_oid(tables);

IMO this comment is confusing because there is another list that is
called the 'results' list, but that is not the same list you are
processing here. Also, it does not really add anything helpful to just
have comments saying almost the same as the function names
(sort/de-duplicate), so please give longer comments to say the reason
*why* the logic does this rather than just describing the steps.

2f.
+ /* Filter by final published table */
+ foreach(lc, results)

Please make this comment more descriptive to explain why/what the
logic is doing.

======

3. src/backend/commands/subscriptioncmds.c - fetch_table_list

3a.
+ bool check_columnlist = (server_version >= 150000);

Given the assignment, maybe 'columnlist_supported' is a better name?

3b.
+ /* Get information of the tables belonging to the specified publications */

For  "Get information of..." can you elaborate *what* table
information this is getting and why?

3c.
+ if (server_version >= 160000)
+ appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
+ "              ( CASE WHEN (array_length(GPT.attrs, 1) = C.relnatts)\n"
+ "                     THEN NULL ELSE GPT.attrs END\n"
+ "              ) AS attnames\n"
+ " FROM pg_class C\n"
+ "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
+ "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
array_agg(pubname::text))).*\n"
+ "          FROM pg_publication\n"
+ "          WHERE pubname IN ( %s )) as GPT\n"
+ "       ON GPT.relid = C.oid\n",
+ pub_names.data);

AFAICT the main reason for this check was to decide if you can use the
new version of 'pg_get_publication_tables' that supports the VARIADIC
array of pub names or not. If that is correct, then maybe the comment
should describe that reason, or maybe add another bool var similar to
the 'check_columnlist' one for this.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some more review comments for the HEAD_v8 patch:

======

1. Commit message

If there are two publications, one of them publish a parent table with
(publish_via_partition_root = true) and another publish child table,
subscribing to both publications from one subscription results in two initial
replications. It should only be copied once.

~

I took a 2nd look at that commit message and it seemed slightly
backwards to me - e.g. don't you really mean for the
'publish_via_partition_root' parameter to be used when publishing the
*child* table?

SUGGESTION
If there are two publications, one of them publishing a parent table
directly, and the other publishing a child table with
publish_via_partition_root = true, then subscribing to both those
publications from one subscription results in two initial
replications. It should only be copied once.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Data is copied twice when specifying both child and parent table in publication

От
"houzj.fnst@fujitsu.com"
Дата:
On Wednesday, August 10, 2022 7:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
> 
> Here are some more review comments for the HEAD_v8 patch:
> 
> ======
> 
> 1. Commit message
> 
> If there are two publications, one of them publish a parent table with
> (publish_via_partition_root = true) and another publish child table, subscribing
> to both publications from one subscription results in two initial replications. It
> should only be copied once.
> 
> ~
> 
> I took a 2nd look at that commit message and it seemed slightly backwards to
> me - e.g. don't you really mean for the 'publish_via_partition_root' parameter
> to be used when publishing the
> *child* table?

I'm not sure about this, I think we are trying to fix the bug when
'publish_via_partition_root' is used when publishing the parent table.

For this case(via_root used when publishing parent):

CREATE PUBLICATION pub1 for TABLE parent with(publish_via_partition_root);
CREATE PUBLICATION pub2 for TABLE child;
CREATE SUBSCRIPTION sub connect xxx PUBLICATION pub1,pub2;

The expected behavior is only the parent table is published, all the changes
should be replicated using the parent table's identity. So, we should only do
initial sync for the parent table once, but we currently will do table sync for
both parent and child which we think is a bug.

For another case you mentioned(via_root used when publishing child)

CREATE PUBLICATION pub1 for TABLE parent;
CREATE PUBLICATION pub2 for TABLE child with (publish_via_partition_root);
CREATE SUBSCRIPTION sub connect xxx PUBLICATION pub1,pub2;

The expected behavior is only the child table is published, all the changes
should be replicated using the child table's identity. We should do table sync
only for child tables and is same as the current behavior on HEAD. So, I think
there is no bug in this case.

Best regards,
Hou zj


RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tues, Aug 9, 2022 at 15:15 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are some review comment for the HEAD_v8 patch:

Thanks for your comments.

> 1. Commit message
> 
> If there are two publications, one of them publish a parent table with
> (publish_via_partition_root = true) and another publish child table,
> subscribing to both publications from one subscription results in two initial
> replications. It should only be copied once.
> 
> SUGGESTION (Note**)
> If there are two publications - one of them publishing a parent table
> (using publish_via_partition_root = true) and the other is publishing
> one of the parent's child tables - then subscribing to both
> publications from one subscription results in the same initial child
> data being copied twice. It should only be copied once.
> 
> 
> Note** - I've only reworded the original commit message slightly but
> otherwise left it saying the same thing. But I still have doubts if
> this message actually covers all the cases the patch is trying to
> address. e.g. The comment (see below) in the 'fetch_table_list'
> function seemed to cover more cases than what this commit message is
> describing.
> /*
> * Get the list of tables from publisher, the partition table whose
> * ancestor is also in this list will be ignored, otherwise the initial
> * data in the partition table would be replicated twice.
> */

=> commit message
Changed.

=> Note**
I think the commit message and the comment you mentioned refer to the same kind
of scenario.

> 2. src/backend/catalog/pg_publication.c - pg_get_publication_tables
> 
> 2a.
>  /*
> - * Returns information of tables in a publication.
> + * Returns information of the tables in the given publication array.
>   */
> 
> What does "information of the tables" actually mean? Is it just the
> names of the tables; is it more than that? IMO a longer, more
> explanatory comment will be better here instead of a brief ambiguous
> one.

Changed as below:
```
Get information of the tables in the given publication array.

Returns the oid, column list, row filter for each table.
```

> 2b.
> + *results = NIL;
> 
> This variable name 'results' is too generic, so it is not helpful when
> trying to understand the subsequent code logic. Please give this a
> meaningful name/description.

Changed the variable name as below:
results -> table_infos

> 2c.
> /* Deconstruct the parameter into elements. */
> 
> SUGGESTION
> Deconstruct the parameter into elements where each element is a
> publication name.

Changed.

> 2d.
> + List *current_tables = NIL;
> 
> I think this is the tables only on the current pub_elem, so I thought
> 'pub_elem_tables' might make it easier to understand this list's
> meaning.

Changed.

> 2e.
> + /* Now sort and de-duplicate the result list */
> + list_sort(tables, list_oid_cmp);
> + list_deduplicate_oid(tables);
> 
> IMO this comment is confusing because there is another list that is
> called the 'results' list, but that is not the same list you are
> processing here. Also, it does not really add anything helpful to just
> have comments saying almost the same as the function names
> (sort/de-duplicate), so please give longer comments to say the reason
> *why* the logic does this rather than just describing the steps.

Fixed the comment. ("result" -> "tables")
I think the purpose of these two functions is clear. The reason I added the
comment here is for consistency with other calling locations.

> 2f.
> + /* Filter by final published table */
> + foreach(lc, results)
> 
> Please make this comment more descriptive to explain why/what the
> logic is doing.

Changed as below:
```
For tables that have been filtered out, delete the corresponding
table information in the table_infos list.
```

> 3. src/backend/commands/subscriptioncmds.c - fetch_table_list
> 
> 3a.
> + bool check_columnlist = (server_version >= 150000);
> 
> Given the assignment, maybe 'columnlist_supported' is a better name?

I am not sure if this name could be changed in this thread.

> 3b.
> + /* Get information of the tables belonging to the specified publications */
> 
> For  "Get information of..." can you elaborate *what* table
> information this is getting and why?

I am not sure if we need to add a reason.
So, I only added what information we are going to get:
```
Get namespace, relname and column list (if supported) of the tables
belonging to the specified publications.
```

> 3c.
> + if (server_version >= 160000)
> + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> + "              ( CASE WHEN (array_length(GPT.attrs, 1) = C.relnatts)\n"
> + "                     THEN NULL ELSE GPT.attrs END\n"
> + "              ) AS attnames\n"
> + " FROM pg_class C\n"
> + "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
> + "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
> array_agg(pubname::text))).*\n"
> + "          FROM pg_publication\n"
> + "          WHERE pubname IN ( %s )) as GPT\n"
> + "       ON GPT.relid = C.oid\n",
> + pub_names.data);
> 
> AFAICT the main reason for this check was to decide if you can use the
> new version of 'pg_get_publication_tables' that supports the VARIADIC
> array of pub names or not. If that is correct, then maybe the comment
> should describe that reason, or maybe add another bool var similar to
> the 'check_columnlist' one for this.

I added the comment as following:
```
From version 16, the parameter of the function pg_get_publication_tables
can be an array of publications. The partition table whose ancestor is
also published in this publication array will be filtered out in this
function.
```

I also rebased the REL_15 patch based on the commit (15014b8), and made some
changes to the back-branch patches based on Peter's suggestions.

Attach the new patches.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
FYI, I'm not sure why the cfbot hasn't reported this, but the apply v9
patch failed for me on HEAD as below:

[postgres@CentOS7-x64 oss_postgres_misc]$ git apply
../patches_misc/HEAD_v9-0001-Fix-data-replicated-twice-when-specifying-publish.patch
--verbose
Checking patch src/backend/catalog/pg_publication.c...
Checking patch src/backend/commands/subscriptioncmds.c...
Hunk #1 succeeded at 1917 (offset 123 lines).
Checking patch src/include/catalog/pg_proc.dat...
Hunk #1 succeeded at 11607 (offset -74 lines).
Checking patch src/test/regress/expected/rules.out...
error: while searching for:
             JOIN pg_attribute a ON (((a.attrelid = gpt.relid) AND
(a.attnum = k.k))))) AS attnames,
    pg_get_expr(gpt.qual, gpt.relid) AS rowfilter
   FROM pg_publication p,
    LATERAL pg_get_publication_tables((p.pubname)::text) gpt(relid,
attrs, qual),
    (pg_class c
     JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
  WHERE (c.oid = gpt.relid);

error: patch failed: src/test/regress/expected/rules.out:1449
error: src/test/regress/expected/rules.out: patch does not apply
Checking patch src/test/subscription/t/013_partition.pl...
Checking patch src/test/subscription/t/028_row_filter.pl...
Checking patch src/test/subscription/t/031_column_list.pl...

------
Kind Regards,
Peter Smith.
Fujitsu Australia.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Mon, Sept 19, 2022 at 14:52 PM Peter Smith <smithpb2250@gmail.com> wrote:
> FYI, I'm not sure why the cfbot hasn't reported this, but the apply v9
> patch failed for me on HEAD as below:
> 
> [postgres@CentOS7-x64 oss_postgres_misc]$ git apply
> ../patches_misc/HEAD_v9-0001-Fix-data-replicated-twice-when-specifying-
> publish.patch
> --verbose
> Checking patch src/backend/catalog/pg_publication.c...
> Checking patch src/backend/commands/subscriptioncmds.c...
> Hunk #1 succeeded at 1917 (offset 123 lines).
> Checking patch src/include/catalog/pg_proc.dat...
> Hunk #1 succeeded at 11607 (offset -74 lines).
> Checking patch src/test/regress/expected/rules.out...
> error: while searching for:
>              JOIN pg_attribute a ON (((a.attrelid = gpt.relid) AND
> (a.attnum = k.k))))) AS attnames,
>     pg_get_expr(gpt.qual, gpt.relid) AS rowfilter
>    FROM pg_publication p,
>     LATERAL pg_get_publication_tables((p.pubname)::text) gpt(relid,
> attrs, qual),
>     (pg_class c
>      JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
>   WHERE (c.oid = gpt.relid);
> 
> error: patch failed: src/test/regress/expected/rules.out:1449
> error: src/test/regress/expected/rules.out: patch does not apply
> Checking patch src/test/subscription/t/013_partition.pl...
> Checking patch src/test/subscription/t/028_row_filter.pl...
> Checking patch src/test/subscription/t/031_column_list.pl...

Thanks for your kindly reminder.

Rebased the patch based on the changes in HEAD (20b6847).
Attach the new patches.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"osumi.takamichi@fujitsu.com"
Дата:
On Tuesday, September 20, 2022 3:18 PM Wang, Wei/王 威 <wangw.fnst@fujitsu.com> wrote:
> Rebased the patch based on the changes in HEAD (20b6847).
> Attach the new patches.
Hi, thank you for updating the patchset.


FYI, I noticed that the patch for head is no longer applicable.

$ git apply --check HEAD_v10-0001-Fix-data-replicated-twice-when-specifying-publis.patch
error: patch failed: src/backend/catalog/pg_publication.c:1097
error: src/backend/catalog/pg_publication.c: patch does not apply


Also, one minor comment on the change in src/include/catalog/pg_proc.dat.

 # publications
-{ oid => '6119', descr => 'get information of tables in a publication',
-  proname => 'pg_get_publication_tables', prorows => '1000', proretset => 't',
-  provolatile => 's', prorettype => 'record', proargtypes => 'text',
-  proallargtypes => '{text,oid,int2vector,pg_node_tree}',
-  proargmodes => '{i,o,o,o}', proargnames => '{pubname,relid,attrs,qual}',
+{ oid => '6119',
+  descr => 'get information of the tables belonging to the specified publications.',

Please remove the period at the end of 'descr' string.
It seems we don't write it in this file and removing it makes the code more aligned.



Best Regards,
    Takamichi Osumi


RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Mon, Sep 26, 2022 at 10:31 AM Osumi, Takamichi/大墨 昂道 <osumi.takamichi@fujitsu.com> wrote:
> Hi, thank you for updating the patchset.
> 
> 
> FYI, I noticed that the patch for head is no longer applicable.

Thanks for your kindly reminder and comment.

> $ git apply --check HEAD_v10-0001-Fix-data-replicated-twice-when-specifying-
> publis.patch
> error: patch failed: src/backend/catalog/pg_publication.c:1097
> error: src/backend/catalog/pg_publication.c: patch does not apply

Rebased the patch based on the changes in HEAD (13a185f).

> Also, one minor comment on the change in src/include/catalog/pg_proc.dat.
> 
>  # publications
> -{ oid => '6119', descr => 'get information of tables in a publication',
> -  proname => 'pg_get_publication_tables', prorows => '1000', proretset => 't',
> -  provolatile => 's', prorettype => 'record', proargtypes => 'text',
> -  proallargtypes => '{text,oid,int2vector,pg_node_tree}',
> -  proargmodes => '{i,o,o,o}', proargnames => '{pubname,relid,attrs,qual}',
> +{ oid => '6119',
> +  descr => 'get information of the tables belonging to the specified
> publications.',
> 
> Please remove the period at the end of 'descr' string.
> It seems we don't write it in this file and removing it makes the code more
> aligned.

Improved as suggested.
Also modified the description to be consistent with the comments atop the
function pg_get_publication_tables.

Attach the new patches.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are my review comments for the HEAD_v11-0001 patch:

======

1. General - Another related bug?

In [1] Hou-san wrote:

For another case you mentioned (via_root used when publishing child)
CREATE PUBLICATION pub1 for TABLE parent;
CREATE PUBLICATION pub2 for TABLE child with (publish_via_partition_root);
CREATE SUBSCRIPTION sub connect xxx PUBLICATION pub1,pub2;

The expected behavior is only the child table is published, all the changes
should be replicated using the child table's identity. We should do table sync
only for child tables and is same as the current behavior on HEAD. So, I think
there is no bug in this case.

~

That behaviour seems different to my understanding because the pgdocs
says when the via_root param is true the 'child' table would be using
the 'parent' identity:

[2] publish_via_partition_root determines whether changes in a
partitioned table (or on its partitions) contained in the publication
will be published using the identity and schema of the partitioned
table rather than that of the individual partitions that are actually
changed.

~

So is this another bug (slightly different from the current one being
patched), or is it just some different special behaviour? If it's
another bug then you need to know that ASAP because I think you may
want to fix both of them at the same time which might impact how this
2x data copy patch should be implemented.

Or perhaps just the pgdocs need more notes about special
cases/combinations like this?

======

2. General - documentation?

For this current patch, IIUC it was decided that it is a bug because
the data gets duplicated, and then some sensible rule was decided that
this patch should use to address it (e.g. publishing a child combined
with publishing a parent via_root will just ignore the child's
publication...).

So my question is - is this (new/fixed) behaviour something that a
user will be able to figure out themselves from the existing
documentation, or does this patch now need its own special notes in
the documentation?

======

3. src/backend/catalog/pg_publication.c - pg_get_publication_tables

+ foreach(lc, pub_elem_tables)
+ {
+ Oid *result = (Oid *) malloc(sizeof(Oid) * 2);
+
+ result[0] = lfirst_oid(lc);
+ result[1] = pub_elem->oid;
+ table_infos = lappend(table_infos, result);
+ }

3a.
It looks like each element in the table_infos list is a malloced obj
of 2x Oids (Oid of table, Oid of pub). IMO better to call this element
'table_info' instead of the meaningless 'result'

~

3b.
Actually, I think it would be better if this function defines a little
2-element structure {Oid relid, Oid pubid} to use instead of this
array (which requires knowledge that [0] means relid and [1] means
pubid).

~~~

4.

+ foreach(lc, table_infos)
+ {
+ Oid *table_info_tmp = (Oid *) lfirst(lc);
+
+ if (!list_member_oid(tables, table_info_tmp[0]))
+ table_infos = foreach_delete_current(table_infos, lc);
  }
I think the '_tmp' suffix is not helpful here - IMO having another
relid variable would make this more self-explanatory.

Or better yet adopt the suggestion o f #3b and have a little struct
with self-explanatory member names.

=====

5. src/backend/commands/subscriptioncmds.c - fetch_table_list

+ if (server_version >= 160000)
+ appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"

Since there is an else statement block, I think this would be more
readable if there was a statement block here too. YMMV

SUGGESTION
if (server_version >= 160000)
{
appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
...
}

~~~

6.

+ /*
+ * Get the list of tables from publisher, the partition table whose
+ * ancestor is also in this list will be ignored, otherwise the initial
+ * data in the partition table would be replicated twice.
+ */

6a.
"from publisher, the partition" -> "from the publisher. The partition"

~

6b.
This looks like a common comment that also applied to the "if" part,
so it seems more appropriate to move it to where I indicated below.
Perhaps the whole comment needs a bit of massaging after you move
it...

+ /*
+ * Get namespace, relname and column list (if supported) of the tables
+ * belonging to the specified publications.
+ *
+ * HERE <<<<<<<<<
+ *
+ * From version 16, the parameter of the function pg_get_publication_tables
+ * can be an array of publications. The partition table whose ancestor is
+ * also published in this publication array will be filtered out in this
+ * function.
+ */


------
[1]
https://www.postgresql.org/message-id/OS0PR01MB5716A30DDEECC59132E1084F94799%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] https://www.postgresql.org/docs/devel/sql-createpublication.html

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tues, Sep 27, 2022 at 16:45 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are my review comments for the HEAD_v11-0001 patch:

Thanks for your comments.

> ======
> 
> 1. General - Another related bug?
> 
> In [1] Hou-san wrote:
> 
> For another case you mentioned (via_root used when publishing child)
> CREATE PUBLICATION pub1 for TABLE parent;
> CREATE PUBLICATION pub2 for TABLE child with (publish_via_partition_root);
> CREATE SUBSCRIPTION sub connect xxx PUBLICATION pub1,pub2;
> 
> The expected behavior is only the child table is published, all the changes
> should be replicated using the child table's identity. We should do table sync
> only for child tables and is same as the current behavior on HEAD. So, I think
> there is no bug in this case.
> 
> ~
> 
> That behaviour seems different to my understanding because the pgdocs
> says when the via_root param is true the 'child' table would be using
> the 'parent' identity:
> 
> [2] publish_via_partition_root determines whether changes in a
> partitioned table (or on its partitions) contained in the publication
> will be published using the identity and schema of the partitioned
> table rather than that of the individual partitions that are actually
> changed.
> 
> ~
> 
> So is this another bug (slightly different from the current one being
> patched), or is it just some different special behaviour? If it's
> another bug then you need to know that ASAP because I think you may
> want to fix both of them at the same time which might impact how this
> 2x data copy patch should be implemented.
> 
> Or perhaps just the pgdocs need more notes about special
> cases/combinations like this?
> 
> ======
> 
> 2. General - documentation?
> 
> For this current patch, IIUC it was decided that it is a bug because
> the data gets duplicated, and then some sensible rule was decided that
> this patch should use to address it (e.g. publishing a child combined
> with publishing a parent via_root will just ignore the child's
> publication...).
> 
> So my question is - is this (new/fixed) behaviour something that a
> user will be able to figure out themselves from the existing
> documentation, or does this patch now need its own special notes in
> the documentation?

IMO this behaviour doesn't look like a bug.
I think the behaviour of multiple publications with parameter
publish_via_partition_root could be added to the pg-doc later in a separate
patch.

> ======
> 
> 3. src/backend/catalog/pg_publication.c - pg_get_publication_tables
> 
> + foreach(lc, pub_elem_tables)
> + {
> + Oid *result = (Oid *) malloc(sizeof(Oid) * 2);
> +
> + result[0] = lfirst_oid(lc);
> + result[1] = pub_elem->oid;
> + table_infos = lappend(table_infos, result);
> + }
> 
> 3a.
> It looks like each element in the table_infos list is a malloced obj
> of 2x Oids (Oid of table, Oid of pub). IMO better to call this element
> 'table_info' instead of the meaningless 'result'
> 
> ~
> 
> 3b.
> Actually, I think it would be better if this function defines a little
> 2-element structure {Oid relid, Oid pubid} to use instead of this
> array (which requires knowledge that [0] means relid and [1] means
> pubid).
> 
> ~~~
> 
> 4.
> 
> + foreach(lc, table_infos)
> + {
> + Oid *table_info_tmp = (Oid *) lfirst(lc);
> +
> + if (!list_member_oid(tables, table_info_tmp[0]))
> + table_infos = foreach_delete_current(table_infos, lc);
>   }
> I think the '_tmp' suffix is not helpful here - IMO having another
> relid variable would make this more self-explanatory.
> 
> Or better yet adopt the suggestion o f #3b and have a little struct
> with self-explanatory member names.

Improved as suggested.

> =====
> 
> 5. src/backend/commands/subscriptioncmds.c - fetch_table_list
> 
> + if (server_version >= 160000)
> + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> 
> Since there is an else statement block, I think this would be more
> readable if there was a statement block here too. YMMV
> 
> SUGGESTION
> if (server_version >= 160000)
> {
> appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> ...
> }

Improved as suggested.

> ~~~
> 
> 6.
> 
> + /*
> + * Get the list of tables from publisher, the partition table whose
> + * ancestor is also in this list will be ignored, otherwise the initial
> + * data in the partition table would be replicated twice.
> + */
> 
> 6a.
> "from publisher, the partition" -> "from the publisher. The partition"
> 
> ~
> 
> 6b.
> This looks like a common comment that also applied to the "if" part,
> so it seems more appropriate to move it to where I indicated below.
> Perhaps the whole comment needs a bit of massaging after you move
> it...
> 
> + /*
> + * Get namespace, relname and column list (if supported) of the tables
> + * belonging to the specified publications.
> + *
> + * HERE <<<<<<<<<
> + *
> + * From version 16, the parameter of the function pg_get_publication_tables
> + * can be an array of publications. The partition table whose ancestor is
> + * also published in this publication array will be filtered out in this
> + * function.
> + */

Improved as suggested.

Also rebased the patch because the change in the HEAD (20b6847).

Attach the new patches.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Hi Wang-san. Here are my review comments for HEAD_v12-0001 patch.

======

1. Missing documentation.

In [1] you wrote:
> I think the behaviour of multiple publications with parameter publish_via_partition_root could be added to the pg-doc
laterin a separate patch. 

~

That doesn't seem right to me. IMO the related documentation updates
cannot really be separated from this patch. Otherwise, what's the
alternative? Push this change, and then (while waiting for the
documentation patch) users will just have to use trial and error to
guess how it works...?

------

2. src/backend/catalog/pg_publication.c

+ typedef struct
+ {
+ Oid relid; /* OID of published table */
+ Oid pubid; /* OID of publication that publishes this
+ * table. */
+ } published_rel;

2a.
I think that should be added to typedefs.list

~

2b.
Maybe this also needs some comment to clarify that there will be
*multiple* of these structures in scenarios where the same table is
published by different publications in the array passed.

------

3. QUESTION - pg_get_publication_tables / fetch_table_list

When the same table is published by different publications (but there
are other differences like row-filters/column-lists in each
publication) the result tuple of this function does not include the
pubid. Maybe the SQL of pg_publication_tables/fetch_table_list() is OK
as-is but how does it manage to associate each table with the correct
tuple?

I know it apparently all seems to work but I’m not how does that
happen? Can you explain why a puboid is not needed for the result
tuple of this function?

~~

test_pub=# create table t1(a int, b int, c int);
CREATE TABLE
test_pub=# create publication pub1 for table t1(a) where (a > 99);
CREATE PUBLICATION
test_pub=# create publication pub2 for table t1(a,b) where (b < 33);
CREATE PUBLICATION

Following seems OK when I swap orders of publication names...

test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
ARRAY['pub2','pub1']) gpt(relid, attrs, qual);
 relid | attrs | rowfilter
-------+-------+-----------
 16385 | 1 2   | (b < 33)
 16385 | 1     | (a > 99)
(2 rows)

test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
ARRAY['pub1','pub2']) gpt(relid, attrs, qual);
 relid | attrs | rowfilter
-------+-------+-----------
 16385 | 1     | (a > 99)
 16385 | 1 2   | (b < 33)
(2 rows)

But what about this (this is similar to the SQL fragment from
fetch_table_list); I swapped the pub names but the results are the
same...

test_pub=# SELECT pg_get_publication_tables(VARIADIC
array_agg(p.pubname)) from pg_publication p where pubname
IN('pub2','pub1');

                     pg_get_publication_tables


----------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------
 (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
:vartype 23 :vartypmod -1 :var
collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
{CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
:constbyval true :constisnull false :
location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
 (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
:opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
:varattno 2 :vartype 23 :vartypmod -1 :v
arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
{CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
:constbyval true :constisnull false
 :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
(2 rows)

test_pub=# SELECT pg_get_publication_tables(VARIADIC
array_agg(p.pubname)) from pg_publication p where pubname
IN('pub1','pub2');

                     pg_get_publication_tables


----------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------
 (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
:vartype 23 :vartypmod -1 :var
collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
{CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
:constbyval true :constisnull false :
location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
 (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
:opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
:varattno 2 :vartype 23 :vartypmod -1 :v
arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
{CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
:constbyval true :constisnull false
 :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
(2 rows)


------
[1]
https://www.postgresql.org/message-id/OS3PR01MB6275A9B8C65C381C6828DF9D9E549%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Data is copied twice when specifying both child and parent table in publication

От
"osumi.takamichi@fujitsu.com"
Дата:
On Wednesday, September 28, 2022 5:36 PM Wang, Wei/王 威 <wangw.fnst@fujitsu.com> wrote:
> Also rebased the patch because the change in the HEAD (20b6847).
> 
> Attach the new patches.
Hi, thank you for the updated patches!


Here are my minor review comments for HEAD v12.

(1) typo & suggestion to reword one comment


+                        * Publications support partitioned tables. If
+                        * publish_via_partition_root is false, all changes are replicated
+                        * using leaf partition identity and schema, so we only need
+                        * those. Otherwise, If publish_via_partition_root is true, get
+                        * the partitioned table itself.


The last sentence has "If" in the middle of the sentence.
We can use the lower letter for it. Or, I think "Otherwise" by itself means
"If publish_via_partition_root is true". So, I'll suggest a below change.


FROM:
Otherwise, If publish_via_partition_root is true, get the partitioned table itself.
TO:
Otherwise, get the partitioned table itself.


(2) Do we need to get "attnames" column from the publisher in the fetch_table_list() ?

When I was looking at v16 path, I didn't see any codes that utilize
the "attnames" column information returned from the publisher.
If we don't need it, could we remove it ?
I can miss something greatly, but this might be affected by HEAD codes ?



Best Regards,
    Takamichi Osumi


RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Wed, Oct 5, 2022 at 11:08 AM Peter Smith <smithpb2250@gmail.com> wrote:
> Hi Wang-san. Here are my review comments for HEAD_v12-0001 patch.

Thanks for your comments.

> ======
> 
> 1. Missing documentation.
> 
> In [1] you wrote:
> > I think the behaviour of multiple publications with parameter
> publish_via_partition_root could be added to the pg-doc later in a separate
> patch.
> 
> ~
> 
> That doesn't seem right to me. IMO the related documentation updates
> cannot really be separated from this patch. Otherwise, what's the
> alternative? Push this change, and then (while waiting for the
> documentation patch) users will just have to use trial and error to
> guess how it works...?

I tried to add related documentation in a separate patch (HEAD_v13-0002*).

> ------
> 
> 2. src/backend/catalog/pg_publication.c
> 
> + typedef struct
> + {
> + Oid relid; /* OID of published table */
> + Oid pubid; /* OID of publication that publishes this
> + * table. */
> + } published_rel;
> 
> 2a.
> I think that should be added to typedefs.list

Added.

> ~
> 
> 2b.
> Maybe this also needs some comment to clarify that there will be
> *multiple* of these structures in scenarios where the same table is
> published by different publications in the array passed.

Added the comments.

> ------
> 
> 3. QUESTION - pg_get_publication_tables / fetch_table_list
> 
> When the same table is published by different publications (but there
> are other differences like row-filters/column-lists in each
> publication) the result tuple of this function does not include the
> pubid. Maybe the SQL of pg_publication_tables/fetch_table_list() is OK
> as-is but how does it manage to associate each table with the correct
> tuple?
> 
> I know it apparently all seems to work but I’m not how does that
> happen? Can you explain why a puboid is not needed for the result
> tuple of this function?

Sorry, I am not sure I understand your question.
I try to answer your question by explaining the two functions you mentioned:

First, the function pg_get_publication_tables gets the list (see table_infos)
that included published table and the corresponding publication. Then based
on this list, the function pg_get_publication_tables returns information
(scheme, relname, row filter and column list) about the published tables in the
publications list. It just doesn't return pubid.

Then, the SQL in the function fetch_table_list will get the columns in the
column list from pg_attribute. (This is to return all columns when the column
list is not specified)

> ~~
> 
> test_pub=# create table t1(a int, b int, c int);
> CREATE TABLE
> test_pub=# create publication pub1 for table t1(a) where (a > 99);
> CREATE PUBLICATION
> test_pub=# create publication pub2 for table t1(a,b) where (b < 33);
> CREATE PUBLICATION
> 
> Following seems OK when I swap orders of publication names...
> 
> test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> ARRAY['pub2','pub1']) gpt(relid, attrs, qual);
>  relid | attrs | rowfilter
> -------+-------+-----------
>  16385 | 1 2   | (b < 33)
>  16385 | 1     | (a > 99)
> (2 rows)
> 
> test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> ARRAY['pub1','pub2']) gpt(relid, attrs, qual);
>  relid | attrs | rowfilter
> -------+-------+-----------
>  16385 | 1     | (a > 99)
>  16385 | 1 2   | (b < 33)
> (2 rows)
> 
> But what about this (this is similar to the SQL fragment from
> fetch_table_list); I swapped the pub names but the results are the
> same...
> 
> test_pub=# SELECT pg_get_publication_tables(VARIADIC
> array_agg(p.pubname)) from pg_publication p where pubname
> IN('pub2','pub1');
> 
>                      pg_get_publication_tables
> 
> -------------------------------------------------------------------------------------------------
> ---------------------------------------------------------------------
> -------------------------------------------------------------------------------------------------
> ---------------------------------------------------------------------
> -------------------------------------------------------------------
>  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> :vartype 23 :vartypmod -1 :var
> collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> :constbyval true :constisnull false :
> location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
>  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> :varattno 2 :vartype 23 :vartypmod -1 :v
> arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> :constbyval true :constisnull false
>  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> (2 rows)
> 
> test_pub=# SELECT pg_get_publication_tables(VARIADIC
> array_agg(p.pubname)) from pg_publication p where pubname
> IN('pub1','pub2');
> 
>                      pg_get_publication_tables
> 
> -------------------------------------------------------------------------------------------------
> ---------------------------------------------------------------------
> -------------------------------------------------------------------------------------------------
> ---------------------------------------------------------------------
> -------------------------------------------------------------------
>  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> :vartype 23 :vartypmod -1 :var
> collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> :constbyval true :constisnull false :
> location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
>  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> :varattno 2 :vartype 23 :vartypmod -1 :v
> arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> :constbyval true :constisnull false
>  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> (2 rows)

I think this is because the usage of SELECT statement. The order seems depend
on pg_publication. Such as:

postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE pubname IN ('pub1','pub2');
  array_agg
-------------
 {pub1,pub2}
(1 row)

postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE pubname IN ('pub2','pub1');
  array_agg
-------------
 {pub1,pub2}
(1 row)

Attach the new patch set.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Wed, Oct 5, 2022 at 23:05 PM Osumi, Takamichi/大墨 昂道 <osumi.takamichi@fujitsu.com> wrote:
> Hi, thank you for the updated patches!
> 
> 
> Here are my minor review comments for HEAD v12.

Thanks for your comments.

> (1) typo & suggestion to reword one comment
> 
> 
> +                        * Publications support partitioned tables. If
> +                        * publish_via_partition_root is false, all changes are replicated
> +                        * using leaf partition identity and schema, so we only need
> +                        * those. Otherwise, If publish_via_partition_root is true, get
> +                        * the partitioned table itself.
> 
> 
> The last sentence has "If" in the middle of the sentence.
> We can use the lower letter for it. Or, I think "Otherwise" by itself means
> "If publish_via_partition_root is true". So, I'll suggest a below change.
> 
> 
> FROM:
> Otherwise, If publish_via_partition_root is true, get the partitioned table itself.
> TO:
> Otherwise, get the partitioned table itself.

Improved.

> (2) Do we need to get "attnames" column from the publisher in the
> fetch_table_list() ?
> 
> When I was looking at v16 path, I didn't see any codes that utilize
> the "attnames" column information returned from the publisher.
> If we don't need it, could we remove it ?
> I can miss something greatly, but this might be affected by HEAD codes ?

Yes, it is affected by HEAD. I think we need this column to check whether the
same table has multiple column lists. (see commit fd0b9dc)

The new patch set were attached in [1].

[1] -
https://www.postgresql.org/message-id/OS3PR01MB6275843B2BBE92870F7881C19E299%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
On Mon, Oct 17, 2022 at 4:49 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> On Wed, Oct 5, 2022 at 11:08 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > Hi Wang-san. Here are my review comments for HEAD_v12-0001 patch.
>
...
> >
> > 3. QUESTION - pg_get_publication_tables / fetch_table_list
> >
> > When the same table is published by different publications (but there
> > are other differences like row-filters/column-lists in each
> > publication) the result tuple of this function does not include the
> > pubid. Maybe the SQL of pg_publication_tables/fetch_table_list() is OK
> > as-is but how does it manage to associate each table with the correct
> > tuple?
> >
> > I know it apparently all seems to work but I’m not how does that
> > happen? Can you explain why a puboid is not needed for the result
> > tuple of this function?
>
> Sorry, I am not sure I understand your question.
> I try to answer your question by explaining the two functions you mentioned:
>
> First, the function pg_get_publication_tables gets the list (see table_infos)
> that included published table and the corresponding publication. Then based
> on this list, the function pg_get_publication_tables returns information
> (scheme, relname, row filter and column list) about the published tables in the
> publications list. It just doesn't return pubid.
>
> Then, the SQL in the function fetch_table_list will get the columns in the
> column list from pg_attribute. (This is to return all columns when the column
> list is not specified)
>

I meant, for example, if the different publications specified
different col-lists for the same table then IIUC the
fetch_table_lists() is going to return 2 list elements
(schema,rel_name,row_filter,col_list). But when the schema/rel_name
are the same for 2 elements then (without also a pubid) how are you
going to know where the list element came from, and how come that is
not important?

> > ~~
> >
> > test_pub=# create table t1(a int, b int, c int);
> > CREATE TABLE
> > test_pub=# create publication pub1 for table t1(a) where (a > 99);
> > CREATE PUBLICATION
> > test_pub=# create publication pub2 for table t1(a,b) where (b < 33);
> > CREATE PUBLICATION
> >
> > Following seems OK when I swap orders of publication names...
> >
> > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > ARRAY['pub2','pub1']) gpt(relid, attrs, qual);
> >  relid | attrs | rowfilter
> > -------+-------+-----------
> >  16385 | 1 2   | (b < 33)
> >  16385 | 1     | (a > 99)
> > (2 rows)
> >
> > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > ARRAY['pub1','pub2']) gpt(relid, attrs, qual);
> >  relid | attrs | rowfilter
> > -------+-------+-----------
> >  16385 | 1     | (a > 99)
> >  16385 | 1 2   | (b < 33)
> > (2 rows)
> >
> > But what about this (this is similar to the SQL fragment from
> > fetch_table_list); I swapped the pub names but the results are the
> > same...
> >
> > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > array_agg(p.pubname)) from pg_publication p where pubname
> > IN('pub2','pub1');
> >
> >                      pg_get_publication_tables
> >
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------
> >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > :vartype 23 :vartypmod -1 :var
> > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false :
> > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > :varattno 2 :vartype 23 :vartypmod -1 :v
> > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false
> >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > (2 rows)
> >
> > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > array_agg(p.pubname)) from pg_publication p where pubname
> > IN('pub1','pub2');
> >
> >                      pg_get_publication_tables
> >
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------
> >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > :vartype 23 :vartypmod -1 :var
> > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false :
> > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > :varattno 2 :vartype 23 :vartypmod -1 :v
> > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false
> >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > (2 rows)
>
> I think this is because the usage of SELECT statement. The order seems depend
> on pg_publication. Such as:
>
> postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE pubname IN ('pub1','pub2');
>   array_agg
> -------------
>  {pub1,pub2}
> (1 row)
>
> postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE pubname IN ('pub2','pub1');
>   array_agg
> -------------
>  {pub1,pub2}
> (1 row)
>

Right, so I felt it was a bit dubious that the result of the function
“seems to depend on” something. That’s why I was asking why the list
elements did not include a pubid. Then a caller could be certain what
element belonged with what publication. It's not quite clear to me why
that is not important for this patch - but anyway, even if it's not
necessary for this patch's usage, this is a function that is exposed
to users who might have different needs/expectations than this patch
has, so shouldn't the result be less fuzzy for them?

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are my review comments for HEAD patches v13*

//////

Patch HEAD_v13-0001

I already posted some follow-up questions. See [1]

/////

Patch HEAD_v13-0002

1. Commit message

The following usage scenarios are not described in detail in the manual:
If one subscription subscribes multiple publications, and these publications
publish a partitioned table and its partitions respectively. When we specify
this parameter on one or more of these publications, which identity and schema
should be used to publish the changes?

In these cases, I think the parameter publish_via_partition_root behave as
follows:

~

It seemed worded a bit strangely. Also, you said "on one or more of
these publications" but the examples only show only one publication
having 'publish_via_partition_root'.

SUGGESTION (I've modified the wording slightly but the examples are unchanged).

Assume a subscription is subscribing to multiple publications, and
these publications publish a partitioned table and its partitions
respectively:

[publisher-side]
create table parent (a int primary key) partition by range (a);
create table child partition of parent default;

create publication pub1 for table parent;
create publication pub2 for table child;

[subscriber-side]
create subscription sub connection 'xxxx' publication pub1, pub2;

The manual does not clearly describe the behaviour when the user had
specified the parameter 'publish_via_partition_root' on just one of
the publications. This patch modifies documentation to clarify the
following rules:

- If the parameter publish_via_partition_root is specified only in pub1,
changes will be published using the identity and schema of the table 'parent'.

- If the parameter publish_via_partition_root is specified only in pub2,
changes will be published using the identity and schema of the table 'child'.

~~~

2.

- If the parameter publish_via_partition_root is specified only in pub2,
changes will be published using the identity and schema of the table child.

~

Is that right though? This rule seems 100% contrary to the meaning of
'publish_via_partition_root=true'.

------

3. doc/src/sgml/ref/create_publication.sgml

+         <para>
+          If a root partitioned table is published by any subscribed
publications which
+          set publish_via_partition_root = true, changes on this root
partitioned table
+          (or on its partitions) will be published using the identity
and schema of this
+          root partitioned table rather than that of the individual partitions.
+         </para>

This seems to only describe the first example from the commit message.
What about some description to explain the second example?

------
[1] https://www.postgresql.org/message-id/CAHut%2BPt%2B1PNx6VsZ-xKzAU-18HmNXhjCC1TGakKX46Wg7YNT1Q%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Ian Lawrence Barwick
Дата:
2022年10月17日(月) 14:49 wangw.fnst@fujitsu.com <wangw.fnst@fujitsu.com>:
>
> On Wed, Oct 5, 2022 at 11:08 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > Hi Wang-san. Here are my review comments for HEAD_v12-0001 patch.
>
> Thanks for your comments.
>
> > ======
> >
> > 1. Missing documentation.
> >
> > In [1] you wrote:
> > > I think the behaviour of multiple publications with parameter
> > publish_via_partition_root could be added to the pg-doc later in a separate
> > patch.
> >
> > ~
> >
> > That doesn't seem right to me. IMO the related documentation updates
> > cannot really be separated from this patch. Otherwise, what's the
> > alternative? Push this change, and then (while waiting for the
> > documentation patch) users will just have to use trial and error to
> > guess how it works...?
>
> I tried to add related documentation in a separate patch (HEAD_v13-0002*).
>
> > ------
> >
> > 2. src/backend/catalog/pg_publication.c
> >
> > + typedef struct
> > + {
> > + Oid relid; /* OID of published table */
> > + Oid pubid; /* OID of publication that publishes this
> > + * table. */
> > + } published_rel;
> >
> > 2a.
> > I think that should be added to typedefs.list
>
> Added.
>
> > ~
> >
> > 2b.
> > Maybe this also needs some comment to clarify that there will be
> > *multiple* of these structures in scenarios where the same table is
> > published by different publications in the array passed.
>
> Added the comments.
>
> > ------
> >
> > 3. QUESTION - pg_get_publication_tables / fetch_table_list
> >
> > When the same table is published by different publications (but there
> > are other differences like row-filters/column-lists in each
> > publication) the result tuple of this function does not include the
> > pubid. Maybe the SQL of pg_publication_tables/fetch_table_list() is OK
> > as-is but how does it manage to associate each table with the correct
> > tuple?
> >
> > I know it apparently all seems to work but I’m not how does that
> > happen? Can you explain why a puboid is not needed for the result
> > tuple of this function?
>
> Sorry, I am not sure I understand your question.
> I try to answer your question by explaining the two functions you mentioned:
>
> First, the function pg_get_publication_tables gets the list (see table_infos)
> that included published table and the corresponding publication. Then based
> on this list, the function pg_get_publication_tables returns information
> (scheme, relname, row filter and column list) about the published tables in the
> publications list. It just doesn't return pubid.
>
> Then, the SQL in the function fetch_table_list will get the columns in the
> column list from pg_attribute. (This is to return all columns when the column
> list is not specified)
>
> > ~~
> >
> > test_pub=# create table t1(a int, b int, c int);
> > CREATE TABLE
> > test_pub=# create publication pub1 for table t1(a) where (a > 99);
> > CREATE PUBLICATION
> > test_pub=# create publication pub2 for table t1(a,b) where (b < 33);
> > CREATE PUBLICATION
> >
> > Following seems OK when I swap orders of publication names...
> >
> > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > ARRAY['pub2','pub1']) gpt(relid, attrs, qual);
> >  relid | attrs | rowfilter
> > -------+-------+-----------
> >  16385 | 1 2   | (b < 33)
> >  16385 | 1     | (a > 99)
> > (2 rows)
> >
> > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > ARRAY['pub1','pub2']) gpt(relid, attrs, qual);
> >  relid | attrs | rowfilter
> > -------+-------+-----------
> >  16385 | 1     | (a > 99)
> >  16385 | 1 2   | (b < 33)
> > (2 rows)
> >
> > But what about this (this is similar to the SQL fragment from
> > fetch_table_list); I swapped the pub names but the results are the
> > same...
> >
> > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > array_agg(p.pubname)) from pg_publication p where pubname
> > IN('pub2','pub1');
> >
> >                      pg_get_publication_tables
> >
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------
> >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > :vartype 23 :vartypmod -1 :var
> > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false :
> > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > :varattno 2 :vartype 23 :vartypmod -1 :v
> > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false
> >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > (2 rows)
> >
> > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > array_agg(p.pubname)) from pg_publication p where pubname
> > IN('pub1','pub2');
> >
> >                      pg_get_publication_tables
> >
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------------------------------------
> > ---------------------------------------------------------------------
> > -------------------------------------------------------------------
> >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > :vartype 23 :vartypmod -1 :var
> > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false :
> > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > :varattno 2 :vartype 23 :vartypmod -1 :v
> > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > :constbyval true :constisnull false
> >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > (2 rows)
>
> I think this is because the usage of SELECT statement. The order seems depend
> on pg_publication. Such as:
>
> postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE pubname IN ('pub1','pub2');
>   array_agg
> -------------
>  {pub1,pub2}
> (1 row)
>
> postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE pubname IN ('pub2','pub1');
>   array_agg
> -------------
>  {pub1,pub2}
> (1 row)
>
> Attach the new patch set.


This entry was marked as "Needs review" in the CommitFest app but cfbot
reports the patch no longer applies.

We've marked it as "Waiting on Author". As CommitFest 2022-11 is
currently underway, this would be an excellent time update the patch.

Once you think the patchset is ready for review again, you (or any
interested party) can  move the patch entry forward by visiting

    https://commitfest.postgresql.org/40/3623/

and changing the status to "Needs review".


Thanks

Ian Barwick



RE: Data is copied twice when specifying both child and parent table in publication

От
"Takamichi Osumi (Fujitsu)"
Дата:
On Monday, October 17, 2022 2:49 PM Wang, Wei/王 威 <wangw.fnst@fujitsu.com> wrote:
> Attach the new patch set.
Hi, thank you for posting the new patches.


Here are minor comments on the HEAD_v13-0002.

(1) Suggestion for the document description

+         <para>
+          If a root partitioned table is published by any subscribed publications which
+          set publish_via_partition_root = true, changes on this root partitioned table
+          (or on its partitions) will be published using the identity and schema of this
+          root partitioned table rather than that of the individual partitions.
+         </para>
+

I suppose this sentence looks quite similar to the one in the previous paragraph and can be adjusted.

IIUC the main value of the patch is to clarify what happens when
we mix publications of different publish_via_partition_root settings for one partition hierarchy.
If this is true, how about below sentence instead of the one above ?

"
There can be a case where a subscription combines publications with
different publish_via_partition_root values for one same partition hierarchy
(e.g. subscribe two publications indicating the root partitioned table and its child table respectively).
In this case, the identity and schema of the root partitioned table take priority.
"

(2) Better documentation alignment

I think we need to wrap publish_via_partition_root by "literal" tag
in the documentation create_publication.sgml.


Best Regards,
    Takamichi Osumi


RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, Oct 21, 2022 at 17:02 PM Peter Smith <smithpb2250@gmail.com> wrote:
>

Thanks for your comments. Sorry for not replying in time.

> On Mon, Oct 17, 2022 at 4:49 PM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > On Wed, Oct 5, 2022 at 11:08 AM Peter Smith <smithpb2250@gmail.com>
> wrote:
> > > Hi Wang-san. Here are my review comments for HEAD_v12-0001 patch.
> >
> ...
> > >
> > > 3. QUESTION - pg_get_publication_tables / fetch_table_list
> > >
> > > When the same table is published by different publications (but there
> > > are other differences like row-filters/column-lists in each
> > > publication) the result tuple of this function does not include the
> > > pubid. Maybe the SQL of pg_publication_tables/fetch_table_list() is OK
> > > as-is but how does it manage to associate each table with the correct
> > > tuple?
> > >
> > > I know it apparently all seems to work but I’m not how does that
> > > happen? Can you explain why a puboid is not needed for the result
> > > tuple of this function?
> >
> > Sorry, I am not sure I understand your question.
> > I try to answer your question by explaining the two functions you mentioned:
> >
> > First, the function pg_get_publication_tables gets the list (see table_infos)
> > that included published table and the corresponding publication. Then based
> > on this list, the function pg_get_publication_tables returns information
> > (scheme, relname, row filter and column list) about the published tables in the
> > publications list. It just doesn't return pubid.
> >
> > Then, the SQL in the function fetch_table_list will get the columns in the
> > column list from pg_attribute. (This is to return all columns when the column
> > list is not specified)
> >
> 
> I meant, for example, if the different publications specified
> different col-lists for the same table then IIUC the
> fetch_table_lists() is going to return 2 list elements
> (schema,rel_name,row_filter,col_list). But when the schema/rel_name
> are the same for 2 elements then (without also a pubid) how are you
> going to know where the list element came from, and how come that is
> not important?
> 
> > > ~~
> > >
> > > test_pub=# create table t1(a int, b int, c int);
> > > CREATE TABLE
> > > test_pub=# create publication pub1 for table t1(a) where (a > 99);
> > > CREATE PUBLICATION
> > > test_pub=# create publication pub2 for table t1(a,b) where (b < 33);
> > > CREATE PUBLICATION
> > >
> > > Following seems OK when I swap orders of publication names...
> > >
> > > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > > ARRAY['pub2','pub1']) gpt(relid, attrs, qual);
> > >  relid | attrs | rowfilter
> > > -------+-------+-----------
> > >  16385 | 1 2   | (b < 33)
> > >  16385 | 1     | (a > 99)
> > > (2 rows)
> > >
> > > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > > ARRAY['pub1','pub2']) gpt(relid, attrs, qual);
> > >  relid | attrs | rowfilter
> > > -------+-------+-----------
> > >  16385 | 1     | (a > 99)
> > >  16385 | 1 2   | (b < 33)
> > > (2 rows)
> > >
> > > But what about this (this is similar to the SQL fragment from
> > > fetch_table_list); I swapped the pub names but the results are the
> > > same...
> > >
> > > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > > array_agg(p.pubname)) from pg_publication p where pubname
> > > IN('pub2','pub1');
> > >
> > >                      pg_get_publication_tables
> > >
> > > ---------------------------------------------------------------------------------------------
> ----
> > > ---------------------------------------------------------------------
> > > ---------------------------------------------------------------------------------------------
> ----
> > > ---------------------------------------------------------------------
> > > -------------------------------------------------------------------
> > >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > > :vartype 23 :vartypmod -1 :var
> > > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > :constbyval true :constisnull false :
> > > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> > >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > > :varattno 2 :vartype 23 :vartypmod -1 :v
> > > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > :constbyval true :constisnull false
> > >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > > (2 rows)
> > >
> > > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > > array_agg(p.pubname)) from pg_publication p where pubname
> > > IN('pub1','pub2');
> > >
> > >                      pg_get_publication_tables
> > >
> > > ---------------------------------------------------------------------------------------------
> ----
> > > ---------------------------------------------------------------------
> > > ---------------------------------------------------------------------------------------------
> ----
> > > ---------------------------------------------------------------------
> > > -------------------------------------------------------------------
> > >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > > :vartype 23 :vartypmod -1 :var
> > > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > :constbyval true :constisnull false :
> > > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> > >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > > :varattno 2 :vartype 23 :vartypmod -1 :v
> > > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > :constbyval true :constisnull false
> > >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > > (2 rows)
> >
> > I think this is because the usage of SELECT statement. The order seems
> depend
> > on pg_publication. Such as:
> >
> > postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE
> pubname IN ('pub1','pub2');
> >   array_agg
> > -------------
> >  {pub1,pub2}
> > (1 row)
> >
> > postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE
> pubname IN ('pub2','pub1');
> >   array_agg
> > -------------
> >  {pub1,pub2}
> > (1 row)
> >
> 
> Right, so I felt it was a bit dubious that the result of the function
> “seems to depend on” something. That’s why I was asking why the list
> elements did not include a pubid. Then a caller could be certain what
> element belonged with what publication. It's not quite clear to me why
> that is not important for this patch - but anyway, even if it's not
> necessary for this patch's usage, this is a function that is exposed
> to users who might have different needs/expectations than this patch
> has, so shouldn't the result be less fuzzy for them?

Yes, I agree that there may be such a need in the future. Added 'pubid' to the
output of this function.
BTW, I think the usage of the function pg_get_publication_tables is not
documented in the pg-doc now, it doesn't seem to be a function provided to
users. So I didn't modify the documentation.

Attach new patches.

Regards,
Wang wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, Oct 21, 2022 at 17:02 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are my review comments for HEAD patches v13*

Thanks for your comments.
 
> Patch HEAD_v13-0002
> 
> 1. Commit message
> 
> The following usage scenarios are not described in detail in the manual:
> If one subscription subscribes multiple publications, and these publications
> publish a partitioned table and its partitions respectively. When we specify
> this parameter on one or more of these publications, which identity and schema
> should be used to publish the changes?
> 
> In these cases, I think the parameter publish_via_partition_root behave as
> follows:
> 
> ~
> 
> It seemed worded a bit strangely. Also, you said "on one or more of
> these publications" but the examples only show only one publication
> having 'publish_via_partition_root'.
> 
> SUGGESTION (I've modified the wording slightly but the examples are
> unchanged).
> 
> Assume a subscription is subscribing to multiple publications, and
> these publications publish a partitioned table and its partitions
> respectively:
> 
> [publisher-side]
> create table parent (a int primary key) partition by range (a);
> create table child partition of parent default;
> 
> create publication pub1 for table parent;
> create publication pub2 for table child;
> 
> [subscriber-side]
> create subscription sub connection 'xxxx' publication pub1, pub2;
> 
> The manual does not clearly describe the behaviour when the user had
> specified the parameter 'publish_via_partition_root' on just one of
> the publications. This patch modifies documentation to clarify the
> following rules:
> 
> - If the parameter publish_via_partition_root is specified only in pub1,
> changes will be published using the identity and schema of the table 'parent'.
> 
> - If the parameter publish_via_partition_root is specified only in pub2,
> changes will be published using the identity and schema of the table 'child'.

Improved as suggested.

> ~~~
> 
> 2.
> 
> - If the parameter publish_via_partition_root is specified only in pub2,
> changes will be published using the identity and schema of the table child.
> 
> ~
> 
> Is that right though? This rule seems 100% contrary to the meaning of
> 'publish_via_partition_root=true'.

Yes, I think this behaviour fits the meaning of publish_via_partition_root.
Please refer to this description in the document:
```
This parameter determines whether changes in a partitioned table (or on its
partitions) contained in the publication will be published ...
```

So I think for 'publish_via_partition_root' to work, the partitioned table must
be specified in this publication.

Since only one member (partition table 'child') of this partition tree
('parent', 'child') is specified in 'pub2', even if 'pub2' specifies the
parameter 'publish_via_partition_root', 'pub2' will publish changes using the
identity and schema of the table 'child'.

> ------
> 
> 3. doc/src/sgml/ref/create_publication.sgml
> 
> +         <para>
> +          If a root partitioned table is published by any subscribed
> publications which
> +          set publish_via_partition_root = true, changes on this root
> partitioned table
> +          (or on its partitions) will be published using the identity
> and schema of this
> +          root partitioned table rather than that of the individual partitions.
> +         </para>
> 
> This seems to only describe the first example from the commit message.
> What about some description to explain the second example?

I think the second example is already described in the pg-doc (please see the
reply to #2). I am not quite sure if additional modifications are required. Do
you have any suggestions?

The new patch set was attached in [1].

[1] -
https://www.postgresql.org/message-id/OS3PR01MB6275FB5397C6A647F262A3A69E009%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tues, Nov 8, 2022 at 12:12 PM Osumi, Takamichi/大墨 昂道 <osumi.takamichi@fujitsu.com> wrote:
> On Monday, October 17, 2022 2:49 PM Wang, Wei/王 威
> <wangw.fnst@fujitsu.com> wrote:
> > Attach the new patch set.
> Hi, thank you for posting the new patches.
> 
> 
> Here are minor comments on the HEAD_v13-0002.

Thanks for your comments.

> (1) Suggestion for the document description
> 
> +         <para>
> +          If a root partitioned table is published by any subscribed publications
> which
> +          set publish_via_partition_root = true, changes on this root partitioned
> table
> +          (or on its partitions) will be published using the identity and schema of this
> +          root partitioned table rather than that of the individual partitions.
> +         </para>
> +
> 
> I suppose this sentence looks quite similar to the one in the previous paragraph
> and can be adjusted.
> 
> IIUC the main value of the patch is to clarify what happens when
> we mix publications of different publish_via_partition_root settings for one
> partition hierarchy.
> If this is true, how about below sentence instead of the one above ?
> 
> "
> There can be a case where a subscription combines publications with
> different publish_via_partition_root values for one same partition hierarchy
> (e.g. subscribe two publications indicating the root partitioned table and its child
> table respectively).
> In this case, the identity and schema of the root partitioned table take priority.
> "

Thanks for your suggestion.
I agree that we should mention that this description is for a case where one
subscription subscribes to multiple publications. And I think it would be
better if we mentioned that the option publish_via_partition_root is specified
on a publication that publishes a root partitioned table. So I added the
description of this case as you suggested.

> (2) Better documentation alignment
> 
> I think we need to wrap publish_via_partition_root by "literal" tag
> in the documentation create_publication.sgml.

Improved.

The new patch set was attached in [1].

[1] -
https://www.postgresql.org/message-id/OS3PR01MB6275FB5397C6A647F262A3A69E009%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

Re: Data is copied twice when specifying both child and parent table in publication

От
vignesh C
Дата:
  On Fri, 11 Nov 2022 at 11:13, wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> On Fri, Oct 21, 2022 at 17:02 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >
>
> Thanks for your comments. Sorry for not replying in time.
>
> > On Mon, Oct 17, 2022 at 4:49 PM wangw.fnst@fujitsu.com
> > <wangw.fnst@fujitsu.com> wrote:
> > >
> > > On Wed, Oct 5, 2022 at 11:08 AM Peter Smith <smithpb2250@gmail.com>
> > wrote:
> > > > Hi Wang-san. Here are my review comments for HEAD_v12-0001 patch.
> > >
> > ...
> > > >
> > > > 3. QUESTION - pg_get_publication_tables / fetch_table_list
> > > >
> > > > When the same table is published by different publications (but there
> > > > are other differences like row-filters/column-lists in each
> > > > publication) the result tuple of this function does not include the
> > > > pubid. Maybe the SQL of pg_publication_tables/fetch_table_list() is OK
> > > > as-is but how does it manage to associate each table with the correct
> > > > tuple?
> > > >
> > > > I know it apparently all seems to work but I’m not how does that
> > > > happen? Can you explain why a puboid is not needed for the result
> > > > tuple of this function?
> > >
> > > Sorry, I am not sure I understand your question.
> > > I try to answer your question by explaining the two functions you mentioned:
> > >
> > > First, the function pg_get_publication_tables gets the list (see table_infos)
> > > that included published table and the corresponding publication. Then based
> > > on this list, the function pg_get_publication_tables returns information
> > > (scheme, relname, row filter and column list) about the published tables in the
> > > publications list. It just doesn't return pubid.
> > >
> > > Then, the SQL in the function fetch_table_list will get the columns in the
> > > column list from pg_attribute. (This is to return all columns when the column
> > > list is not specified)
> > >
> >
> > I meant, for example, if the different publications specified
> > different col-lists for the same table then IIUC the
> > fetch_table_lists() is going to return 2 list elements
> > (schema,rel_name,row_filter,col_list). But when the schema/rel_name
> > are the same for 2 elements then (without also a pubid) how are you
> > going to know where the list element came from, and how come that is
> > not important?
> >
> > > > ~~
> > > >
> > > > test_pub=# create table t1(a int, b int, c int);
> > > > CREATE TABLE
> > > > test_pub=# create publication pub1 for table t1(a) where (a > 99);
> > > > CREATE PUBLICATION
> > > > test_pub=# create publication pub2 for table t1(a,b) where (b < 33);
> > > > CREATE PUBLICATION
> > > >
> > > > Following seems OK when I swap orders of publication names...
> > > >
> > > > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > > > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > > > ARRAY['pub2','pub1']) gpt(relid, attrs, qual);
> > > >  relid | attrs | rowfilter
> > > > -------+-------+-----------
> > > >  16385 | 1 2   | (b < 33)
> > > >  16385 | 1     | (a > 99)
> > > > (2 rows)
> > > >
> > > > test_pub=# SELECT gpt.relid, gpt.attrs, pg_get_expr(gpt.qual,
> > > > gpt.relid) AS rowfilter from pg_get_publication_tables(VARIADIC
> > > > ARRAY['pub1','pub2']) gpt(relid, attrs, qual);
> > > >  relid | attrs | rowfilter
> > > > -------+-------+-----------
> > > >  16385 | 1     | (a > 99)
> > > >  16385 | 1 2   | (b < 33)
> > > > (2 rows)
> > > >
> > > > But what about this (this is similar to the SQL fragment from
> > > > fetch_table_list); I swapped the pub names but the results are the
> > > > same...
> > > >
> > > > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > > > array_agg(p.pubname)) from pg_publication p where pubname
> > > > IN('pub2','pub1');
> > > >
> > > >                      pg_get_publication_tables
> > > >
> > > > ---------------------------------------------------------------------------------------------
> > ----
> > > > ---------------------------------------------------------------------
> > > > ---------------------------------------------------------------------------------------------
> > ----
> > > > ---------------------------------------------------------------------
> > > > -------------------------------------------------------------------
> > > >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > > > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > > > :vartype 23 :vartypmod -1 :var
> > > > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > > :constbyval true :constisnull false :
> > > > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> > > >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > > > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > > > :varattno 2 :vartype 23 :vartypmod -1 :v
> > > > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > > :constbyval true :constisnull false
> > > >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > > > (2 rows)
> > > >
> > > > test_pub=# SELECT pg_get_publication_tables(VARIADIC
> > > > array_agg(p.pubname)) from pg_publication p where pubname
> > > > IN('pub1','pub2');
> > > >
> > > >                      pg_get_publication_tables
> > > >
> > > > ---------------------------------------------------------------------------------------------
> > ----
> > > > ---------------------------------------------------------------------
> > > > ---------------------------------------------------------------------------------------------
> > ----
> > > > ---------------------------------------------------------------------
> > > > -------------------------------------------------------------------
> > > >  (16385,1,"{OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset
> > > > false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 1
> > > > :vartype 23 :vartypmod -1 :var
> > > > collid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location 47}
> > > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > > :constbyval true :constisnull false :
> > > > location 51 :constvalue 4 [ 99 0 0 0 0 0 0 0 ]}) :location 49}")
> > > >  (16385,"1 2","{OPEXPR :opno 97 :opfuncid 66 :opresulttype 16
> > > > :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1
> > > > :varattno 2 :vartype 23 :vartypmod -1 :v
> > > > arcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 49}
> > > > {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
> > > > :constbyval true :constisnull false
> > > >  :location 53 :constvalue 4 [ 33 0 0 0 0 0 0 0 ]}) :location 51}")
> > > > (2 rows)
> > >
> > > I think this is because the usage of SELECT statement. The order seems
> > depend
> > > on pg_publication. Such as:
> > >
> > > postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE
> > pubname IN ('pub1','pub2');
> > >   array_agg
> > > -------------
> > >  {pub1,pub2}
> > > (1 row)
> > >
> > > postgres=# SELECT array_agg(p.pubname) FROM pg_publication p WHERE
> > pubname IN ('pub2','pub1');
> > >   array_agg
> > > -------------
> > >  {pub1,pub2}
> > > (1 row)
> > >
> >
> > Right, so I felt it was a bit dubious that the result of the function
> > “seems to depend on” something. That’s why I was asking why the list
> > elements did not include a pubid. Then a caller could be certain what
> > element belonged with what publication. It's not quite clear to me why
> > that is not important for this patch - but anyway, even if it's not
> > necessary for this patch's usage, this is a function that is exposed
> > to users who might have different needs/expectations than this patch
> > has, so shouldn't the result be less fuzzy for them?
>
> Yes, I agree that there may be such a need in the future. Added 'pubid' to the
> output of this function.
> BTW, I think the usage of the function pg_get_publication_tables is not
> documented in the pg-doc now, it doesn't seem to be a function provided to
> users. So I didn't modify the documentation.
>
> Attach new patches.

Here we are having tables list to store the relids and table_infos
list which stores pubid along with relid. Here tables list acts as a
temporary list to get filter_partitions and then delete the
published_rel from table_infos. Will it be possible to directly
operate on table_infos list and remove the temporary tables list used.
We might have to implement comparator, deduplication functions and
change filter_partitions function to work directly on published_rel
type list.
+                       /
+                        * Record the published table and the
corresponding publication so
+                        * that we can get row filters and column list later.
+                        *
+                        * When a table is published by multiple
publications, to obtain
+                        * all row filters and column list, the
structure related to this
+                        * table will be recorded multiple times.
+                        */
+                       foreach(lc, pub_elem_tables)
+                       {
+                               published_rel *table_info =
(published_rel *) malloc(sizeof(published_rel));
+
+                               table_info->relid = lfirst_oid(lc);
+                               table_info->pubid = pub_elem->oid;
+                               table_infos = lappend(table_infos, table_info);
+                       }
+
+                       tables = list_concat(tables, pub_elem_tables);

Thoughts?

Regards,
Vignesh



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Mon, Nov 14, 2022 at 0:56 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Attach new patches.
> 

Thanks for your comments.

> Here we are having tables list to store the relids and table_infos
> list which stores pubid along with relid. Here tables list acts as a
> temporary list to get filter_partitions and then delete the
> published_rel from table_infos. Will it be possible to directly
> operate on table_infos list and remove the temporary tables list used.
> We might have to implement comparator, deduplication functions and
> change filter_partitions function to work directly on published_rel
> type list.
> +                       /
> +                        * Record the published table and the
> corresponding publication so
> +                        * that we can get row filters and column list later.
> +                        *
> +                        * When a table is published by multiple
> publications, to obtain
> +                        * all row filters and column list, the
> structure related to this
> +                        * table will be recorded multiple times.
> +                        */
> +                       foreach(lc, pub_elem_tables)
> +                       {
> +                               published_rel *table_info =
> (published_rel *) malloc(sizeof(published_rel));
> +
> +                               table_info->relid = lfirst_oid(lc);
> +                               table_info->pubid = pub_elem->oid;
> +                               table_infos = lappend(table_infos, table_info);
> +                       }
> +
> +                       tables = list_concat(tables, pub_elem_tables);
> 
> Thoughts?

I think we could only deduplicate published tables per publication to get all
row filters and column lists for each published table later.
I removed the temporary list 'tables' and modified the API of the function
filter_partitions to handle published_rel type list.

Attach the new patch set.

Regards,
Wang wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
vignesh C
Дата:
On Wed, 16 Nov 2022 at 14:28, wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> On Mon, Nov 14, 2022 at 0:56 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Attach new patches.
> >
>
> Thanks for your comments.
>
> > Here we are having tables list to store the relids and table_infos
> > list which stores pubid along with relid. Here tables list acts as a
> > temporary list to get filter_partitions and then delete the
> > published_rel from table_infos. Will it be possible to directly
> > operate on table_infos list and remove the temporary tables list used.
> > We might have to implement comparator, deduplication functions and
> > change filter_partitions function to work directly on published_rel
> > type list.
> > +                       /
> > +                        * Record the published table and the
> > corresponding publication so
> > +                        * that we can get row filters and column list later.
> > +                        *
> > +                        * When a table is published by multiple
> > publications, to obtain
> > +                        * all row filters and column list, the
> > structure related to this
> > +                        * table will be recorded multiple times.
> > +                        */
> > +                       foreach(lc, pub_elem_tables)
> > +                       {
> > +                               published_rel *table_info =
> > (published_rel *) malloc(sizeof(published_rel));
> > +
> > +                               table_info->relid = lfirst_oid(lc);
> > +                               table_info->pubid = pub_elem->oid;
> > +                               table_infos = lappend(table_infos, table_info);
> > +                       }
> > +
> > +                       tables = list_concat(tables, pub_elem_tables);
> >
> > Thoughts?
>
> I think we could only deduplicate published tables per publication to get all
> row filters and column lists for each published table later.
> I removed the temporary list 'tables' and modified the API of the function
> filter_partitions to handle published_rel type list.
>
> Attach the new patch set.

Thanks for the update patch.
One suggestion:
+/* Records association between publication and published table */
+typedef struct
+{
+       Oid                     relid;          /* OID of published table */
+       Oid                     pubid;          /* OID of publication
that publishes this
+                                                        * table. */
+} published_rel;
+

+                       /*
+                        * Record the published table and the
corresponding publication so
+                        * that we can get row filters and column list later.
+                        *
+                        * When a table is published by multiple
publications, to obtain
+                        * all row filters and column list, the
structure related to this
+                        * table will be recorded multiple times.
+                        */
+                       foreach(lc, pub_elem_tables)
+                       {
+                               published_rel *table_info =
(published_rel *) malloc(sizeof(published_rel));
+
+                               table_info->relid = lfirst_oid(lc);
+                               table_info->pubid = pub_elem->oid;
+                               table_infos = lappend(table_infos, table_info);
+                       }

In this format if there are n relations in publication we will store
pubid n times, in all tables publication there will many thousands of
tables. We could avoid storing the pubid for every relid, instead we
could  represent it like below to avoid storing publication id for
each tables:

+/* Records association between publication and published tables */
+typedef struct
+{
+       List    *relids,          /* OIDs of the publisher tables */
+       Oid         pubid;           /* OID of publication that publishes this
+                                     * tables. */
+}published_rel;

Thoughts?

Regards,
Vignesh



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thurs, Nov 17, 2022 at 13:58 PM vignesh C <vignesh21@gmail.com> wrote:
> On Wed, 16 Nov 2022 at 14:28, wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > On Mon, Nov 14, 2022 at 0:56 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Attach new patches.
> > >
> >
> > Thanks for your comments.
> >
> > > Here we are having tables list to store the relids and table_infos
> > > list which stores pubid along with relid. Here tables list acts as a
> > > temporary list to get filter_partitions and then delete the
> > > published_rel from table_infos. Will it be possible to directly
> > > operate on table_infos list and remove the temporary tables list used.
> > > We might have to implement comparator, deduplication functions and
> > > change filter_partitions function to work directly on published_rel
> > > type list.
> > > +                       /
> > > +                        * Record the published table and the
> > > corresponding publication so
> > > +                        * that we can get row filters and column list later.
> > > +                        *
> > > +                        * When a table is published by multiple
> > > publications, to obtain
> > > +                        * all row filters and column list, the
> > > structure related to this
> > > +                        * table will be recorded multiple times.
> > > +                        */
> > > +                       foreach(lc, pub_elem_tables)
> > > +                       {
> > > +                               published_rel *table_info =
> > > (published_rel *) malloc(sizeof(published_rel));
> > > +
> > > +                               table_info->relid = lfirst_oid(lc);
> > > +                               table_info->pubid = pub_elem->oid;
> > > +                               table_infos = lappend(table_infos, table_info);
> > > +                       }
> > > +
> > > +                       tables = list_concat(tables, pub_elem_tables);
> > >
> > > Thoughts?
> >
> > I think we could only deduplicate published tables per publication to get all
> > row filters and column lists for each published table later.
> > I removed the temporary list 'tables' and modified the API of the function
> > filter_partitions to handle published_rel type list.
> >
> > Attach the new patch set.
> 
> Thanks for the update patch.

Thanks for your comment.

> One suggestion:
> +/* Records association between publication and published table */
> +typedef struct
> +{
> +       Oid                     relid;          /* OID of published table */
> +       Oid                     pubid;          /* OID of publication
> that publishes this
> +                                                        * table. */
> +} published_rel;
> +
> 
> +                       /*
> +                        * Record the published table and the
> corresponding publication so
> +                        * that we can get row filters and column list later.
> +                        *
> +                        * When a table is published by multiple
> publications, to obtain
> +                        * all row filters and column list, the
> structure related to this
> +                        * table will be recorded multiple times.
> +                        */
> +                       foreach(lc, pub_elem_tables)
> +                       {
> +                               published_rel *table_info =
> (published_rel *) malloc(sizeof(published_rel));
> +
> +                               table_info->relid = lfirst_oid(lc);
> +                               table_info->pubid = pub_elem->oid;
> +                               table_infos = lappend(table_infos, table_info);
> +                       }
> 
> In this format if there are n relations in publication we will store
> pubid n times, in all tables publication there will many thousands of
> tables. We could avoid storing the pubid for every relid, instead we
> could  represent it like below to avoid storing publication id for
> each tables:
> 
> +/* Records association between publication and published tables */
> +typedef struct
> +{
> +       List    *relids,          /* OIDs of the publisher tables */
> +       Oid         pubid;           /* OID of publication that publishes this
> +                                     * tables. */
> +}published_rel;
> 
> Thoughts?

I think this complicates the function filter_partitions.
Because if we use such a node type, I think we need to concatenate 'relids'
list of each node of the 'table_infos' list in the function filter_partitions
to become a temporary list. Then filter this temporary list and process the
'table_infos' list according to the filtering result.

Regards,
Wang wei

Re: Data is copied twice when specifying both child and parent table in publication

От
Andres Freund
Дата:
Hi,

On 2022-11-16 08:58:31 +0000, wangw.fnst@fujitsu.com wrote:
> Attach the new patch set.

This patch causes several of the tests to fail. See e.g.:

https://cirrus-ci.com/task/6587624765259776

Most of the failures appear to be due to the main regression tests failing:
https://api.cirrus-ci.com/v1/artifact/task/6587624765259776/testrun/build/testrun/regress/regress/regression.diffs

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/publication.out
/tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/publication.out    2023-02-07 20:19:34.318018729 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out    2023-02-07 20:22:53.545223026 +0000
@@ -1657,7 +1657,7 @@
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename  | attnames | rowfilter 
 ---------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      | 
+ pub     | sch2       | tbl1_part1 |          | 
 (1 row)
 
 DROP PUBLICATION pub;




RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Wed, Feb 8, 2023 4:29 AM Andres Freund <andres@anarazel.de> wrote:
> Hi,
> 
> On 2022-11-16 08:58:31 +0000, wangw.fnst@fujitsu.com wrote:
> > Attach the new patch set.
> 
> This patch causes several of the tests to fail. See e.g.:
> 
> https://cirrus-ci.com/task/6587624765259776
> 
> Most of the failures appear to be due to the main regression tests failing:
> https://api.cirrus-
> ci.com/v1/artifact/task/6587624765259776/testrun/build/testrun/regress/regres
> s/regression.diffs
> 
> diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/publication.out
> /tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out
> --- /tmp/cirrus-ci-build/src/test/regress/expected/publication.out    2023-02-
> 07 20:19:34.318018729 +0000
> +++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out
>     2023-02-07 20:22:53.545223026 +0000
> @@ -1657,7 +1657,7 @@
>  SELECT * FROM pg_publication_tables;
>   pubname | schemaname | tablename  | attnames | rowfilter
>  ---------+------------+------------+----------+-----------
> - pub     | sch2       | tbl1_part1 | {a}      |
> + pub     | sch2       | tbl1_part1 |          |
>  (1 row)
> 
>  DROP PUBLICATION pub;

Thanks for your kind reminder and analysis.

I think this failure is caused by the recently commit (b7ae039) in the current
HEAD. Rebased the patch set and attach them.

Regards,
Wang Wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Feb 8, 2023 at 9:21 AM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> I think this failure is caused by the recently commit (b7ae039) in the current
> HEAD. Rebased the patch set and attach them.
>

+ if (server_version >= 160000)
+ {
+ appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
+ "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
+ "                FROM pg_attribute a\n"
+ "                WHERE a.attrelid = GPT.relid AND a.attnum > 0 AND\n"
+ "                      NOT a.attisdropped AND\n"
+ "                      (a.attnum = ANY(GPT.attrs) OR GPT.attrs IS NULL)\n"
+ "              ) AS attnames\n"
+ " FROM pg_class C\n"
+ "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
+ "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
array_agg(pubname::text))).*\n"
+ "          FROM pg_publication\n"
+ "          WHERE pubname IN ( %s )) as GPT\n"
+ "       ON GPT.relid = C.oid\n",
+ pub_names.data);

The function pg_get_publication_tables()  has already handled dropped
columns, so we don't need it here in this query. Also, the part to
build attnames should be the same as it is in view
pg_publication_tables. Can we directly try to pass the list of
pubnames to the function pg_get_publication_tables() instead of
joining it with pg_publication?

Can we keep the changes in the else part (fix when publisher < 16) the
same as HEAD and move the proposed change to a separate patch?
Basically, for the HEAD patch, let's just try to fix this when
publisher >=16. I am slightly worried that as this is a corner case
bug and we didn't see any user complaints for this, so introducing a
complex fix for back branches may not be required or at least we can
discuss that separately.

--
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thu, Mar 16, 2023 at 20:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

Thanks for your comments.

> + if (server_version >= 160000)
> + {
> + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> + "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
> + "                FROM pg_attribute a\n"
> + "                WHERE a.attrelid = GPT.relid AND a.attnum > 0 AND\n"
> + "                      NOT a.attisdropped AND\n"
> + "                      (a.attnum = ANY(GPT.attrs) OR GPT.attrs IS NULL)\n"
> + "              ) AS attnames\n"
> + " FROM pg_class C\n"
> + "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
> + "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
> array_agg(pubname::text))).*\n"
> + "          FROM pg_publication\n"
> + "          WHERE pubname IN ( %s )) as GPT\n"
> + "       ON GPT.relid = C.oid\n",
> + pub_names.data);
> 
> The function pg_get_publication_tables()  has already handled dropped
> columns, so we don't need it here in this query. Also, the part to
> build attnames should be the same as it is in view
> pg_publication_tables.

Agree. Changed.

> Can we directly try to pass the list of
> pubnames to the function pg_get_publication_tables() instead of
> joining it with pg_publication?

Changed.
I think the aim of joining it with pg_publication before is to exclude
non-existing publications. Otherwise, we would get an error because of the call
to function GetPublicationByName (with 'missing_ok = false') in function
pg_get_publication_tables. So, I changed "missing_ok" to true. If anyone doesn't
like this change, I'll reconsider this in the next version.

> Can we keep the changes in the else part (fix when publisher < 16) the
> same as HEAD and move the proposed change to a separate patch?
> Basically, for the HEAD patch, let's just try to fix this when
> publisher >=16. I am slightly worried that as this is a corner case
> bug and we didn't see any user complaints for this, so introducing a
> complex fix for back branches may not be required or at least we can
> discuss that separately.

Split the patch as suggested.

Attach the new patch set.

Regards,
Wang Wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Fri, Mar 17, 2023 at 11:58 AM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> On Thu, Mar 16, 2023 at 20:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> Thanks for your comments.
>
> > + if (server_version >= 160000)
> > + {
> > + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> > + "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
> > + "                FROM pg_attribute a\n"
> > + "                WHERE a.attrelid = GPT.relid AND a.attnum > 0 AND\n"
> > + "                      NOT a.attisdropped AND\n"
> > + "                      (a.attnum = ANY(GPT.attrs) OR GPT.attrs IS NULL)\n"
> > + "              ) AS attnames\n"
> > + " FROM pg_class C\n"
> > + "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
> > + "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
> > array_agg(pubname::text))).*\n"
> > + "          FROM pg_publication\n"
> > + "          WHERE pubname IN ( %s )) as GPT\n"
> > + "       ON GPT.relid = C.oid\n",
> > + pub_names.data);
> >
> > The function pg_get_publication_tables()  has already handled dropped
> > columns, so we don't need it here in this query. Also, the part to
> > build attnames should be the same as it is in view
> > pg_publication_tables.
>
> Agree. Changed.
>
> > Can we directly try to pass the list of
> > pubnames to the function pg_get_publication_tables() instead of
> > joining it with pg_publication?
>
> Changed.
> I think the aim of joining it with pg_publication before is to exclude
> non-existing publications.
>

Okay, A comment for that would have made it clear.

> Otherwise, we would get an error because of the call
> to function GetPublicationByName (with 'missing_ok = false') in function
> pg_get_publication_tables. So, I changed "missing_ok" to true. If anyone doesn't
> like this change, I'll reconsider this in the next version.
>

I am not sure about changing missing_ok behavior. Did you check it for
any other similar usage in other functions?

+ foreach(lc, pub_elem_tables)
+ {
+ published_rel *table_info = (published_rel *) malloc(sizeof(published_rel));

Is there a reason to use malloc instead of palloc?

--
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Jacob Champion
Дата:
On Thu, Mar 16, 2023 at 11:28 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
> Attach the new patch set.

Hi,

I ran into this problem while hacking on [1], so thank you for tackling
it! I have no strong opinions on the implementation itself; I just want
to register a concern that the tests have not kept up with the
implementation complexity.

For example, the corner case mentioned in 0003, with multiple
publications having conflicting pubviaroot settings, isn't tested as far
as I can see. (I checked manually, and it appears to work as intended.)
And the related pub_lower_level test currently only covers the case
where multiple publications have pubviaroot=true, so the following test
comment is now misleading:

> # for tab4, we publish changes through the "middle" partitioned table
> $node_publisher->safe_psql('postgres',
>     "CREATE PUBLICATION pub_lower_level FOR TABLE tab4_1 WITH (publish_via_partition_root = true)"
> );

...since the changes are now in fact published via the tab4 root after
this patchset is applied.

> I think the aim of joining it with pg_publication before is to exclude
> non-existing publications. Otherwise, we would get an error because of the call
> to function GetPublicationByName (with 'missing_ok = false') in function
> pg_get_publication_tables.

In the same vein, I don't think that case is covered anywhere.

There are a bunch of moving parts and hidden subtleties here, and I fell
into a few traps when I was working on my patch, so it'd be nice to have
additional coverage. I'm happy to contribute effort in that area if it's
helpful.

Thanks!
--Jacob

[1] https://postgr.es/m/dc57f088-039b-7a71-8f4c-082ef106246e%40timescale.com



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Sat, Mar 18, 2023 at 5:06 AM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Thu, Mar 16, 2023 at 11:28 PM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> > Attach the new patch set.
>
> Hi,
>
> I ran into this problem while hacking on [1], so thank you for tackling
> it! I have no strong opinions on the implementation itself; I just want
> to register a concern that the tests have not kept up with the
> implementation complexity.
>
> For example, the corner case mentioned in 0003, with multiple
> publications having conflicting pubviaroot settings, isn't tested as far
> as I can see. (I checked manually, and it appears to work as intended.)
> And the related pub_lower_level test currently only covers the case
> where multiple publications have pubviaroot=true, so the following test
> comment is now misleading:
>
> > # for tab4, we publish changes through the "middle" partitioned table
> > $node_publisher->safe_psql('postgres',
> >       "CREATE PUBLICATION pub_lower_level FOR TABLE tab4_1 WITH (publish_via_partition_root = true)"
> > );
>
> ...since the changes are now in fact published via the tab4 root after
> this patchset is applied.
>
> > I think the aim of joining it with pg_publication before is to exclude
> > non-existing publications. Otherwise, we would get an error because of the call
> > to function GetPublicationByName (with 'missing_ok = false') in function
> > pg_get_publication_tables.
>
> In the same vein, I don't think that case is covered anywhere.
>

We can have a test case to cover this scenario.

> There are a bunch of moving parts and hidden subtleties here, and I fell
> into a few traps when I was working on my patch, so it'd be nice to have
> additional coverage. I'm happy to contribute effort in that area if it's
> helpful.
>

I think it depends on what tests you have in mind. I suggest you can
propose a patch to cover tests for this are in a separate thread. We
can then evaluate those separately.

--
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, Mar 17, 2023 at 20:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Mar 17, 2023 at 11:58 AM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > On Thu, Mar 16, 2023 at 20:25 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > >
> >
> > Thanks for your comments.
> >
> > > + if (server_version >= 160000)
> > > + {
> > > + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> > > + "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
> > > + "                FROM pg_attribute a\n"
> > > + "                WHERE a.attrelid = GPT.relid AND a.attnum > 0 AND\n"
> > > + "                      NOT a.attisdropped AND\n"
> > > + "                      (a.attnum = ANY(GPT.attrs) OR GPT.attrs IS NULL)\n"
> > > + "              ) AS attnames\n"
> > > + " FROM pg_class C\n"
> > > + "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
> > > + "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
> > > array_agg(pubname::text))).*\n"
> > > + "          FROM pg_publication\n"
> > > + "          WHERE pubname IN ( %s )) as GPT\n"
> > > + "       ON GPT.relid = C.oid\n",
> > > + pub_names.data);
> > >
> > > The function pg_get_publication_tables()  has already handled dropped
> > > columns, so we don't need it here in this query. Also, the part to
> > > build attnames should be the same as it is in view
> > > pg_publication_tables.
> >
> > Agree. Changed.
> >
> > > Can we directly try to pass the list of
> > > pubnames to the function pg_get_publication_tables() instead of
> > > joining it with pg_publication?
> >
> > Changed.
> > I think the aim of joining it with pg_publication before is to exclude
> > non-existing publications.
> >
> 
> Okay, A comment for that would have made it clear.

Make sense. Added the comment atop the query.

> > Otherwise, we would get an error because of the call
> > to function GetPublicationByName (with 'missing_ok = false') in function
> > pg_get_publication_tables. So, I changed "missing_ok" to true. If anyone doesn't
> > like this change, I'll reconsider this in the next version.
> >
> 
> I am not sure about changing missing_ok behavior. Did you check it for
> any other similar usage in other functions?

After reviewing the pg_get_* functions in the 'pg_proc.dat' file, I think most
of them ignore incorrect input, such as the function pg_get_indexdef. However,
some functions, such as pg_get_serial_sequence and pg_get_object_address, will
report an error. So, I think it's better to discuss this in a separate thread.
Reverted this modification. And I will start a new separate thread for this
later.

> + foreach(lc, pub_elem_tables)
> + {
> + published_rel *table_info = (published_rel *) malloc(sizeof(published_rel));
> 
> Is there a reason to use malloc instead of palloc?

No. I think we need to use palloc here.
Changed.

Attach the new patch set.

Regards,
Wang Wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some review comments for v17-0001.

======
src/backend/catalog/pg_publication.c

1. filter_partitions

-static List *
-filter_partitions(List *relids)
+static void
+filter_partitions(List *table_infos)
 {
- List    *result = NIL;
  ListCell   *lc;
- ListCell   *lc2;

- foreach(lc, relids)
+ foreach(lc, table_infos)
  {
- bool skip = false;
- List    *ancestors = NIL;
- Oid relid = lfirst_oid(lc);
+ bool skip = false;
+ List    *ancestors = NIL;
+ ListCell    *lc2;
+ published_rel    *table_info = (published_rel *) lfirst(lc);

- if (get_rel_relispartition(relid))
- ancestors = get_partition_ancestors(relid);
+ if (get_rel_relispartition(table_info->relid))
+ ancestors = get_partition_ancestors(table_info->relid);

  foreach(lc2, ancestors)
  {
  Oid ancestor = lfirst_oid(lc2);
+ ListCell   *lc3;

  /* Check if the parent table exists in the published table list. */
- if (list_member_oid(relids, ancestor))
+ foreach(lc3, table_infos)
  {
- skip = true;
- break;
+ Oid relid = ((published_rel *) lfirst(lc3))->relid;
+
+ if (relid == ancestor)
+ {
+ skip = true;
+ break;
+ }
  }
+
+ if (skip)
+ break;
  }

- if (!skip)
- result = lappend_oid(result, relid);
+ if (skip)
+ table_infos = foreach_delete_current(table_infos, lc);
  }
-
- return result;
 }


It seems the 'skip' and 'ancestors' and 'lc2' vars are not needed
except when "if (get_rel_relispartition(table_info->relid))" is true,
so won't it be better to restructure the code to put everything inside
that condition. Then you will save a few unnecessary tests of
foreach(lc2, ancestors) and (skip).

For example,

static void
filter_partitions(List *table_infos)
{
ListCell   *lc;

foreach(lc, table_infos)
{
published_rel    *table_info = (published_rel *) lfirst(lc);

if (get_rel_relispartition(table_info->relid))
{
bool skip = false;
List *ancestors = get_partition_ancestors(table_info->relid);
ListCell *lc2;

foreach(lc2, ancestors)
{
Oid ancestor = lfirst_oid(lc2);
ListCell   *lc3;
/* Check if the parent table exists in the published table list. */
foreach(lc3, table_infos)
{
Oid relid = ((published_rel *) lfirst(lc3))->relid;

if (relid == ancestor)
{
skip = true;
break;
}
}
if (skip)
break;
}

if (skip)
table_infos = foreach_delete_current(table_infos, lc);
}
}
}

~~~

2. pg_get_publication_tables

+ else
+ {
+ List    *relids,
+    *schemarelids;
+
+ relids = GetPublicationRelations(pub_elem->oid,
+ pub_elem->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ pub_elem->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
+ }

2a.
Maybe 'schema_relids' would be a better name than 'schemareliids'?

~

2b.
By introducing another variable maybe you could remove some of this
duplicated code.

PublicationPartOpt root_or_leaf = pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT : PUBLICATION_PART_LEAF;

~~~

3. pg_get_publication_tables

  /* Show all columns when the column list is not specified. */
- if (nulls[1] == true)
+ if (nulls[2] == true)

Since you are changing this line anyway, you might as well change it
to remove the redundant "== true" part.

SUGGESTION
if (nulls[2])

======
src/include/catalog/pg_proc.dat

4.
+{ oid => '6119',
+  descr => 'get information of the tables in the given publication array',

Should that be worded in a way to make it more clear that the
"publication array" is really an "array of publication names"?

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Mar 20, 2023 at 1:02 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
>
> 2. pg_get_publication_tables
>
> + else
> + {
> + List    *relids,
> +    *schemarelids;
> +
> + relids = GetPublicationRelations(pub_elem->oid,
> + pub_elem->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
> + pub_elem->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
> + }
>
> 2a.
> Maybe 'schema_relids' would be a better name than 'schemareliids'?
>
> ~
>
> 2b.
> By introducing another variable maybe you could remove some of this
> duplicated code.
>
> PublicationPartOpt root_or_leaf = pub_elem->pubviaroot ?
> PUBLICATION_PART_ROOT : PUBLICATION_PART_LEAF;
>

IIUC, 2b is an existing code, so I would prefer not to change that as
part of this patch. Similarly, for other comments, unless something is
a very clear improvement and makes difference w.r.t this patch, it
makes sense to change that, otherwise, let's focus on the current
issue.

--
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Mar 20, 2023 at 1:02 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
>
> ======
> src/include/catalog/pg_proc.dat
>
> 4.
> +{ oid => '6119',
> +  descr => 'get information of the tables in the given publication array',
>
> Should that be worded in a way to make it more clear that the
> "publication array" is really an "array of publication names"?
>

I don't know how important it is to tell that the array is an array of
publication names but the current description can be improved. How
about something like: "get information of the tables that are part of
the specified publications"

Few other comments:
=================
1.
  foreach(lc2, ancestors)
  {
  Oid ancestor = lfirst_oid(lc2);
+ ListCell   *lc3;

  /* Check if the parent table exists in the published table list. */
- if (list_member_oid(relids, ancestor))
+ foreach(lc3, table_infos)
  {
- skip = true;
- break;
+ Oid relid = ((published_rel *) lfirst(lc3))->relid;
+
+ if (relid == ancestor)
+ {
+ skip = true;
+ break;
+ }
  }
+
+ if (skip)
+ break;
  }

- if (!skip)
- result = lappend_oid(result, relid);
+ if (skip)
+ table_infos = foreach_delete_current(table_infos, lc);

The usage of skip looks a bit ugly to me. Can we move the code for the
inner loop to a separate function (like
is_ancestor_member_tableinfos()) and remove the current cell if it
returns true?

2.
  * Filter out the partitions whose parent tables were also specified in
  * the publication.
  */
-static List *
-filter_partitions(List *relids)
+static void
+filter_partitions(List *table_infos)

The comment atop filter_partitions is no longer valid. Can we slightly
change it to: "Filter out the partitions whose parent tables are also
present in the list."?

3.
-# Note: We create two separate tables, not a partitioned one, so that we can
-# easily identity through which relation were the changes replicated.
+# Note: We only create one table (tab4) here. We specified
+# publish_via_partition_root = true (see pub_all and pub_lower_level above), so
+# all data will be replicated to that table.
 $node_subscriber2->safe_psql('postgres',
  "CREATE TABLE tab4 (a int PRIMARY KEY)");
-$node_subscriber2->safe_psql('postgres',
- "CREATE TABLE tab4_1 (a int PRIMARY KEY)");

I am not sure if it is a good idea to remove tab4_1 here. It is
testing something different as mentioned in the comments. Also, I
don't see any data in tab4 for the initial sync, so not sure if this
tests the behavior changed by this patch.

4.
--- a/src/test/subscription/t/031_column_list.pl
+++ b/src/test/subscription/t/031_column_list.pl
@@ -959,7 +959,8 @@ $node_publisher->safe_psql(
  CREATE TABLE test_root_1 PARTITION OF test_root FOR VALUES FROM (1) TO (10);
  CREATE TABLE test_root_2 PARTITION OF test_root FOR VALUES FROM (10) TO (20);

- CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH
(publish_via_partition_root = true);
+ CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH
(publish_via_partition_root = true);
+ CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
(publish_via_partition_root = true);

  -- initial data
  INSERT INTO test_root VALUES (1, 2, 3);
@@ -968,7 +969,7 @@ $node_publisher->safe_psql(

 $node_subscriber->safe_psql(
  'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
pub_root_true;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
pub_root_true_1, pub_root_true_2;

It is not clear to me what exactly you want to test here. Please add
some comments.

5. I think you can merge the 0001 and 0003 patches.

Apart from the above, attached is a patch to change some of the
comments in the patch.

--
With Regards,
Amit Kapila.

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Wang,

I have tested about multilevel partitions, and it worked well.
Followings are my comments for v18-0001.

01. pg_get_publication_tables

```
+               ListCell   *lc;
```

This definition can be inside of the "for (i = 0; i < nelems; i++)".

02. pg_get_publication_tables

```
-                        * If the publication publishes partition changes via their
-                        * respective root partitioned tables, we must exclude partitions
-                        * in favor of including the root partitioned tables. Otherwise,
-                        * the function could return both the child and parent tables
-                        * which could cause data of the child table to be
-                        * double-published on the subscriber side.
+                        * Publications support partitioned tables. If
+                        * publish_via_partition_root is false, all changes are replicated
+                        * using leaf partition identity and schema, so we only need those.
+                        * Otherwise, get the partitioned table itself.
```

The comments can be  inside of the "else".

03. pg_get_publication_tables

```
+               pfree(elems);
```

Only elems is pfree()'d here, but how about other variable like pub_elem and pub_elem_tables?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Data is copied twice when specifying both child and parent table in publication

От
Jacob Champion
Дата:
On Fri, Mar 17, 2023 at 9:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > There are a bunch of moving parts and hidden subtleties here, and I fell
> > into a few traps when I was working on my patch, so it'd be nice to have
> > additional coverage. I'm happy to contribute effort in that area if it's
> > helpful.
>
> I think it depends on what tests you have in mind.

Just the ones I mentioned, to start with.

> I suggest you can
> propose a patch to cover tests for this are in a separate thread. We
> can then evaluate those separately.

To confirm -- you want me to start a new thread for tests for this
patchset? (Tests written against HEAD would likely be obsoleted by
this change.)

--Jacob



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Mar 20, 2023 at 11:22 PM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Fri, Mar 17, 2023 at 9:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > There are a bunch of moving parts and hidden subtleties here, and I fell
> > > into a few traps when I was working on my patch, so it'd be nice to have
> > > additional coverage. I'm happy to contribute effort in that area if it's
> > > helpful.
> >
> > I think it depends on what tests you have in mind.
>
> Just the ones I mentioned, to start with.
>
> > I suggest you can
> > propose a patch to cover tests for this are in a separate thread. We
> > can then evaluate those separately.
>
> To confirm -- you want me to start a new thread for tests for this
> patchset? (Tests written against HEAD would likely be obsoleted by
> this change.)
>

If the tests you have in mind are only related to this patch set then
feel free to propose them here if you feel the current ones are not
sufficient. I just want to be cautious that we shouldn't spend too
much time adding additional tests which are related to the base
functionality as we have left with less time for the last CF and I
would like to push the change for HEAD before that.

--
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Mon, Mar 20, 2023 at 18:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

Thanks for your comments.

> On Mon, Mar 20, 2023 at 1:02 PM Peter Smith <smithpb2250@gmail.com>
> wrote:
> >
> >
> > ======
> > src/include/catalog/pg_proc.dat
> >
> > 4.
> > +{ oid => '6119',
> > +  descr => 'get information of the tables in the given publication array',
> >
> > Should that be worded in a way to make it more clear that the
> > "publication array" is really an "array of publication names"?
> >
> 
> I don't know how important it is to tell that the array is an array of
> publication names but the current description can be improved. How
> about something like: "get information of the tables that are part of
> the specified publications"

Changed.

> Few other comments:
> =================
> 1.
>   foreach(lc2, ancestors)
>   {
>   Oid ancestor = lfirst_oid(lc2);
> + ListCell   *lc3;
> 
>   /* Check if the parent table exists in the published table list. */
> - if (list_member_oid(relids, ancestor))
> + foreach(lc3, table_infos)
>   {
> - skip = true;
> - break;
> + Oid relid = ((published_rel *) lfirst(lc3))->relid;
> +
> + if (relid == ancestor)
> + {
> + skip = true;
> + break;
> + }
>   }
> +
> + if (skip)
> + break;
>   }
> 
> - if (!skip)
> - result = lappend_oid(result, relid);
> + if (skip)
> + table_infos = foreach_delete_current(table_infos, lc);
> 
> The usage of skip looks a bit ugly to me. Can we move the code for the
> inner loop to a separate function (like
> is_ancestor_member_tableinfos()) and remove the current cell if it
> returns true?

Changed.

> 2.
>   * Filter out the partitions whose parent tables were also specified in
>   * the publication.
>   */
> -static List *
> -filter_partitions(List *relids)
> +static void
> +filter_partitions(List *table_infos)
> 
> The comment atop filter_partitions is no longer valid. Can we slightly
> change it to: "Filter out the partitions whose parent tables are also
> present in the list."?

Changed.

> 3.
> -# Note: We create two separate tables, not a partitioned one, so that we can
> -# easily identity through which relation were the changes replicated.
> +# Note: We only create one table (tab4) here. We specified
> +# publish_via_partition_root = true (see pub_all and pub_lower_level above), so
> +# all data will be replicated to that table.
>  $node_subscriber2->safe_psql('postgres',
>   "CREATE TABLE tab4 (a int PRIMARY KEY)");
> -$node_subscriber2->safe_psql('postgres',
> - "CREATE TABLE tab4_1 (a int PRIMARY KEY)");
> 
> I am not sure if it is a good idea to remove tab4_1 here. It is
> testing something different as mentioned in the comments. Also, I
> don't see any data in tab4 for the initial sync, so not sure if this
> tests the behavior changed by this patch.

Reverted this change. And inserted the initial sync data into table tab4 to test
this more clearly.

> 4.
> --- a/src/test/subscription/t/031_column_list.pl
> +++ b/src/test/subscription/t/031_column_list.pl
> @@ -959,7 +959,8 @@ $node_publisher->safe_psql(
>   CREATE TABLE test_root_1 PARTITION OF test_root FOR VALUES FROM (1) TO
> (10);
>   CREATE TABLE test_root_2 PARTITION OF test_root FOR VALUES FROM (10) TO
> (20);
> 
> - CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH
> (publish_via_partition_root = true);
> + CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH
> (publish_via_partition_root = true);
> + CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
> (publish_via_partition_root = true);
> 
>   -- initial data
>   INSERT INTO test_root VALUES (1, 2, 3);
> @@ -968,7 +969,7 @@ $node_publisher->safe_psql(
> 
>  $node_subscriber->safe_psql(
>   'postgres', qq(
> - CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
> pub_root_true;
> + CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
> pub_root_true_1, pub_root_true_2;
> 
> It is not clear to me what exactly you want to test here. Please add
> some comments.

Tried to add the following comment to make it clear:
```
+# Subscribe to pub_root_true_1 and pub_root_true_2 at the same time, which
+# means that the initial data will be synced once, and only the column list of
+# the parent table (test_root) in the publication pub_root_true_1 will be used
+# for both table sync and data replication.
 $node_subscriber->safe_psql(
     'postgres', qq(
-    CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_root_true;
+    CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_root_true_1, pub_root_true_2;
```

> 5. I think you can merge the 0001 and 0003 patches.

Merged.

> Apart from the above, attached is a patch to change some of the
> comments in the patch.

Thanks for this improvement. I've checked and merged it.

Attach the new patch set.

Regards,
Wang Wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Sat, Mar 18, 2023 at 7:37 AM Jacob Champion <jchampion@timescale.com> wrote:
> On Thu, Mar 16, 2023 at 11:28 PM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> > Attach the new patch set.

Thanks for your comments and testing.

> For example, the corner case mentioned in 0003, with multiple
> publications having conflicting pubviaroot settings, isn't tested as far
> as I can see. (I checked manually, and it appears to work as intended.)
> And the related pub_lower_level test currently only covers the case
> where multiple publications have pubviaroot=true, so the following test
> comment is now misleading:
> 
> > # for tab4, we publish changes through the "middle" partitioned table
> > $node_publisher->safe_psql('postgres',
> >     "CREATE PUBLICATION pub_lower_level FOR TABLE tab4_1 WITH
> (publish_via_partition_root = true)"
> > );
> 
> ...since the changes are now in fact published via the tab4 root after
> this patchset is applied.

Make sense.
Tried to improve this comment like below:
```
If we subscribe only to pub_lower_level, changes for tab4 will be published
through the "middle" partition table. However, since we will be subscribing to
both pub_lower_level and pub_all (see subscription sub2 below), we will publish
changes via the root table (tab4).
```

Regards,
Wang Wei

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Mon, Mar 20, 2023 at 15:32 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are some review comments for v17-0001.

Thanks for your comments.

> ======
> src/backend/catalog/pg_publication.c
> 
> 1. filter_partitions
> 
> -static List *
> -filter_partitions(List *relids)
> +static void
> +filter_partitions(List *table_infos)
>  {
> - List    *result = NIL;
>   ListCell   *lc;
> - ListCell   *lc2;
> 
> - foreach(lc, relids)
> + foreach(lc, table_infos)
>   {
> - bool skip = false;
> - List    *ancestors = NIL;
> - Oid relid = lfirst_oid(lc);
> + bool skip = false;
> + List    *ancestors = NIL;
> + ListCell    *lc2;
> + published_rel    *table_info = (published_rel *) lfirst(lc);
> 
> - if (get_rel_relispartition(relid))
> - ancestors = get_partition_ancestors(relid);
> + if (get_rel_relispartition(table_info->relid))
> + ancestors = get_partition_ancestors(table_info->relid);
> 
>   foreach(lc2, ancestors)
>   {
>   Oid ancestor = lfirst_oid(lc2);
> + ListCell   *lc3;
> 
>   /* Check if the parent table exists in the published table list. */
> - if (list_member_oid(relids, ancestor))
> + foreach(lc3, table_infos)
>   {
> - skip = true;
> - break;
> + Oid relid = ((published_rel *) lfirst(lc3))->relid;
> +
> + if (relid == ancestor)
> + {
> + skip = true;
> + break;
> + }
>   }
> +
> + if (skip)
> + break;
>   }
> 
> - if (!skip)
> - result = lappend_oid(result, relid);
> + if (skip)
> + table_infos = foreach_delete_current(table_infos, lc);
>   }
> -
> - return result;
>  }
> 
> 
> It seems the 'skip' and 'ancestors' and 'lc2' vars are not needed
> except when "if (get_rel_relispartition(table_info->relid))" is true,
> so won't it be better to restructure the code to put everything inside
> that condition. Then you will save a few unnecessary tests of
> foreach(lc2, ancestors) and (skip).
> 
> For example,
> 
> static void
> filter_partitions(List *table_infos)
> {
> ListCell   *lc;
> 
> foreach(lc, table_infos)
> {
> published_rel    *table_info = (published_rel *) lfirst(lc);
> 
> if (get_rel_relispartition(table_info->relid))
> {
> bool skip = false;
> List *ancestors = get_partition_ancestors(table_info->relid);
> ListCell *lc2;
> 
> foreach(lc2, ancestors)
> {
> Oid ancestor = lfirst_oid(lc2);
> ListCell   *lc3;
> /* Check if the parent table exists in the published table list. */
> foreach(lc3, table_infos)
> {
> Oid relid = ((published_rel *) lfirst(lc3))->relid;
> 
> if (relid == ancestor)
> {
> skip = true;
> break;
> }
> }
> if (skip)
> break;
> }
> 
> if (skip)
> table_infos = foreach_delete_current(table_infos, lc);
> }
> }
> }

Refactored this part of code based on other comments.

> ~~~
> 
> 3. pg_get_publication_tables
> 
>   /* Show all columns when the column list is not specified. */
> - if (nulls[1] == true)
> + if (nulls[2] == true)
> 
> Since you are changing this line anyway, you might as well change it
> to remove the redundant "== true" part.
> 
> SUGGESTION
> if (nulls[2])

Changed.

Regards,
Wang Wei

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Mon, Mar 20, 2023 at 21:18 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> Dear Wang,
> 
> I have tested about multilevel partitions, and it worked well.
> Followings are my comments for v18-0001.

Thanks for your comments and testing.

> 01. pg_get_publication_tables
> 
> ```
> +               ListCell   *lc;
> ```
> 
> This definition can be inside of the "for (i = 0; i < nelems; i++)".

Changed.

> 02. pg_get_publication_tables
> 
> ```
> -                        * If the publication publishes partition changes via their
> -                        * respective root partitioned tables, we must exclude partitions
> -                        * in favor of including the root partitioned tables. Otherwise,
> -                        * the function could return both the child and parent tables
> -                        * which could cause data of the child table to be
> -                        * double-published on the subscriber side.
> +                        * Publications support partitioned tables. If
> +                        * publish_via_partition_root is false, all changes are replicated
> +                        * using leaf partition identity and schema, so we only need those.
> +                        * Otherwise, get the partitioned table itself.
> ```
> 
> The comments can be  inside of the "else".

Since I think there are related operations in the function
GetAllTablesPublicationRelations, it might be better to write it above the
if-statement.

> 03. pg_get_publication_tables
> 
> ```
> +               pfree(elems);
> ```
> 
> Only elems is pfree()'d here, but how about other variable like pub_elem and
> pub_elem_tables?

Added releases to these two variables.

Regards,
Wang Wei

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some review comments for patch code of HEAD_v19-0001

======
doc/src/sgml/ref/create_publication.sgml

1.
+         <para>
+          There can be a case where a subscription combines multiple
+          publications. If a root partitioned table is published by any
+          subscribed publications which set
+          <literal>publish_via_partition_root</literal> = true, changes on this
+          root partitioned table (or on its partitions) will be published using
+          the identity and schema of this root partitioned table rather than
+          that of the individual partitions.
+         </para>

1a.
The paragraph prior to this one just refers to "partitioned tables"
instead of "root partitioned table", so IMO we should continue with
the same terminology.

I also modified the remaining text slightly. AFAIK my suggestion
conveys exactly the same information but is shorter.

SUGGESTION
There can be a case where one subscription combines multiple
publications. If any of those publications has set
<literal>publish_via_partition_root</literal> = true, then changes in
a partitioned table (or on its partitions) will be published using the
identity and schema of the partitioned table.

~

1b.
Shouldn't that paragraph (or possibly somewhere in the CREATE
SUBSCRIPTION notes?) also explain that in this scenario the logical
replication will only publish one set of changes? After all, this is
the whole point of the patch, but I am not sure if the user will know
of this behaviour from the current documentation.

======
src/backend/catalog/pg_publication.c

2. filter_partitions

BEFORE:
static void
filter_partitions(List *table_infos)
{
ListCell   *lc;

foreach(lc, table_infos)
{
bool skip = false;
List    *ancestors = NIL;
ListCell    *lc2;
published_rel    *table_info = (published_rel *) lfirst(lc);

if (get_rel_relispartition(table_info->relid))
ancestors = get_partition_ancestors(table_info->relid);

foreach(lc2, ancestors)
{
Oid ancestor = lfirst_oid(lc2);

/* Is ancestor exists in the published table list? */
if (is_ancestor_member_tableinfos(ancestor, table_infos))
{
skip = true;
break;
}
}

if (skip)
table_infos = foreach_delete_current(table_infos, lc);
}
}

~

2a.
My previous review [1] (see #1) suggested putting most code within the
condition. AFAICT my comment still is applicable but was not yet
addressed.

2b.
IMO the comment "/* Is ancestor exists in the published table list?
*/" is unnecessary because it is already clear what is the purpose of
the function named "is_ancestor_member_tableinfos".


SUGGESTION
static void
filter_partitions(List *table_infos)
{
    ListCell   *lc;

    foreach(lc, table_infos)
    {
        if (get_rel_relispartition(table_info->relid))
        {
            bool skip = false;
            ListCell    *lc2;
            published_rel    *table_info = (published_rel *) lfirst(lc);
            List *ancestors = get_partition_ancestors(table_info->relid);

            foreach(lc2, ancestors)
            {
                Oid ancestor = lfirst_oid(lc2);

                if (is_ancestor_member_tableinfos(ancestor, table_infos))
                {
                    skip = true;
                    break;
                }
            }

            if (skip)
                table_infos = foreach_delete_current(table_infos, lc);
        }
    }
}

======
src/backend/commands/subscriptioncmds.c

3.
 fetch_table_list(WalReceiverConn *wrconn, List *publications)
 {
  WalRcvExecResult *res;
- StringInfoData cmd;
+ StringInfoData cmd,
+ pub_names;
  TupleTableSlot *slot;
  Oid tableRow[3] = {TEXTOID, TEXTOID, NAMEARRAYOID};
  List    *tablelist = NIL;
- bool check_columnlist = (walrcv_server_version(wrconn) >= 150000);
+ int server_version = walrcv_server_version(wrconn);
+ bool check_columnlist = (server_version >= 150000);
+
+ initStringInfo(&pub_names);
+ get_publications_str(publications, &pub_names, true);

  initStringInfo(&cmd);
- appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename \n");

- /* Get column lists for each relation if the publisher supports it */
- if (check_columnlist)
- appendStringInfoString(&cmd, ", t.attnames\n");
+ /* Get the list of tables from the publisher. */
+ if (server_version >= 160000)
+ {

~

I think the 'pub_names' is only needed within that ">= 160000" condition.

So all the below code can be moved into that scope can't it?

+ StringInfoData pub_names;
+ initStringInfo(&pub_names);
+ get_publications_str(publications, &pub_names, true);

+ pfree(pub_names.data);

------
[1] https://www.postgresql.org/message-id/CAHut%2BPuNsvO9o9XzeJuSLsAsndgCKVphDPBRqYuOTy2bR28E%2Bg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Data is copied twice when specifying both child and parent table in publication

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Wang,

Thank you for updating patch! Following are comments form v19-0001.

01. logical-replication.sgml

I found a following statement in logical-replication.sgml. I think this may cause
mis-reading because it's OK when publishers list partitions and publish_via_root is true.

```
  <para>
   A subscriber node may have multiple subscriptions if desired.  It is
   possible to define multiple subscriptions between a single
   publisher-subscriber pair, in which case care must be taken to ensure
   that the subscribed publication objects don't overlap.
  </para>
```

How about adding "If publications are set publish_via_partition_root as true and
they publish partitions that have same partitioned table, only a change to partitioned
table is published from the publisher."or something like that?


02. filter_partitions

IIUC this function can refactor like following to avoid "skip" flag.
How do you think?

```
@@ -209,7 +209,6 @@ filter_partitions(List *table_infos)
 
        foreach(lc, table_infos)
        {
-               bool                            skip = false;
                List                       *ancestors = NIL;
                ListCell                   *lc2;
                published_rel      *table_info = (published_rel *) lfirst(lc);
@@ -224,13 +223,10 @@ filter_partitions(List *table_infos)
                        /* Is ancestor exists in the published table list? */
                        if (is_ancestor_member_tableinfos(ancestor, table_infos))
                        {
-                               skip = true;
+                               table_infos = foreach_delete_current(table_infos, lc);
                                break;
                        }
                }
-
-               if (skip)
-                       table_infos = foreach_delete_current(table_infos, lc);
        }
 }
```

03. fetch_table_list

```
+       /* Get the list of tables from the publisher. */
+       if (server_version >= 160000)
```

I think boolean variable can be used to check it like check_columnlist.
How about "use_extended_function" or something?


Best Regards,
Hayato Kuroda
FUJITSU LIMITED


RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Wed, Mar 22, 2023 at 12:50 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are some review comments for patch code of HEAD_v19-0001

Thanks for your comments.

> ======
> doc/src/sgml/ref/create_publication.sgml
> 
> 1.
> +         <para>
> +          There can be a case where a subscription combines multiple
> +          publications. If a root partitioned table is published by any
> +          subscribed publications which set
> +          <literal>publish_via_partition_root</literal> = true, changes on this
> +          root partitioned table (or on its partitions) will be published using
> +          the identity and schema of this root partitioned table rather than
> +          that of the individual partitions.
> +         </para>
> 
> 1a.
> The paragraph prior to this one just refers to "partitioned tables"
> instead of "root partitioned table", so IMO we should continue with
> the same terminology.

Changed.

> I also modified the remaining text slightly. AFAIK my suggestion
> conveys exactly the same information but is shorter.
> 
> SUGGESTION
> There can be a case where one subscription combines multiple
> publications. If any of those publications has set
> <literal>publish_via_partition_root</literal> = true, then changes in
> a partitioned table (or on its partitions) will be published using the
> identity and schema of the partitioned table.

Sorry, I'm not sure about this.
I'm not a native speaker of English, but it seems like the following use case is
not explained:
```
create table t1 (a int primary key) partition by range (a);
create table t2 (a int primary key) partition by range (a);
create table t3 (a int primary key);
alter table t1 attach partition t2 default;
alter table t2 attach partition t3 default;

create publication p1 for table t1;
create publication p2_via for table t2 with(publish_via_partition_root);
create publication p3 for table t3;
```
If we subscribe to p1, p2_via and p3 at the same time, then t2's identity and
schema will be used instead of t1's (and of course not t3's).

> ~
> 
> 1b.
> Shouldn't that paragraph (or possibly somewhere in the CREATE
> SUBSCRIPTION notes?) also explain that in this scenario the logical
> replication will only publish one set of changes? After all, this is
> the whole point of the patch, but I am not sure if the user will know
> of this behaviour from the current documentation.

It seems to me that what you're explaining is what users expect. So, it seems we
don't need to explain it.
BTW IIUC, when user wants to use the "via_root" option, they should first read
the pg-doc to confirm the meaning and related notes of this option. So, I'm not
sure if adding this section in other documentation would be redundant.

> ======
> src/backend/catalog/pg_publication.c
> 
> 2. filter_partitions
> 
> BEFORE:
> static void
> filter_partitions(List *table_infos)
> {
> ListCell   *lc;
> 
> foreach(lc, table_infos)
> {
> bool skip = false;
> List    *ancestors = NIL;
> ListCell    *lc2;
> published_rel    *table_info = (published_rel *) lfirst(lc);
> 
> if (get_rel_relispartition(table_info->relid))
> ancestors = get_partition_ancestors(table_info->relid);
> 
> foreach(lc2, ancestors)
> {
> Oid ancestor = lfirst_oid(lc2);
> 
> /* Is ancestor exists in the published table list? */
> if (is_ancestor_member_tableinfos(ancestor, table_infos))
> {
> skip = true;
> break;
> }
> }
> 
> if (skip)
> table_infos = foreach_delete_current(table_infos, lc);
> }
> }
> 
> ~
> 
> 2a.
> My previous review [1] (see #1) suggested putting most code within the
> condition. AFAICT my comment still is applicable but was not yet
> addressed.

Personally, I prefer the current style because the approach you mentioned adds
some indentations.

> 2b.
> IMO the comment "/* Is ancestor exists in the published table list?
> */" is unnecessary because it is already clear what is the purpose of
> the function named "is_ancestor_member_tableinfos".

Removed.

> ======
> src/backend/commands/subscriptioncmds.c
> 
> 3.
>  fetch_table_list(WalReceiverConn *wrconn, List *publications)
>  {
>   WalRcvExecResult *res;
> - StringInfoData cmd;
> + StringInfoData cmd,
> + pub_names;
>   TupleTableSlot *slot;
>   Oid tableRow[3] = {TEXTOID, TEXTOID, NAMEARRAYOID};
>   List    *tablelist = NIL;
> - bool check_columnlist = (walrcv_server_version(wrconn) >= 150000);
> + int server_version = walrcv_server_version(wrconn);
> + bool check_columnlist = (server_version >= 150000);
> +
> + initStringInfo(&pub_names);
> + get_publications_str(publications, &pub_names, true);
> 
>   initStringInfo(&cmd);
> - appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename
> \n");
> 
> - /* Get column lists for each relation if the publisher supports it */
> - if (check_columnlist)
> - appendStringInfoString(&cmd, ", t.attnames\n");
> + /* Get the list of tables from the publisher. */
> + if (server_version >= 160000)
> + {
> 
> ~
> 
> I think the 'pub_names' is only needed within that ">= 160000" condition.
> 
> So all the below code can be moved into that scope can't it?
> 
> + StringInfoData pub_names;
> + initStringInfo(&pub_names);
> + get_publications_str(publications, &pub_names, true);
> 
> + pfree(pub_names.data);

Changed.

Also, I've run pgindent for the new patch set.
Attach the new patch set.

Regards,
Wang Wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Wed, Mar 22, 2023 at 14:32 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> Dear Wang,
> 
> Thank you for updating patch! Following are comments form v19-0001.

Thanks for your comments.

> 01. logical-replication.sgml
> 
> I found a following statement in logical-replication.sgml. I think this may cause
> mis-reading because it's OK when publishers list partitions and publish_via_root
> is true.
> 
> ```
>   <para>
>    A subscriber node may have multiple subscriptions if desired.  It is
>    possible to define multiple subscriptions between a single
>    publisher-subscriber pair, in which case care must be taken to ensure
>    that the subscribed publication objects don't overlap.
>   </para>
> ```
> 
> How about adding "If publications are set publish_via_partition_root as true and
> they publish partitions that have same partitioned table, only a change to
> partitioned
> table is published from the publisher."or something like that?

I think these seem to be two different scenarios: The scenario mentioned here is
multiple subscriptions at the subscription node, while the scenario we fixed
this time is a single subscription at the subscription node. So, it seems that
these two notes are not strongly related.

> 02. filter_partitions
> 
> IIUC this function can refactor like following to avoid "skip" flag.
> How do you think?
> 
> ```
> @@ -209,7 +209,6 @@ filter_partitions(List *table_infos)
> 
>         foreach(lc, table_infos)
>         {
> -               bool                            skip = false;
>                 List                       *ancestors = NIL;
>                 ListCell                   *lc2;
>                 published_rel      *table_info = (published_rel *) lfirst(lc);
> @@ -224,13 +223,10 @@ filter_partitions(List *table_infos)
>                         /* Is ancestor exists in the published table list? */
>                         if (is_ancestor_member_tableinfos(ancestor, table_infos))
>                         {
> -                               skip = true;
> +                               table_infos = foreach_delete_current(table_infos, lc);
>                                 break;
>                         }
>                 }
> -
> -               if (skip)
> -                       table_infos = foreach_delete_current(table_infos, lc);
>         }
>  }
> ```

I think this approach deletes the cell of the list of the outer loop in the
inner loop. IIUC, we can only use function foreach_delete_current in the current
loop to delete the cell of the current loop.

> 03. fetch_table_list
> 
> ```
> +       /* Get the list of tables from the publisher. */
> +       if (server_version >= 160000)
> ```
> 
> I think boolean variable can be used to check it like check_columnlist.
> How about "use_extended_function" or something?

Since we only need it once, I think it's fine not to add a new variable.

Regards,
Wang Wei

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some review comments for patch v20-0001.

======
General.

1.
That function 'pg_get_publication_tables' does not seem to be
described in the PG documentation. Why isn't it in the "System Catalog
Information Functions" table [1] ?

I asked this same question a long time ago but then the reply [2] was
like "it doesn't seem to be a function provided to users".

Well, perhaps that just means that the documentation has been
accidentally missing for a long time. Does anybody know for sure if
the omission of this function from the documentation is deliberate? If
nobody here knows, then maybe this can be asked/addressed in a
separate thread.

======
src/backend/catalog/pg_publication.c

2. filter_partitions

(review comment from my v19 review)

> 2a.
> My previous review [1] (see #1) suggested putting most code within the
> condition. AFAICT my comment still is applicable but was not yet
> addressed.

22/3 Wang-san replied: "Personally, I prefer the current style because
the approach you mentioned adds some indentations."

Sure, but there is more than just indentation/style differences here.
Currently, there is some unnecessary code executed if the table is not
a partition. And the reader cannot tell at-a-glance if (skip) will be
true/false without looking more closely at the loop logic. So, I think
changing it would be better, but anyway I won’t debate about it any
more because it's not a functional problem.

======
src/backend/commands/subscriptioncmds.c

3. fetch_table_list

+ /* Get the list of tables from the publisher. */
+ if (server_version >= 160000)
+ {
+ StringInfoData pub_names;

- appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n"
-    " WHERE t.pubname IN (");
- get_publications_str(publications, &cmd, true);
- appendStringInfoChar(&cmd, ')');
+ initStringInfo(&pub_names);
+ get_publications_str(publications, &pub_names, true);
+
+ /*
+ * From version 16, we allowed passing multiple publications to the
+ * function pg_get_publication_tables. This helped to filter out the
+ * partition table whose ancestor is also published in this
+ * publication array.
+ *
+ * Join pg_get_publication_tables with pg_publication to exclude
+ * non-existing publications.
+ */
+ appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
+ "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
+ "                FROM pg_attribute a\n"
+ "                WHERE a.attrelid = GPT.relid AND\n"
+ "                      a.attnum = ANY(GPT.attrs)\n"
+ "              ) AS attnames\n"
+ " FROM pg_class C\n"
+ "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
+ "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
array_agg(pubname::text))).*\n"
+ "          FROM pg_publication\n"
+ "          WHERE pubname IN ( %s )) as GPT\n"
+ "       ON GPT.relid = C.oid\n",
+ pub_names.data);
+
+ pfree(pub_names.data);
+ }
+ else
+ {
+ appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename \n");
+
+ /* Get column lists for each relation if the publisher supports it */
+ if (check_columnlist)
+ appendStringInfoString(&cmd, ", t.attnames\n");
+
+ appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n"
+    " WHERE t.pubname IN (");
+ get_publications_str(publications, &cmd, true);
+ appendStringInfoChar(&cmd, ')');
+ }

I noticed the SQL "if" part is using uppercase aliases, but the SQL in
the "else" part is using lowercase aliases. I think it would be better
to be consistent (pick one).

======
src/test/subscription/t/013_partition.pl

4.
-# for tab4, we publish changes through the "middle" partitioned table
+# If we subscribe only to pub_lower_level, changes for tab4 will be published
+# through the "middle" partition table. However, since we will be subscribing
+# to both pub_lower_level and pub_all (see subscription sub2 below), we will
+# publish changes via the root table (tab4).
 $node_publisher->safe_psql('postgres',
  "CREATE PUBLICATION pub_lower_level FOR TABLE tab4_1 WITH
(publish_via_partition_root = true)"
 );

~

This comment seemed a bit overkill IMO. I don't think you need to say
much here except maybe:

# Note that subscription "sub2" will later subscribe simultaneously to
both pub_lower_level (i.e. just table tab4_1) and  pub_all.

~~~

5.
I think maybe you could have another test scenario where you INSERT
something into tab4_1_1 while only the publication for tab4_1 has
publish_via_partition_root=true

~~~

6.
AFAICT the tab4 tests are only testing the initial sync, but are not
testing normal replication. Maybe some more normal (post sync) INSERTS
are needed to tab4, tab4_1, tab4_1_1 ?


======
src/test/subscription/t/028_row_filter.pl

7.
+# insert data into partitioned table.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_rowfilter_viaroot_part(a) VALUES(13), (17)");
+
 $node_subscriber->safe_psql('postgres',
  "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
 );
@@ -707,13 +711,17 @@ is($result, qq(t|1), 'check replicated rows to
tab_rowfilter_toast');
 # the row filter for the top-level ancestor:
 #
 # tab_rowfilter_viaroot_part filter is: (a > 15)
+# - INSERT (13)        NO, 13 < 15
 # - INSERT (14)        NO, 14 < 15
 # - INSERT (15)        NO, 15 = 15
 # - INSERT (16)        YES, 16 > 15
+# - INSERT (17)        YES, 17 > 15
 $result =
   $node_subscriber->safe_psql('postgres',
- "SELECT a FROM tab_rowfilter_viaroot_part");
-is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
+ "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
+is( $result, qq(16
+17),
+ 'check replicated rows to tab_rowfilter_viaroot_part');

~

I'm not 100% sure this is testing quite what you want to test. AFAICT
the subscription is created like:

"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"

Notice in this case  BOTH the partitioned table and the partition had
been published using "WITH (publish_via_partition_root)". But, IIUC
won't it be better to test when only the partition's publication was
using that option?

For example, I think then it would be a better test of this "At least one" code:

/* At least one publication is using publish_via_partition_root. */
if (pub_elem->pubviaroot)
    viaroot = true;
======
src/test/subscription/t/031_column_list.pl

8.
- CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH
(publish_via_partition_root = true);
+ CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH
(publish_via_partition_root = true);
+ CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
(publish_via_partition_root = true);

  -- initial data
  INSERT INTO test_root VALUES (1, 2, 3);
  INSERT INTO test_root VALUES (10, 20, 30);
 ));

+# Subscribe to pub_root_true_1 and pub_root_true_2 at the same time, which
+# means that the initial data will be synced once, and only the column list of
+# the parent table (test_root) in the publication pub_root_true_1 will be used
+# for both table sync and data replication.
 $node_subscriber->safe_psql(
  'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
pub_root_true;
+ CREATE

~

(This is simlar to the previous review comment #7 above)

Won't it be a better test of the "At least one" code when only the
publication of partition (test_root_1) is using "WITH
(publish_via_partition_root = true)".

e.g
CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a);
CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
(publish_via_partition_root = true);

------
[1] https://www.postgresql.org/docs/devel/functions-info.html#FUNCTIONS-INFO-CATALOG-TABLE
[2]
https://www.postgresql.org/message-id/OS3PR01MB6275FB5397C6A647F262A3A69E009%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Thu, Mar 23, 2023 at 9:57 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v20-0001.
>
> ======
> General.
>
> 1.
> That function 'pg_get_publication_tables' does not seem to be
> described in the PG documentation. Why isn't it in the "System Catalog
> Information Functions" table [1] ?
>
> I asked this same question a long time ago but then the reply [2] was
> like "it doesn't seem to be a function provided to users".
>
> Well, perhaps that just means that the documentation has been
> accidentally missing for a long time. Does anybody know for sure if
> the omission of this function from the documentation is deliberate? If
> nobody here knows, then maybe this can be asked/addressed in a
> separate thread.
>

It is better that you start a separate thread to discuss this question
as it is not related to this patch.

--
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Thu, Mar 23, 2023 at 12:27 PM Peter Smith <smithpb2250@gmail.com> wrote:
> Here are some review comments for patch v20-0001.

Thanks for your comments.

> ======
> src/backend/commands/subscriptioncmds.c
> 
> 3. fetch_table_list
> 
> + /* Get the list of tables from the publisher. */
> + if (server_version >= 160000)
> + {
> + StringInfoData pub_names;
> 
> - appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n"
> -    " WHERE t.pubname IN (");
> - get_publications_str(publications, &cmd, true);
> - appendStringInfoChar(&cmd, ')');
> + initStringInfo(&pub_names);
> + get_publications_str(publications, &pub_names, true);
> +
> + /*
> + * From version 16, we allowed passing multiple publications to the
> + * function pg_get_publication_tables. This helped to filter out the
> + * partition table whose ancestor is also published in this
> + * publication array.
> + *
> + * Join pg_get_publication_tables with pg_publication to exclude
> + * non-existing publications.
> + */
> + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> + "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
> + "                FROM pg_attribute a\n"
> + "                WHERE a.attrelid = GPT.relid AND\n"
> + "                      a.attnum = ANY(GPT.attrs)\n"
> + "              ) AS attnames\n"
> + " FROM pg_class C\n"
> + "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
> + "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC
> array_agg(pubname::text))).*\n"
> + "          FROM pg_publication\n"
> + "          WHERE pubname IN ( %s )) as GPT\n"
> + "       ON GPT.relid = C.oid\n",
> + pub_names.data);
> +
> + pfree(pub_names.data);
> + }
> + else
> + {
> + appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename
> \n");
> +
> + /* Get column lists for each relation if the publisher supports it */
> + if (check_columnlist)
> + appendStringInfoString(&cmd, ", t.attnames\n");
> +
> + appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n"
> +    " WHERE t.pubname IN (");
> + get_publications_str(publications, &cmd, true);
> + appendStringInfoChar(&cmd, ')');
> + }
> 
> I noticed the SQL "if" part is using uppercase aliases, but the SQL in
> the "else" part is using lowercase aliases. I think it would be better
> to be consistent (pick one).

Unified them into lowercase aliases.

> ======
> src/test/subscription/t/013_partition.pl
> 
> 4.
> -# for tab4, we publish changes through the "middle" partitioned table
> +# If we subscribe only to pub_lower_level, changes for tab4 will be published
> +# through the "middle" partition table. However, since we will be subscribing
> +# to both pub_lower_level and pub_all (see subscription sub2 below), we will
> +# publish changes via the root table (tab4).
>  $node_publisher->safe_psql('postgres',
>   "CREATE PUBLICATION pub_lower_level FOR TABLE tab4_1 WITH
> (publish_via_partition_root = true)"
>  );
> 
> ~
> 
> This comment seemed a bit overkill IMO. I don't think you need to say
> much here except maybe:
> 
> # Note that subscription "sub2" will later subscribe simultaneously to
> both pub_lower_level (i.e. just table tab4_1) and  pub_all.

Changed.

> ~~~
> 
> 5.
> I think maybe you could have another test scenario where you INSERT
> something into tab4_1_1 while only the publication for tab4_1 has
> publish_via_partition_root=true

I'm not sure if this scenario is necessary.

> ~~~
> 
> 6.
> AFAICT the tab4 tests are only testing the initial sync, but are not
> testing normal replication. Maybe some more normal (post sync) INSERTS
> are needed to tab4, tab4_1, tab4_1_1 ?

Since I think the scenario we fixed is sync and not replication, it doesn't seem
like we should extend the test you mentioned.

> ======
> src/test/subscription/t/028_row_filter.pl
> 
> 7.
> +# insert data into partitioned table.
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO tab_rowfilter_viaroot_part(a) VALUES(13), (17)");
> +
>  $node_subscriber->safe_psql('postgres',
>   "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
>  );
> @@ -707,13 +711,17 @@ is($result, qq(t|1), 'check replicated rows to
> tab_rowfilter_toast');
>  # the row filter for the top-level ancestor:
>  #
>  # tab_rowfilter_viaroot_part filter is: (a > 15)
> +# - INSERT (13)        NO, 13 < 15
>  # - INSERT (14)        NO, 14 < 15
>  # - INSERT (15)        NO, 15 = 15
>  # - INSERT (16)        YES, 16 > 15
> +# - INSERT (17)        YES, 17 > 15
>  $result =
>    $node_subscriber->safe_psql('postgres',
> - "SELECT a FROM tab_rowfilter_viaroot_part");
> -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
> + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
> +is( $result, qq(16
> +17),
> + 'check replicated rows to tab_rowfilter_viaroot_part');
> 
> ~
> 
> I'm not 100% sure this is testing quite what you want to test. AFAICT
> the subscription is created like:
> 
> "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"

I think this is the scenario we fixed : Simultaneously subscribing to two
publications that publish the parent and child respectively, then want to use
the parent's identity and schema).

> Notice in this case  BOTH the partitioned table and the partition had
> been published using "WITH (publish_via_partition_root)". But, IIUC
> won't it be better to test when only the partition's publication was
> using that option?
> 
> For example, I think then it would be a better test of this "At least one" code:
> 
> /* At least one publication is using publish_via_partition_root. */
> if (pub_elem->pubviaroot)
>     viaroot = true;
>
> ======
> src/test/subscription/t/031_column_list.pl
> 
> 8.
> - CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH
> (publish_via_partition_root = true);
> + CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH
> (publish_via_partition_root = true);
> + CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
> (publish_via_partition_root = true);
> 
>   -- initial data
>   INSERT INTO test_root VALUES (1, 2, 3);
>   INSERT INTO test_root VALUES (10, 20, 30);
>  ));
> 
> +# Subscribe to pub_root_true_1 and pub_root_true_2 at the same time, which
> +# means that the initial data will be synced once, and only the column list of
> +# the parent table (test_root) in the publication pub_root_true_1 will be used
> +# for both table sync and data replication.
>  $node_subscriber->safe_psql(
>   'postgres', qq(
> - CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
> pub_root_true;
> + CREATE
> 
> ~
> 
> (This is simlar to the previous review comment #7 above)
> 
> Won't it be a better test of the "At least one" code when only the
> publication of partition (test_root_1) is using "WITH
> (publish_via_partition_root = true)".
> 
> e.g
> CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a);
> CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
> (publish_via_partition_root = true);

I think specifying one or both is the same scenario here.
But it seemed clearer if only the "via_root" option is specified in the
publication that publishes the parent, so I changed this point in
"031_column_list.pl". Since the publications in "028_row_filter.pl" were
introduced by other commits, I didn't change it.

Attach the new patch set.

Regards,
Wang Wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
On Thu, Mar 23, 2023 at 5:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Mar 23, 2023 at 9:57 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Here are some review comments for patch v20-0001.
> >
> > ======
> > General.
> >
> > 1.
> > That function 'pg_get_publication_tables' does not seem to be
> > described in the PG documentation. Why isn't it in the "System Catalog
> > Information Functions" table [1] ?
> >
> > I asked this same question a long time ago but then the reply [2] was
> > like "it doesn't seem to be a function provided to users".
> >
> > Well, perhaps that just means that the documentation has been
> > accidentally missing for a long time. Does anybody know for sure if
> > the omission of this function from the documentation is deliberate? If
> > nobody here knows, then maybe this can be asked/addressed in a
> > separate thread.
> >
>
> It is better that you start a separate thread to discuss this question
> as it is not related to this patch.
>

OK. I have asked this question in a new thread here [1].

------
[1] https://www.postgresql.org/message-id/CAHut%2BPvGQER0rbNWii1U4c-npDhP-HxfX5yj5fmfBo%3D45z9pPA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Hi Wang-san. I looked at the v21-0001 patch.

I don't have any new review comments -- only follow-ups for some of my
previous v20 comments that were rejected.

On Thu, Mar 23, 2023 at 8:11 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> On Thu, Mar 23, 2023 at 12:27 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > Here are some review comments for patch v20-0001.
>
...
> > ======
> > src/test/subscription/t/013_partition.pl
> >
> > 5.
> > I think maybe you could have another test scenario where you INSERT
> > something into tab4_1_1 while only the publication for tab4_1 has
> > publish_via_partition_root=true
>
> I'm not sure if this scenario is necessary.
>

Please see my reply for #7 below.

> > ~~~
> >
> > 6.
> > AFAICT the tab4 tests are only testing the initial sync, but are not
> > testing normal replication. Maybe some more normal (post sync) INSERTS
> > are needed to tab4, tab4_1, tab4_1_1 ?
>
> Since I think the scenario we fixed is sync and not replication, it doesn't seem
> like we should extend the test you mentioned.
>

Maybe you are right. I only thought it would be better to have testing
which verifies that the sync phase and the normal replication phase
are using the same rules.

> > ======
> > src/test/subscription/t/028_row_filter.pl
> >
> > 7.
> > +# insert data into partitioned table.
> > +$node_publisher->safe_psql('postgres',
> > + "INSERT INTO tab_rowfilter_viaroot_part(a) VALUES(13), (17)");
> > +
> >  $node_subscriber->safe_psql('postgres',
> >   "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
> >  );
> > @@ -707,13 +711,17 @@ is($result, qq(t|1), 'check replicated rows to
> > tab_rowfilter_toast');
> >  # the row filter for the top-level ancestor:
> >  #
> >  # tab_rowfilter_viaroot_part filter is: (a > 15)
> > +# - INSERT (13)        NO, 13 < 15
> >  # - INSERT (14)        NO, 14 < 15
> >  # - INSERT (15)        NO, 15 = 15
> >  # - INSERT (16)        YES, 16 > 15
> > +# - INSERT (17)        YES, 17 > 15
> >  $result =
> >    $node_subscriber->safe_psql('postgres',
> > - "SELECT a FROM tab_rowfilter_viaroot_part");
> > -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
> > + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
> > +is( $result, qq(16
> > +17),
> > + 'check replicated rows to tab_rowfilter_viaroot_part');
> >
> > ~
> >
> > I'm not 100% sure this is testing quite what you want to test. AFAICT
> > the subscription is created like:
> >
> > "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
>
> I think this is the scenario we fixed : Simultaneously subscribing to two
> publications that publish the parent and child respectively, then want to use
> the parent's identity and schema).
>

Yeah, but currently BOTH the tap_pub_viaroot_2, tap_pub_viaroot_1 are
using "WITH (publish_via_partition_root)", so IMO the user would
surely expect that only the root table would be published even when a
subscription combines those publications. OTOH, I thought a subtle
point of this patch is that now the same result will happen even if
only ONE of the publications was using "WITH
(publish_via_partition_root)". So it’s that scenario of “only ONE
publication is using the option” that I thought ought to be explicitly
tested.

This was the same also reason for my comment #5 above.

> > ======
> > src/test/subscription/t/031_column_list.pl
> >
> > 8.
> > - CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH
> > (publish_via_partition_root = true);
> > + CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH
> > (publish_via_partition_root = true);
> > + CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
> > (publish_via_partition_root = true);
> >
> >   -- initial data
> >   INSERT INTO test_root VALUES (1, 2, 3);
> >   INSERT INTO test_root VALUES (10, 20, 30);
> >  ));
> >
> > +# Subscribe to pub_root_true_1 and pub_root_true_2 at the same time, which
> > +# means that the initial data will be synced once, and only the column list of
> > +# the parent table (test_root) in the publication pub_root_true_1 will be used
> > +# for both table sync and data replication.
> >  $node_subscriber->safe_psql(
> >   'postgres', qq(
> > - CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
> > pub_root_true;
> > + CREATE
> >
> > ~
> >
> > (This is similar to the previous review comment #7 above)
> >
> > Won't it be a better test of the "At least one" code when only the
> > publication of partition (test_root_1) is using "WITH
> > (publish_via_partition_root = true)".
> >
> > e.g
> > CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a);
> > CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
> > (publish_via_partition_root = true);
>
> I think specifying one or both is the same scenario here.
> But it seemed clearer if only the "via_root" option is specified in the
> publication that publishes the parent, so I changed this point in
> "031_column_list.pl". Since the publications in "028_row_filter.pl" were
> introduced by other commits, I didn't change it.
>

In hindsight, I think those publications should be renamed to
something more appropriate. The name "pub_root_true_2" seems
misleading now since the publish_via_partition_root = false

e.g.1. pub_test_root, pub_test_root_1
or
e.g.2. pub_root_true, pub_root_1_false
etc.

------
Kind Regards,
Peter Smith.
Fujitsu Australia.



Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
BTW, since this patch changes the signature of the API
pg_get_publication_tables, I assume the example in the CREATE
SUBSCRIPTION Notes [1] may not work anymore.

Meanwhile, Tom Lane suggested [2] that the example could be re-written
to avoid even mentioning pg_get_publication_tables at all.

------
[1] https://www.postgresql.org/docs/devel/sql-createsubscription.html
[2] https://www.postgresql.org/message-id/2106581.1679610361%40sss.pgh.pa.us

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Fri, Mar 24, 2023 at 7:19 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Wang-san. I looked at the v21-0001 patch.
>
> I don't have any new review comments -- only follow-ups for some of my
> previous v20 comments that were rejected.
>
> On Thu, Mar 23, 2023 at 8:11 PM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > On Thu, Mar 23, 2023 at 12:27 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > > Here are some review comments for patch v20-0001.
> >
> ...
> > > ======
> > > src/test/subscription/t/013_partition.pl
> > >
> > > 5.
> > > I think maybe you could have another test scenario where you INSERT
> > > something into tab4_1_1 while only the publication for tab4_1 has
> > > publish_via_partition_root=true
> >
> > I'm not sure if this scenario is necessary.
> >
>
> Please see my reply for #7 below.
>
> > > ~~~
> > >
> > > 6.
> > > AFAICT the tab4 tests are only testing the initial sync, but are not
> > > testing normal replication. Maybe some more normal (post sync) INSERTS
> > > are needed to tab4, tab4_1, tab4_1_1 ?
> >
> > Since I think the scenario we fixed is sync and not replication, it doesn't seem
> > like we should extend the test you mentioned.
> >
>
> Maybe you are right. I only thought it would be better to have testing
> which verifies that the sync phase and the normal replication phase
> are using the same rules.
>

Yeah, we could extend such tests if we want but I think it is not a
must as the patch didn't change this behavior.

> > > ======
> > > src/test/subscription/t/028_row_filter.pl
> > >
> > > 7.
> > > +# insert data into partitioned table.
> > > +$node_publisher->safe_psql('postgres',
> > > + "INSERT INTO tab_rowfilter_viaroot_part(a) VALUES(13), (17)");
> > > +
> > >  $node_subscriber->safe_psql('postgres',
> > >   "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
> > >  );
> > > @@ -707,13 +711,17 @@ is($result, qq(t|1), 'check replicated rows to
> > > tab_rowfilter_toast');
> > >  # the row filter for the top-level ancestor:
> > >  #
> > >  # tab_rowfilter_viaroot_part filter is: (a > 15)
> > > +# - INSERT (13)        NO, 13 < 15
> > >  # - INSERT (14)        NO, 14 < 15
> > >  # - INSERT (15)        NO, 15 = 15
> > >  # - INSERT (16)        YES, 16 > 15
> > > +# - INSERT (17)        YES, 17 > 15
> > >  $result =
> > >    $node_subscriber->safe_psql('postgres',
> > > - "SELECT a FROM tab_rowfilter_viaroot_part");
> > > -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
> > > + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
> > > +is( $result, qq(16
> > > +17),
> > > + 'check replicated rows to tab_rowfilter_viaroot_part');
> > >
> > > ~
> > >
> > > I'm not 100% sure this is testing quite what you want to test. AFAICT
> > > the subscription is created like:
> > >
> > > "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
> >
> > I think this is the scenario we fixed : Simultaneously subscribing to two
> > publications that publish the parent and child respectively, then want to use
> > the parent's identity and schema).
> >
>
> Yeah, but currently BOTH the tap_pub_viaroot_2, tap_pub_viaroot_1 are
> using "WITH (publish_via_partition_root)", so IMO the user would
> surely expect that only the root table would be published even when a
> subscription combines those publications. OTOH, I thought a subtle
> point of this patch is that now the same result will happen even if
> only ONE of the publications was using "WITH
> (publish_via_partition_root)". So it’s that scenario of “only ONE
> publication is using the option” that I thought ought to be explicitly
> tested.
>

The current change to existing tests is difficult to understand. I
suggest writing a separate test for row filter and then cover the
scenario you have suggested.

--
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, Mar 24, 2023 at 10:14 AM Peter Smith <smithpb2250@gmail.com> wrote:
>

Thanks for the information.

> BTW, since this patch changes the signature of the API
> pg_get_publication_tables, I assume the example in the CREATE
> SUBSCRIPTION Notes [1] may not work anymore.

The use case you mentioned is still work.

> Meanwhile, Tom Lane suggested [2] that the example could be re-written
> to avoid even mentioning pg_get_publication_tables at all.

I'm not against this.

Regards,
Wang Wei

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, Mar 24, 2023 at 14:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > ======
> > > > src/test/subscription/t/028_row_filter.pl
> > > >
> > > > 7.
> > > > +# insert data into partitioned table.
> > > > +$node_publisher->safe_psql('postgres',
> > > > + "INSERT INTO tab_rowfilter_viaroot_part(a) VALUES(13), (17)");
> > > > +
> > > >  $node_subscriber->safe_psql('postgres',
> > > >   "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > > > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > > > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > > > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
> > > >  );
> > > > @@ -707,13 +711,17 @@ is($result, qq(t|1), 'check replicated rows to
> > > > tab_rowfilter_toast');
> > > >  # the row filter for the top-level ancestor:
> > > >  #
> > > >  # tab_rowfilter_viaroot_part filter is: (a > 15)
> > > > +# - INSERT (13)        NO, 13 < 15
> > > >  # - INSERT (14)        NO, 14 < 15
> > > >  # - INSERT (15)        NO, 15 = 15
> > > >  # - INSERT (16)        YES, 16 > 15
> > > > +# - INSERT (17)        YES, 17 > 15
> > > >  $result =
> > > >    $node_subscriber->safe_psql('postgres',
> > > > - "SELECT a FROM tab_rowfilter_viaroot_part");
> > > > -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
> > > > + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
> > > > +is( $result, qq(16
> > > > +17),
> > > > + 'check replicated rows to tab_rowfilter_viaroot_part');
> > > >
> > > > ~
> > > >
> > > > I'm not 100% sure this is testing quite what you want to test. AFAICT
> > > > the subscription is created like:
> > > >
> > > > "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > > > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > > > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > > > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
> > >
> > > I think this is the scenario we fixed : Simultaneously subscribing to two
> > > publications that publish the parent and child respectively, then want to use
> > > the parent's identity and schema).
> > >
> >
> > Yeah, but currently BOTH the tap_pub_viaroot_2, tap_pub_viaroot_1 are
> > using "WITH (publish_via_partition_root)", so IMO the user would
> > surely expect that only the root table would be published even when a
> > subscription combines those publications. OTOH, I thought a subtle
> > point of this patch is that now the same result will happen even if
> > only ONE of the publications was using "WITH
> > (publish_via_partition_root)". So it’s that scenario of “only ONE
> > publication is using the option” that I thought ought to be explicitly
> > tested.
> >
> 
> The current change to existing tests is difficult to understand. I
> suggest writing a separate test for row filter and then cover the
> scenario you have suggested.

Changed as suggested.

And I found there is a problem in the three back-branch patches (HEAD_v21_0002*,
REL15_* and REL14_*):
In the function fetch_table_list, we use pg_partition_ancestors to get the list
of tables from the publisher. But the pg_partition_ancestors was introduced in
v12, which means that if the publisher is v11 and the subscriber is v14+, this
will cause an error.
Since we are going to first submit the fix for the publisher >= 16 case on HEAD,
I think we could discuss this issue later if needed. Also, I will update these
three patches later if needed.

Attach the new patch.

Regards,
Wang Wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Fri, Mar 24, 2023 at 9:49 AM Peter Smith <smithpb2250@gmail.com> wrote:
> On Thu, Mar 23, 2023 at 8:11 PM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > On Thu, Mar 23, 2023 at 12:27 PM Peter Smith <smithpb2250@gmail.com>
> wrote:
> > > Here are some review comments for patch v20-0001.
> >
> ...
> > > ======
> > > src/test/subscription/t/031_column_list.pl
> > >
> > > 8.
> > > - CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH
> > > (publish_via_partition_root = true);
> > > + CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH
> > > (publish_via_partition_root = true);
> > > + CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
> > > (publish_via_partition_root = true);
> > >
> > >   -- initial data
> > >   INSERT INTO test_root VALUES (1, 2, 3);
> > >   INSERT INTO test_root VALUES (10, 20, 30);
> > >  ));
> > >
> > > +# Subscribe to pub_root_true_1 and pub_root_true_2 at the same time,
> which
> > > +# means that the initial data will be synced once, and only the column list of
> > > +# the parent table (test_root) in the publication pub_root_true_1 will be
> used
> > > +# for both table sync and data replication.
> > >  $node_subscriber->safe_psql(
> > >   'postgres', qq(
> > > - CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr'
> PUBLICATION
> > > pub_root_true;
> > > + CREATE
> > >
> > > ~
> > >
> > > (This is similar to the previous review comment #7 above)
> > >
> > > Won't it be a better test of the "At least one" code when only the
> > > publication of partition (test_root_1) is using "WITH
> > > (publish_via_partition_root = true)".
> > >
> > > e.g
> > > CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a);
> > > CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
> > > (publish_via_partition_root = true);
> >
> > I think specifying one or both is the same scenario here.
> > But it seemed clearer if only the "via_root" option is specified in the
> > publication that publishes the parent, so I changed this point in
> > "031_column_list.pl". Since the publications in "028_row_filter.pl" were
> > introduced by other commits, I didn't change it.
> >
> 
> In hindsight, I think those publications should be renamed to
> something more appropriate. The name "pub_root_true_2" seems
> misleading now since the publish_via_partition_root = false
> 
> e.g.1. pub_test_root, pub_test_root_1
> or
> e.g.2. pub_root_true, pub_root_1_false
> etc.

I prefer your first suggestion.
Changed.

Regards,
Wang Wei

Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Fri, Mar 24, 2023 at 2:36 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> On Fri, Mar 24, 2023 at 14:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> And I found there is a problem in the three back-branch patches (HEAD_v21_0002*,
> REL15_* and REL14_*):
> In the function fetch_table_list, we use pg_partition_ancestors to get the list
> of tables from the publisher. But the pg_partition_ancestors was introduced in
> v12, which means that if the publisher is v11 and the subscriber is v14+, this
> will cause an error.
>

Yeah, I am also not sure how to fix this for back-branches. I didn't
see any field report for this so I am hesitant to make any complicated
changes in back-branches that will deviate it from HEAD. Let's try to
fix it for HEAD at this stage. I have slightly modified the attached
patch, the changes are (a) I have removed the retail pfrees added in
pg_get_publication_tables() as that memory will anyway be freed when
we call SRF_RETURN_DONE(). It is also inconsistent to sometimes do
retail pfree and not other times in the same function. I have also
referred few similar functions and didn't find them doing retail
pfree. (b) Changed the comments in a few places.

The patch looks good to me. So, I am planning to push this sometime
early next week unless there are more suggestions or comments.

--
With Regards,
Amit Kapila.

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
Here are some review comments for v23-0001.

======
src/test/subscription/t/028_row_filter.pl

1.
+# two publications, one publishing through ancestor and another one directly
+# publsihing the partition, with different row filters
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_viaroot_sync_1 FOR TABLE
tab_rowfilter_viaroot_part_sync WHERE (a > 15) WITH
(publish_via_partition_root)"
+);
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_viaroot_sync_2 FOR TABLE
tab_rowfilter_viaroot_part_sync_1 WHERE (a < 15)"
+);
+

1a.
Typo "publsihing"

~

1b.
IMO these table and publication names could be better.

I thought it was confusing to have the word "sync" in these table
names and publication names. To the casual reader, it looks like these
are synchronous replication tests but they are not.

Similarly, I thought it was confusing that 2nd publication and table
have names with the word "viaroot" when the option
publish_via_partition_root is not even true.

~~~

2.

 # The following commands are executed after CREATE SUBSCRIPTION, so these SQL
 # commands are for testing normal logical replication behavior.
 #

~

I think you should add a couple of INSERTS for the newly added table/s
also. IMO it is not only better for test completeness, but it causes
readers to question why there are INSERTS for every other table except
these ones.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Mon, Mar 27, 2023 at 7:03 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
>
> 1.
> +# two publications, one publishing through ancestor and another one directly
> +# publsihing the partition, with different row filters
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_viaroot_sync_1 FOR TABLE
> tab_rowfilter_viaroot_part_sync WHERE (a > 15) WITH
> (publish_via_partition_root)"
> +);
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_viaroot_sync_2 FOR TABLE
> tab_rowfilter_viaroot_part_sync_1 WHERE (a < 15)"
> +);
> +
>
> 1a.
> Typo "publsihing"
>
> ~
>
> 1b.
> IMO these table and publication names could be better.
>
> I thought it was confusing to have the word "sync" in these table
> names and publication names. To the casual reader, it looks like these
> are synchronous replication tests but they are not.
>

Hmm, sync here is for initial sync, so I don't think it is too much of
a problem to understand if one is aware that these are logical
replication related tests.

> Similarly, I thought it was confusing that 2nd publication and table
> have names with the word "viaroot" when the option
> publish_via_partition_root is not even true.
>

I think the better names for tables could be
"tab_rowfilter_parent_sync, tab_rowfilter_child_sync" and for
publications "tap_pub_parent_sync_1,
tap_pub_child_sync_1"

> ~~~
>
> 2.
>
>  # The following commands are executed after CREATE SUBSCRIPTION, so these SQL
>  # commands are for testing normal logical replication behavior.
>  #
>
> ~
>
> I think you should add a couple of INSERTS for the newly added table/s
> also. IMO it is not only better for test completeness, but it causes
> readers to question why there are INSERTS for every other table except
> these ones.
>

The purpose of the test is to test the initial sync's interaction with
'publish_via_partition_root' option. So, adding Inserts after that for
replication doesn't serve any purpose and it also consumes test cycles
without any additional benefit. So, -1 for extending it further.

--
With Regards,
Amit Kapila.



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Mon, Mar 27, 2023 at 11:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Mar 27, 2023 at 7:03 AM Peter Smith <smithpb2250@gmail.com>
> wrote:
> >
> >
> > 1.
> > +# two publications, one publishing through ancestor and another one directly
> > +# publsihing the partition, with different row filters
> > +$node_publisher->safe_psql('postgres',
> > + "CREATE PUBLICATION tap_pub_viaroot_sync_1 FOR TABLE
> > tab_rowfilter_viaroot_part_sync WHERE (a > 15) WITH
> > (publish_via_partition_root)"
> > +);
> > +$node_publisher->safe_psql('postgres',
> > + "CREATE PUBLICATION tap_pub_viaroot_sync_2 FOR TABLE
> > tab_rowfilter_viaroot_part_sync_1 WHERE (a < 15)"
> > +);
> > +
> >
> > 1a.
> > Typo "publsihing"

Changed.

> > ~
> >
> > 1b.
> > IMO these table and publication names could be better.
> >
> > I thought it was confusing to have the word "sync" in these table
> > names and publication names. To the casual reader, it looks like these
> > are synchronous replication tests but they are not.
> >
> 
> Hmm, sync here is for initial sync, so I don't think it is too much of
> a problem to understand if one is aware that these are logical
> replication related tests.
> 
> > Similarly, I thought it was confusing that 2nd publication and table
> > have names with the word "viaroot" when the option
> > publish_via_partition_root is not even true.
> >
> 
> I think the better names for tables could be
> "tab_rowfilter_parent_sync, tab_rowfilter_child_sync" and for
> publications "tap_pub_parent_sync_1,
> tap_pub_child_sync_1"

Changed. And removed "_1" in the suggested publication names.
Previously, I added "_1" and "_2" to distinguish between two publications with
the same name. However, the publication names are now different, so I think we
could remove "_1".

Attach the new patch.

Regards,
Wang Wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
I looked at v24-0001.

======
src/test/subscription/t/028_row_filter.pl

+# Check expected replicated rows for tap_pub_parent_sync and
+# tap_pub_child_sync
+# Since the option publish_via_partition_root of tap_pub_parent_sync is true,
+# so the row filter of tap_pub_parent_sync will be used:
+# tap_pub_parent_sync filter is: (a > 15)
+# tap_pub_child_sync filter is: (a < 15)

Maybe wrapping can be improved in the above comment and a full stop
added to the first sentence.

Otherwise, I have no more comments for v24.

------
Kind Regards,
Peter Smith



Re: Data is copied twice when specifying both child and parent table in publication

От
Jacob Champion
Дата:
On Mon, Mar 20, 2023 at 11:22 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
> If the tests you have in mind are only related to this patch set then
> feel free to propose them here if you feel the current ones are not
> sufficient.

I think the new tests added by Wang cover my concerns (thanks!). I share
Peter's comment that we don't seem to have a regression test covering
only the bug description itself -- just ones that combine that case with
row and column restrictions -- but if you're all happy with the existing
approach then I have nothing much to add there.

I was staring at this subquery in fetch_table_list():

> +                        "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
> +                        "                FROM pg_attribute a\n"
> +                        "                WHERE a.attrelid = gpt.relid AND\n"
> +                        "                      a.attnum = ANY(gpt.attrs)\n"
> +                        "              ) AS attnames\n"

On my machine this takes up roughly 90% of the runtime of the query,
which makes for a noticeable delay with a bigger test case (a couple of
FOR ALL TABLES subscriptions on the regression database). And it seems
like we immediately throw all that work away: if I understand correctly,
we only use the third column for its interaction with DISTINCT. Would it
be enough to just replace that whole thing with gpt.attrs?

Thanks,
--Jacob



RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tues, Mar 28, 2023 at 7:02 AM Jacob Champion <jchampion@timescale.com> wrote:
> On Mon, Mar 20, 2023 at 11:22 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > If the tests you have in mind are only related to this patch set then
> > feel free to propose them here if you feel the current ones are not
> > sufficient.
> 
> I think the new tests added by Wang cover my concerns (thanks!). I share
> Peter's comment that we don't seem to have a regression test covering
> only the bug description itself -- just ones that combine that case with
> row and column restrictions -- but if you're all happy with the existing
> approach then I have nothing much to add there.

The scenario of this bug is to subscribe to two publications at the same time,
and these two publications publish parent table and child table respectively.
And option via_root is specified in both publications or only in the publication
of the parent table. At this time, the data on the publisher-side will be copied
twice (the data will be copied to the two tables on the subscribe-side
respectively).
So, I think we have covered this bug itself in 013_partition.pl. We inserted the
initial data into the parent table tab4 on the publisher-side, and checked
whether the sync is completed as we expected (there is data in table tab4, but
there is no data in table tab4_1).

> I was staring at this subquery in fetch_table_list():
> 
> > +                        "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
> > +                        "                FROM pg_attribute a\n"
> > +                        "                WHERE a.attrelid = gpt.relid AND\n"
> > +                        "                      a.attnum = ANY(gpt.attrs)\n"
> > +                        "              ) AS attnames\n"
> 
> On my machine this takes up roughly 90% of the runtime of the query,
> which makes for a noticeable delay with a bigger test case (a couple of
> FOR ALL TABLES subscriptions on the regression database). And it seems
> like we immediately throw all that work away: if I understand correctly,
> we only use the third column for its interaction with DISTINCT. Would it
> be enough to just replace that whole thing with gpt.attrs?

Make sense.
Changed as suggested.

Attach the new patch.

Regards,
Wang Wei

Вложения

RE: Data is copied twice when specifying both child and parent table in publication

От
"wangw.fnst@fujitsu.com"
Дата:
On Tues, Mar 28, 2023 at 18:00 PM Wang, Wei/王 威 <wangw.fnst@fujitsu.com> wrote:
> Attach the new patch.

Sorry, I attached the wrong patch.
Here is the correct new version patch which addressed all comments so far.

Regards,
Wang Wei

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Jacob Champion
Дата:
On Tue, Mar 28, 2023 at 2:59 AM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
> The scenario of this bug is to subscribe to two publications at the same time,
> and these two publications publish parent table and child table respectively.
> And option via_root is specified in both publications or only in the publication
> of the parent table.

Ah, reading the initial mail again, that makes sense. I came to this
thread with the alternative reproduction in mind (subscribing to one
publication with viaroot=true, and another publication with
viaroot=false) and misread the report accordingly... In the end, I'm
comfortable with the current level of coverage.

> > Would it
> > be enough to just replace that whole thing with gpt.attrs?
>
> Make sense.
> Changed as suggested.

LGTM, by inspection. Thanks!

--Jacob



Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
A minor review comment for v25-0001.

======
src/backend/commands/subscriptioncmds.c

1.
@@ -1936,21 +1936,56 @@ fetch_table_list(WalReceiverConn *wrconn, List
*publications)
  WalRcvExecResult *res;
  StringInfoData cmd;
  TupleTableSlot *slot;
- Oid tableRow[3] = {TEXTOID, TEXTOID, NAMEARRAYOID};
+ Oid tableRow[3] = {TEXTOID, TEXTOID, InvalidOid};

The patch could be slightly less invasive if you did not make this
change, but instead, only overwrite tableRow[2] for the >= PG16 case.

Or vice versa, if you prefer.

The point is, there are only 2 cases, so you might as well initialize
a default tableRow[2] that is valid for one case and overwrite it only
for the other case, instead of overwriting it in 2 places.

YMMV.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Wed, Mar 29, 2023 at 7:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> A minor review comment for v25-0001.
>
> ======
> src/backend/commands/subscriptioncmds.c
>
> 1.
> @@ -1936,21 +1936,56 @@ fetch_table_list(WalReceiverConn *wrconn, List
> *publications)
>   WalRcvExecResult *res;
>   StringInfoData cmd;
>   TupleTableSlot *slot;
> - Oid tableRow[3] = {TEXTOID, TEXTOID, NAMEARRAYOID};
> + Oid tableRow[3] = {TEXTOID, TEXTOID, InvalidOid};
>
> The patch could be slightly less invasive if you did not make this
> change, but instead, only overwrite tableRow[2] for the >= PG16 case.
>
> Or vice versa, if you prefer.
>

The current coding pattern looks neat to me.

--
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Amit Kapila
Дата:
On Tue, Mar 28, 2023 at 11:14 PM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Tue, Mar 28, 2023 at 2:59 AM wangw.fnst@fujitsu.com
>
> > > Would it
> > > be enough to just replace that whole thing with gpt.attrs?
> >
> > Make sense.
> > Changed as suggested.
>
> LGTM, by inspection. Thanks!
>

Pushed.

--
With Regards,
Amit Kapila.



Re: Data is copied twice when specifying both child and parent table in publication

От
Jacob Champion
Дата:
On Wed, Mar 29, 2023 at 2:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Pushed.

While rebasing my logical-roots patch over the top of this, I ran into
another situation where mixed viaroot settings can duplicate data. The
key idea is to subscribe to two publications with mixed settings, as
before, and add a partition root that's already been replicated with
viaroot=false to the other publication with viaroot=true.

  pub=# CREATE TABLE part (a int) PARTITION BY RANGE (a);
  pub=# CREATE PUBLICATION pub_all FOR ALL TABLES;
  pub=# CREATE PUBLICATION pub_other FOR TABLE other WITH
(publish_via_partition_root);
  -- populate with data, then switch to subscription side
  sub=# CREATE SUBSCRIPTION sub CONNECTION ... PUBLICATION pub_all, pub_other;
  -- switch back to publication
  pub=# ALTER PUBLICATION pub_other ADD TABLE part;
  -- and back to subscription
  sub=# ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
  -- data is now duplicated

(Standalone reproduction attached.)

This is similar to what happens if you alter the
publish_via_partition_root setting for an existing publication, but
I'd argue it's easier to hit by accident. Is this part of the same
class of bugs, or is it different (or even expected) behavior?

Thanks,
--Jacob

Вложения

Re: Data is copied twice when specifying both child and parent table in publication

От
Peter Smith
Дата:
On Fri, Mar 31, 2023 at 5:15 AM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Wed, Mar 29, 2023 at 2:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Pushed.
>
> While rebasing my logical-roots patch over the top of this, I ran into
> another situation where mixed viaroot settings can duplicate data. The
> key idea is to subscribe to two publications with mixed settings, as
> before, and add a partition root that's already been replicated with
> viaroot=false to the other publication with viaroot=true.
>
>   pub=# CREATE TABLE part (a int) PARTITION BY RANGE (a);
>   pub=# CREATE PUBLICATION pub_all FOR ALL TABLES;
>   pub=# CREATE PUBLICATION pub_other FOR TABLE other WITH
> (publish_via_partition_root);
>   -- populate with data, then switch to subscription side
>   sub=# CREATE SUBSCRIPTION sub CONNECTION ... PUBLICATION pub_all, pub_other;
>   -- switch back to publication
>   pub=# ALTER PUBLICATION pub_other ADD TABLE part;
>   -- and back to subscription
>   sub=# ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
>   -- data is now duplicated
>
> (Standalone reproduction attached.)
>
> This is similar to what happens if you alter the
> publish_via_partition_root setting for an existing publication, but
> I'd argue it's easier to hit by accident. Is this part of the same
> class of bugs, or is it different (or even expected) behavior?
>

Hi Jacob. I tried your example. And I can see after the REFRESH the
added table 'part' tablesync is launched and so does the copy causing
duplicate data.

sub=# ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
ALTER SUBSCRIPTION
sub=# 2023-03-31 13:09:30.348 AEDT [334] LOG:  logical replication
table synchronization worker for subscription "sub", table "part" has
started
...

Duplicate data happens because REFRESH PUBLICATION has the default
"refresh_option of copy_data=true.

Although the result is at first a bit unexpected, I am not sure if
anything can be done to make it do what you probably hoped it would
do:

For example, Just imagine if logic could be made smarter to recognize
that since there was already the 'part_def' being subscribed so it
should NOT use the default 'copy_data=true' when the REFRESH launches
the ancestor table 'part'...

Even if that logic was implemented, I have a feeling you could *still*
run into problems if the 'part' table was made of multiple partitions.
I think you might get to a situation where you DO want some partition
data copied (because you did not have it yet but now you are
subscribing to the root you want it) while at the same time, you DON'T
want to get duplicated data from other partitions (because you already
knew about those ones  -- like your example does).

So, I am not sure what the answer is, or maybe there isn't one.

At least, we need to check there are sufficient "BE CAREFUL" warnings
in the documentation for scenarios like this.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: Data is copied twice when specifying both child and parent table in publication

От
"shiy.fnst@fujitsu.com"
Дата:
On Fri, Mar 31, 2023 2:16 AM Jacob Champion <jchampion@timescale.com> wrote:
> 
> On Wed, Mar 29, 2023 at 2:00 AM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > Pushed.
> 
> While rebasing my logical-roots patch over the top of this, I ran into
> another situation where mixed viaroot settings can duplicate data. The
> key idea is to subscribe to two publications with mixed settings, as
> before, and add a partition root that's already been replicated with
> viaroot=false to the other publication with viaroot=true.
> 
>   pub=# CREATE TABLE part (a int) PARTITION BY RANGE (a);
>   pub=# CREATE PUBLICATION pub_all FOR ALL TABLES;
>   pub=# CREATE PUBLICATION pub_other FOR TABLE other WITH
> (publish_via_partition_root);
>   -- populate with data, then switch to subscription side
>   sub=# CREATE SUBSCRIPTION sub CONNECTION ... PUBLICATION pub_all,
> pub_other;
>   -- switch back to publication
>   pub=# ALTER PUBLICATION pub_other ADD TABLE part;
>   -- and back to subscription
>   sub=# ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
>   -- data is now duplicated
> 
> (Standalone reproduction attached.)
> 
> This is similar to what happens if you alter the
> publish_via_partition_root setting for an existing publication, but
> I'd argue it's easier to hit by accident. Is this part of the same
> class of bugs, or is it different (or even expected) behavior?
> 

I noticed that a similar problem has been discussed in this thread, see [1] [2]
[3] [4]. It seems complicated to fix it if we want to automatically skip tables
that have been synchronized previously by code, and this may overkill in some
cases (e.g. The target table in subscriber is not a partitioned table, and the
user want to synchronize all data in the partitioned table from the publisher).
Besides, it seems not a common case. So I'm not sure we should fix it. Maybe we
can just add some documentation for it as Peter mentioned.

[1] https://www.postgresql.org/message-id/CAJcOf-eQR_%3Dq0f4ZVHd342QdLvBd_995peSr4xCU05hrS3TeTg%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/OS0PR01MB5716C756312959F293A822C794869%40OS0PR01MB5716.jpnprd01.prod.outlook.com
(thesecond issue in it)
 
[3] https://www.postgresql.org/message-id/CA%2BHiwqHnDHcT4OOcga9rDFyc7TvDrpN5xFH9J2pyHQo9ptvjmQ%40mail.gmail.com
[4] https://www.postgresql.org/message-id/CAA4eK1%2BNWreG%3D2sKiMz8vFzTsFhEHCjgQMyAu6zj3sdLmcheYg%40mail.gmail.com

Regards,
Shi Yu

Re: Data is copied twice when specifying both child and parent table in publication

От
Jacob Champion
Дата:
On 3/30/23 20:01, Peter Smith wrote:
> For example, Just imagine if logic could be made smarter to recognize
> that since there was already the 'part_def' being subscribed so it
> should NOT use the default 'copy_data=true' when the REFRESH launches
> the ancestor table 'part'...
> 
> Even if that logic was implemented, I have a feeling you could *still*
> run into problems if the 'part' table was made of multiple partitions.
> I think you might get to a situation where you DO want some partition
> data copied (because you did not have it yet but now you are
> subscribing to the root you want it) while at the same time, you DON'T
> want to get duplicated data from other partitions (because you already
> knew about those ones  -- like your example does).

Hm, okay. My interest here is mainly because my logical-roots proposal
generalizes the problem (and therefore makes it worse).

For what it's worth, that patchset introduces the ability for the
subscriber to sync multiple tables into one. I wonder if that could be
used somehow to help fix this problem too?

> At least, we need to check there are sufficient "BE CAREFUL" warnings
> in the documentation for scenarios like this.

Agreed. These are sharp edges.

Thanks,
--Jacob



Re: Data is copied twice when specifying both child and parent table in publication

От
Jacob Champion
Дата:
On 3/31/23 03:04, shiy.fnst@fujitsu.com wrote:
> I noticed that a similar problem has been discussed in this thread, see [1] [2]
> [3] [4].

Ah, thank you. I didn't go far back enough in the thread...

> It seems complicated to fix it if we want to automatically skip tables
> that have been synchronized previously by code

I agree, this is looking very complex. I need to read through the
examples you sent more closely.

> and this may overkill in some
> cases (e.g. The target table in subscriber is not a partitioned table, and the
> user want to synchronize all data in the partitioned table from the publisher).

Hm. It seems like the setup process doesn't really capture the user's
intent. There are just so many things that they could be theoretically
trying to do.

> Besides, it seems not a common case. So I'm not sure we should fix it. Maybe we
> can just add some documentation for it as Peter mentioned.

I think we should absolutely document the pitfalls here. (I'm still
trying to figure out what they are, though, so I don't have any concrete
suggestions yet...)

Thanks!
--Jacob