Discussion: Ambiguous description on new columns

Ambiguous description on new columns

From
PG Doc comments form
Date:
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
Description:

The documentation on this page mentions:

"If no column list is specified, any columns added later are automatically
replicated."

It feels ambiguous what this could mean. Does it mean:

1/ That if you alter the table on the publisher and add a new column, it
will be replicated

2/ If you add a column list later and add a column to it, it will be
replicated

In both cases, does the subscriber automatically create this column if it
wasn't there before? I recall reading that the initial data synchronization
requires the schema of the publisher database to be created on the
subscriber first. But then later updates sync newly created columns? I don't
recall any pages on logical replication mentioning this, up to this point.

Regards,
Koen De Groote

Re: Ambiguous description on new columns

From
Guillaume Lelarge
Date:
Hi,

On Tue, May 21, 2024 at 12:40, PG Doc comments form <noreply@postgresql.org> wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> Description:
>
> The documentation on this page mentions:
>
> "If no column list is specified, any columns added later are automatically
> replicated."
>
> It feels ambiguous what this could mean. Does it mean:
>
> 1/ That if you alter the table on the publisher and add a new column, it
> will be replicated
>
> 2/ If you add a column list later and add a column to it, it will be
> replicated
>
> In both cases, does the subscriber automatically create this column if it
> wasn't there before? I recall reading that the initial data synchronization
> requires the schema of the publisher database to be created on the
> subscriber first. But then later updates sync newly created columns? I don't
> recall any pages on logical replication mentioning this, up to this point.


It feels ambiguous. DDL commands are not replicated, so the new columns don't appear automagically on the subscriber. You have to add them to the subscriber. But values of new columns are replicated, whether or not you have added the new columns on the subscriber.

Regards.


--
Guillaume.

Re: Ambiguous description on new columns

From
Peter Smith
Date:
On Tue, May 21, 2024 at 8:40 PM PG Doc comments form
<noreply@postgresql.org> wrote:
>
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> Description:
>
> The documentation on this page mentions:
>
> "If no column list is specified, any columns added later are automatically
> replicated."
>
> It feels ambiguous what this could mean. Does it mean:
>
> 1/ That if you alter the table on the publisher and add a new column, it
> will be replicated
>
> 2/ If you add a column list later and add a column to it, it will be
> replicated
>
> In both cases, does the subscriber automatically create this column if it
> wasn't there before?

No, the subscriber will not automatically create the column. That is
already clearly said at the top of the same page you linked "The table
on the subscriber side must have at least all the columns that are
published."

All that "If no column list..." paragraph was trying to say is:

CREATE PUBLICATION pub FOR TABLE T;

is not quite the same as:

CREATE PUBLICATION pub FOR TABLE T(a,b,c);

The difference is, in the 1st case if you then ALTER the TABLE T to
have a new column 'd' then that will automatically start replicating
the 'd' data without having to do anything to either the PUBLICATION
or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does
not have a column 'd' then you'll get an error because your subscriber
table needs to have *at least* all the replicated columns. (I
demonstrate this error below)

Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have
a new column 'd' then that won't be replicated because 'd' was not
named in the PUBLICATION's column list.
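
A minimal sketch of the 2nd case, assuming PostgreSQL 15 or later (where publication column lists are available) and the same table and publication names as above. The final ALTER PUBLICATION statement is an illustration of how the column list could be extended explicitly; it is not part of the original explanation.

```sql
-- Publisher (test_pub): publish T with an explicit column list.
CREATE TABLE T(a int, b int, c int);
CREATE PUBLICATION pub FOR TABLE T(a, b, c);

-- Adding a column later does not change the publication's column list,
-- so values of 'd' are not sent to subscribers.
ALTER TABLE T ADD COLUMN d int;

-- To start replicating 'd', the column list has to be changed explicitly.
ALTER PUBLICATION pub SET TABLE T(a, b, c, d);
```

As in the 1st case, the subscriber table would still need a column 'd' before the new column list takes effect without errors.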

~~~~

Here's an example where you can see this in action

Here is an example of the 1st case -- it shows 'd' is automatically
replicated and also shows the subscriber-side error caused by the
missing column:

test_pub=# CREATE TABLE T(a int,b int, c int);
test_pub=# CREATE PUBLICATION pub FOR TABLE T;

test_sub=# CREATE TABLE T(a int,b int, c int);
test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION pub;

See the replication happening
test_pub=# INSERT INTO T VALUES (1,2,3);
test_sub=# SELECT * FROM t;
 a | b | c
---+---+---
 1 | 2 | 3
(1 row)

Now alter the publisher table T and insert some new data
test_pub=# ALTER TABLE T ADD COLUMN d int;
test_pub=# INSERT INTO T VALUES (5,6,7,8);

This will cause subscription errors like:
2024-05-22 11:53:19.098 AEST [16226] ERROR:  logical replication
target relation "public.t" is missing replicated column: "d"
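
A possible way to recover from this error, sketched under the assumption that the apply worker keeps retrying after the failure: add the missing column on the subscriber so that its table again has at least all the published columns.

```sql
-- Subscriber (test_sub): DDL is not replicated, so the column has to be
-- added by hand with a compatible type.
ALTER TABLE T ADD COLUMN d int;

-- After the apply worker has retried, the row inserted on the publisher
-- is expected to appear here.
SELECT * FROM t;
```

Adding the column on the subscriber before altering the publisher table avoids the error in the first place.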

~~~~

I think the following small change will remove any ambiguity:

BEFORE
If no column list is specified, any columns added later are
automatically replicated.

SUGGESTION
If no column list is specified, any columns added to the table later
are automatically replicated.

~~

I attached a small patch to make the above change.

Thoughts?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: Ambiguous description on new columns

From
"David G. Johnston"
Date:
On Tue, May 21, 2024 at 3:40 AM PG Doc comments form <noreply@postgresql.org> wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> Description:
>
> The documentation on this page mentions:
>
> "If no column list is specified, any columns added later are automatically
> replicated."
>
> It feels ambiguous what this could mean. Does it mean:
>
> 1/ That if you alter the table on the publisher and add a new column, it
> will be replicated

Yes, this is the only thing in scope you can "add columns to later".

> 2/ If you add a column list later and add a column to it, it will be
> replicated

I feel like we failed somewhere if the reader believes that it is possible to alter a publication in this way.

David J.

Re: Ambiguous description on new columns

From
"David G. Johnston"
Date:
On Tue, May 21, 2024 at 7:48 PM Peter Smith <smithpb2250@gmail.com> wrote:

> I think the following small change will remove any ambiguity:
>
> BEFORE
> If no column list is specified, any columns added later are
> automatically replicated.
>
> SUGGESTION
> If no column list is specified, any columns added to the table later
> are automatically replicated.
>
> ~~


Extended Before:

Each publication can optionally specify which columns of each table are replicated to subscribers. The table on the subscriber side must have at least all the columns that are published. If no column list is specified, then all columns on the publisher are replicated. See CREATE PUBLICATION for details on the syntax.

The choice of columns can be based on behavioral or performance reasons. However, do not rely on this feature for security: a malicious subscriber is able to obtain data from columns that are not specifically published. If security is a consideration, protections can be applied at the publisher side.

If no column list is specified, any columns added later are automatically replicated. This means that having a column list which names all columns is not the same as having no column list at all.

I'd suggest:

Each publication can optionally specify which columns of each table are replicated to subscribers. The table on the subscriber side must have at least all the columns that are published. If no column list is specified, then all columns on the publisher[, present and future,] are replicated. See CREATE PUBLICATION for details on the syntax.

...security...

...delete the entire "ambiguous" paragraph...

David J.

Re: Ambiguous description on new columns

From
Peter Smith
Date:
On Wed, May 22, 2024 at 1:22 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Tue, May 21, 2024 at 7:48 PM Peter Smith <smithpb2250@gmail.com> wrote:
>>
>>
>> I think the following small change will remove any ambiguity:
>>
>> BEFORE
>> If no column list is specified, any columns added later are
>> automatically replicated.
>>
>> SUGGESTION
>> If no column list is specified, any columns added to the table later
>> are automatically replicated.
>>
>> ~~
>>
>
> Extended Before:
>
> Each publication can optionally specify which columns of each table are replicated to subscribers. The table on the subscriber side must have at least all the columns that are published. If no column list is specified, then all columns on the publisher are replicated. See CREATE PUBLICATION for details on the syntax.
>
> The choice of columns can be based on behavioral or performance reasons. However, do not rely on this feature for security: a malicious subscriber is able to obtain data from columns that are not specifically published. If security is a consideration, protections can be applied at the publisher side.
>
> If no column list is specified, any columns added later are automatically replicated. This means that having a column list which names all columns is not the same as having no column list at all.
>
> I'd suggest:
>
> Each publication can optionally specify which columns of each table are replicated to subscribers. The table on the subscriber side must have at least all the columns that are published. If no column list is specified, then all columns on the publisher[, present and future,] are replicated. See CREATE PUBLICATION for details on the syntax.
>
> ...security...
>
> ...delete the entire "ambiguous" paragraph...
>

The "ambiguous" paragraph was trying to make the point that although
(a) having no column-list at all and
(b) having a column list that names every table column

starts off looking and working the same, don't be tricked into
thinking they are exactly equivalent, because if the table ever gets
ALTERED later then the behaviour of those PUBLICATIONs begins to
differ.
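
One way to observe the difference between (a) and (b) is sketched below. It assumes PostgreSQL 15 or later, where the pg_publication_tables view has an attnames column; pub_all and pub_list are hypothetical publication names used only for this illustration.

```sql
-- Publisher: two publications on the same table, one without and one
-- with a column list.
CREATE TABLE T(a int, b int, c int);
CREATE PUBLICATION pub_all FOR TABLE T;
CREATE PUBLICATION pub_list FOR TABLE T(a, b, c);

-- At this point both publications publish a, b and c.
ALTER TABLE T ADD COLUMN d int;

-- attnames lists the columns each publication currently publishes:
-- pub_all is expected to include d, pub_list is expected not to.
SELECT pubname, tablename, attnames
FROM pg_publication_tables
WHERE tablename = 't'
ORDER BY pubname;
```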

~

Your suggested text doesn't seem quite as explicit about that subtle
point, but I guess since you can still infer the same meaning it is
fine.

But, maybe say "all columns on the published table" instead of "all
columns on the publisher".

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Ambiguous description on new columns

From
"David G. Johnston"
Date:
On Tuesday, May 21, 2024, Peter Smith <smithpb2250@gmail.com> wrote:

>
> Each publication can optionally specify which columns of each table are replicated to subscribers. The table on the subscriber side must have at least all the columns that are published. If no column list is specified, then all columns on the publisher[, present and future,] are replicated. See CREATE PUBLICATION for details on the syntax.
>
> ...security...
>
> ...delete the entire "ambiguous" paragraph...
>

> Your suggested text doesn't seem quite as explicit about that subtle
> point, but I guess since you can still infer the same meaning it is
> fine.

Right, it doesn’t seem that subtle so long as we point out what an absent column list means. If you specify a column list you get exactly what you asked for. It’s like listing columns in SELECT. But if you don’t specify a column list you get whatever is there at runtime. Which I presume also means dropped columns no longer get replicated, but I haven’t tested and the docs don’t seem to cover column removal…

In contrast, if we don’t say this, one might reasonably assume that it behaves like:
CREATE VIEW vw AS SELECT * FROM tbl;
when it doesn’t.
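
The view behaviour used in this comparison can be sketched as follows; tbl and vw are placeholder names, and the point is only that a view expands "*" once, when it is created, whereas a publication without a column list follows the table as it changes.

```sql
CREATE TABLE tbl(a int, b int);

-- "*" is expanded to the columns that exist right now: a and b.
CREATE VIEW vw AS SELECT * FROM tbl;

-- A column added afterwards is not picked up by the view.
ALTER TABLE tbl ADD COLUMN c int;

-- Still returns only columns a and b.
SELECT * FROM vw;
```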

So yes, I do think saying “present and future” sufficiently covers the intent of the removed paragraph and clearly ties that to the table columns in response to this complaint.

> But, maybe say "all columns on the published table" instead of "all
> columns on the publisher".

Agreed.

David J.

Re: Ambiguous description on new columns

From
Laurenz Albe
Date:
On Wed, 2024-05-22 at 12:47 +1000, Peter Smith wrote:
> I think the following small change will remove any ambiguity:
>
> BEFORE
> If no column list is specified, any columns added later are
> automatically replicated.
>
> SUGGESTION
> If no column list is specified, any columns added to the table later
> are automatically replicated.
>
> ~~
>
> I attached a small patch to make the above change.

+1 on that change.

Yours,
Laurenz Albe



Re: Ambiguous description on new columns

From
vignesh C
Date:
On Wed, 22 May 2024 at 08:18, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Tue, May 21, 2024 at 8:40 PM PG Doc comments form
> <noreply@postgresql.org> wrote:
> >
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > Description:
> >
> > The documentation on this page mentions:
> >
> > "If no column list is specified, any columns added later are automatically
> > replicated."
> >
> > It feels ambiguous what this could mean. Does it mean:
> >
> > 1/ That if you alter the table on the publisher and add a new column, it
> > will be replicated
> >
> > 2/ If you add a column list later and add a column to it, it will be
> > replicated
> >
> > In both cases, does the subscriber automatically create this column if it
> > wasn't there before?
>
> No, the subscriber will not automatically create the column. That is
> already clearly said at the top of the same page you linked "The table
> on the subscriber side must have at least all the columns that are
> published."
>
> All that "If no column list..." paragraph was trying to say is:
>
> CREATE PUBLICATION pub FOR TABLE T;
>
> is not quite the same as:
>
> CREATE PUBLICATION pub FOR TABLE T(a,b,c);
>
> The difference is, in the 1st case if you then ALTER the TABLE T to
> have a new column 'd' then that will automatically start replicating
> the 'd' data without having to do anything to either the PUBLICATION
> or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does
> not have a column 'd' then you'll get an error because your subscriber
> table needs to have *at least* all the replicated columns. (I
> demonstrate this error below)
>
> Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have
> a new column 'd' then that won't be replicated because 'd' was not
> named in the PUBLICATION's column list.
>
> ~~~~
>
> Here's an example where you can see this in action
>
> Here is an example of the 1st case -- it shows 'd' is automatically
> replicated and also shows the subscriber-side error caused by the
> missing column:
>
> test_pub=# CREATE TABLE T(a int,b int, c int);
> test_pub=# CREATE PUBLICATION pub FOR TABLE T;
>
> test_sub=# CREATE TABLE T(a int,b int, c int);
> test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION pub;
>
> See the replication happening
> test_pub=# INSERT INTO T VALUES (1,2,3);
> test_sub=# SELECT * FROM t;
>  a | b | c
> ---+---+---
>  1 | 2 | 3
> (1 row)
>
> Now alter the publisher table T and insert some new data
> test_pub=# ALTER TABLE T ADD COLUMN d int;
> test_pub=# INSERT INTO T VALUES (5,6,7,8);
>
> This will cause subscription errors like:
> 2024-05-22 11:53:19.098 AEST [16226] ERROR:  logical replication
> target relation "public.t" is missing replicated column: "d"
>
> ~~~~
>
> I think the following small change will remove any ambiguity:
>
> BEFORE
> If no column list is specified, any columns added later are
> automatically replicated.
>
> SUGGESTION
> If no column list is specified, any columns added to the table later
> are automatically replicated.
>
> ~~
>
> I attached a small patch to make the above change.
>
> Thoughts?

A minor suggestion, the rest looks good:
It would enhance clarity to include a line break following "If no
column list is specified, any columns added to the table later are":
-   If no column list is specified, any columns added later are automatically
+   If no column list is specified, any columns added to the table
later are automatically
    replicated. This means that having a column list which names all columns

Regards,
Vignesh



Re: Ambiguous description on new columns

From
Peter Smith
Date:
On Wed, May 29, 2024 at 8:04 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 22 May 2024 at 14:26, Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Tue, May 21, 2024 at 8:40 PM PG Doc comments form
> > <noreply@postgresql.org> wrote:
> > >
> > > The following documentation comment has been logged on the website:
> > >
> > > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > > Description:
> > >
> > > The documentation on this page mentions:
> > >
> > > "If no column list is specified, any columns added later are automatically
> > > replicated."
> > >
> > > It feels ambiguous what this could mean. Does it mean:
> > >
> > > 1/ That if you alter the table on the publisher and add a new column, it
> > > will be replicated
> > >
> > > 2/ If you add a column list later and add a column to it, it will be
> > > replicated
> > >
> > > In both cases, does the subscriber automatically create this column if it
> > > wasn't there before?
> >
> > No, the subscriber will not automatically create the column. That is
> > already clearly said at the top of the same page you linked "The table
> > on the subscriber side must have at least all the columns that are
> > published."
> >
> > All that "If no column list..." paragraph was trying to say is:
> >
> > CREATE PUBLICATION pub FOR TABLE T;
> >
> > is not quite the same as:
> >
> > CREATE PUBLICATION pub FOR TABLE T(a,b,c);
> >
> > The difference is, in the 1st case if you then ALTER the TABLE T to
> > have a new column 'd' then that will automatically start replicating
> > the 'd' data without having to do anything to either the PUBLICATION
> > or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does
> > not have a column 'd' then you'll get an error because your subscriber
> > table needs to have *at least* all the replicated columns. (I
> > demonstrate this error below)
> >
> > Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have
> > a new column 'd' then that won't be replicated because 'd' was not
> > named in the PUBLICATION's column list.
> >
> > ~~~~
> >
> > Here's an example where you can see this in action
> >
> > Here is an example of the 1st case -- it shows 'd' is automatically
> > replicated and also shows the subscriber-side error caused by the
> > missing column:
> >
> > test_pub=# CREATE TABLE T(a int,b int, c int);
> > test_pub=# CREATE PUBLICATION pub FOR TABLE T;
> >
> > test_sub=# CREATE TABLE T(a int,b int, c int);
> > test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION pub;
> >
> > See the replication happening
> > test_pub=# INSERT INTO T VALUES (1,2,3);
> > test_sub=# SELECT * FROM t;
> >  a | b | c
> > ---+---+---
> >  1 | 2 | 3
> > (1 row)
> >
> > Now alter the publisher table T and insert some new data
> > test_pub=# ALTER TABLE T ADD COLUMN d int;
> > test_pub=# INSERT INTO T VALUES (5,6,7,8);
> >
> > This will cause subscription errors like:
> > 2024-05-22 11:53:19.098 AEST [16226] ERROR:  logical replication
> > target relation "public.t" is missing replicated column: "d"
> >
> > ~~~~
> >
> > I think the following small change will remove any ambiguity:
> >
> > BEFORE
> > If no column list is specified, any columns added later are
> > automatically replicated.
> >
> > SUGGESTION
> > If no column list is specified, any columns added to the table later
> > are automatically replicated.
> >
> > ~~
> >
> > I attached a small patch to make the above change.
>
> A small recommendation:
> It would enhance clarity to include a line break following "If no
> column list is specified, any columns added to the table later are":
> -   If no column list is specified, any columns added later are automatically
> +   If no column list is specified, any columns added to the table
> later are automatically
>     replicated. This means that having a column list which names all columns

Hi Vignesh,

IIUC you're saying my v1 patch *content* and rendering is OK, but you
only wanted the SGML text to have better wrapping for < 80 chars
lines. So I have attached a patch v2 with improved wrapping. If you
meant something different then please explain.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments

Re: Ambiguous description on new columns

From
vignesh C
Date:
On Thu, 30 May 2024 at 06:21, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, May 29, 2024 at 8:04 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 22 May 2024 at 14:26, Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > On Tue, May 21, 2024 at 8:40 PM PG Doc comments form
> > > <noreply@postgresql.org> wrote:
> > > >
> > > > The following documentation comment has been logged on the website:
> > > >
> > > > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > > > Description:
> > > >
> > > > The documentation on this page mentions:
> > > >
> > > > "If no column list is specified, any columns added later are automatically
> > > > replicated."
> > > >
> > > > It feels ambiguous what this could mean. Does it mean:
> > > >
> > > > 1/ That if you alter the table on the publisher and add a new column, it
> > > > will be replicated
> > > >
> > > > 2/ If you add a column list later and add a column to it, it will be
> > > > replicated
> > > >
> > > > In both cases, does the subscriber automatically create this column if it
> > > > wasn't there before?
> > >
> > > No, the subscriber will not automatically create the column. That is
> > > already clearly said at the top of the same page you linked "The table
> > > on the subscriber side must have at least all the columns that are
> > > published."
> > >
> > > All that "If no column list..." paragraph was trying to say is:
> > >
> > > CREATE PUBLICATION pub FOR TABLE T;
> > >
> > > is not quite the same as:
> > >
> > > CREATE PUBLICATION pub FOR TABLE T(a,b,c);
> > >
> > > The difference is, in the 1st case if you then ALTER the TABLE T to
> > > have a new column 'd' then that will automatically start replicating
> > > the 'd' data without having to do anything to either the PUBLICATION
> > > or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does
> > > not have a column 'd' then you'll get an error because your subscriber
> > > table needs to have *at least* all the replicated columns. (I
> > > demonstrate this error below)
> > >
> > > Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have
> > > a new column 'd' then that won't be replicated because 'd' was not
> > > named in the PUBLICATION's column list.
> > >
> > > ~~~~
> > >
> > > Here's an example where you can see this in action
> > >
> > > Here is an example of the 1st case -- it shows 'd' is automatically
> > > replicated and also shows the subscriber-side error caused by the
> > > missing column:
> > >
> > > test_pub=# CREATE TABLE T(a int,b int, c int);
> > > test_pub=# CREATE PUBLICATION pub FOR TABLE T;
> > >
> > > test_sub=# CREATE TABLE T(a int,b int, c int);
> > > test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION pub;
> > >
> > > See the replication happening
> > > test_pub=# INSERT INTO T VALUES (1,2,3);
> > > test_sub=# SELECT * FROM t;
> > >  a | b | c
> > > ---+---+---
> > >  1 | 2 | 3
> > > (1 row)
> > >
> > > Now alter the publisher table T and insert some new data
> > > test_pub=# ALTER TABLE T ADD COLUMN d int;
> > > test_pub=# INSERT INTO T VALUES (5,6,7,8);
> > >
> > > This will cause subscription errors like:
> > > 2024-05-22 11:53:19.098 AEST [16226] ERROR:  logical replication
> > > target relation "public.t" is missing replicated column: "d"
> > >
> > > ~~~~
> > >
> > > I think the following small change will remove any ambiguity:
> > >
> > > BEFORE
> > > If no column list is specified, any columns added later are
> > > automatically replicated.
> > >
> > > SUGGESTION
> > > If no column list is specified, any columns added to the table later
> > > are automatically replicated.
> > >
> > > ~~
> > >
> > > I attached a small patch to make the above change.
> >
> > A small recommendation:
> > It would enhance clarity to include a line break following "If no
> > column list is specified, any columns added to the table later are":
> > -   If no column list is specified, any columns added later are automatically
> > +   If no column list is specified, any columns added to the table
> > later are automatically
> >     replicated. This means that having a column list which names all columns
>
> Hi Vignesh,
>
> IIUC you're saying my v1 patch *content* and rendering is OK, but you
> only wanted the SGML text to have better wrapping for < 80 chars
> lines. So I have attached a patch v2 with improved wrapping. If you
> meant something different then please explain.

Yes, that is what I meant and the updated patch looks good.

Regards,
Vignesh



Re: Ambiguous description on new columns

From
vignesh C
Date:
On Fri, 31 May 2024 at 08:58, vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 30 May 2024 at 06:21, Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Vignesh,
> >
> > IIUC you're saying my v1 patch *content* and rendering is OK, but you
> > only wanted the SGML text to have better wrapping for < 80 chars
> > lines. So I have attached a patch v2 with improved wrapping. If you
> > meant something different then please explain.
>
> Yes, that is what I meant and the updated patch looks good.

Adding Amit to get his opinion on the same.

Regards,
Vignesh



Re: Ambiguous description on new columns

From
Amit Kapila
Date:
On Fri, May 31, 2024 at 10:54 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, May 29, 2024 at 8:04 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > >
> > > > The following documentation comment has been logged on the website:
> > > >
> > > > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > > > Description:
> > > >
> > > > The documentation on this page mentions:
> > > >
> > > > "If no column list is specified, any columns added later are automatically
> > > > replicated."
> > > >
> > > > It feels ambiguous what this could mean. Does it mean:
> > > >
> > > > 1/ That if you alter the table on the publisher and add a new column, it
> > > > will be replicated
> > > >
> > > > 2/ If you add a column list later and add a column to it, it will be
> > > > replicated
> > > >
> > > > In both cases, does the subscriber automatically create this column if it
> > > > wasn't there before?
> > >
> > > ~~~~
> > >
> > > I think the following small change will remove any ambiguity:
> > >
> > > BEFORE
> > > If no column list is specified, any columns added later are
> > > automatically replicated.
> > >
> > > SUGGESTION
> > > If no column list is specified, any columns added to the table later
> > > are automatically replicated.
> > >
> > > ~~
> > >
> > > I attached a small patch to make the above change.
> >
> > A small recommendation:
> > It would enhance clarity to include a line break following "If no
> > column list is specified, any columns added to the table later are":
> > -   If no column list is specified, any columns added later are automatically
> > +   If no column list is specified, any columns added to the table
> > later are automatically
> >     replicated. This means that having a column list which names all columns
>
> Hi Vignesh,
>
> IIUC you're saying my v1 patch *content* and rendering is OK, but you
> only wanted the SGML text to have better wrapping for < 80 chars
> lines. So I have attached a patch v2 with improved wrapping. If you
> meant something different then please explain.
>

Your patch is an improvement. Koen, does the proposed change make
things clear to you?

--
With Regards,
Amit Kapila.



Re: Ambiguous description on new columns

From
Amit Kapila
Date:
On Tue, Jun 4, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> >
> > IIUC you're saying my v1 patch *content* and rendering is OK, but you
> > only wanted the SGML text to have better wrapping for < 80 chars
> > lines. So I have attached a patch v2 with improved wrapping. If you
> > meant something different then please explain.
> >
>
> Your patch is an improvement. Koen, does the proposed change make
> things clear to you?
>

I am planning to push and backpatch the latest patch by Peter Smith
unless there are any further comments or suggestions.

--
With Regards,
Amit Kapila.



Re: Ambiguous description on new columns

From
Koen De Groote
Date:
Yes, this change makes it clear to me that "columns added" applies to the table on the publisher.

Regards,
Koen De Groote


Re: Ambiguous description on new columns

From
Amit Kapila
Date:
On Fri, Jun 7, 2024 at 3:23 PM Koen De Groote <kdg.dev@gmail.com> wrote:
>
> Yes, this change makes it clear to me that "columns added" applies to the table on the publisher.
>

Thanks for the confirmation. I have pushed and backpatched the fix.

--
With Regards,
Amit Kapila.