Re: Skipping schema changes in publication

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: Skipping schema changes in publication
Дата
Msg-id CAHut+Pv2P6dJ7hZj_fmzN+=xzjvpOpgkAJvDZg3TD2xpvmY1NQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Skipping schema changes in publication  (Shlok Kyal <shlok.kyal.oss@gmail.com>)
Ответы Re: Skipping schema changes in publication
Список pgsql-hackers
Hi Shlok.

Some review comments for v15-0003.

======
doc/src/sgml/catalogs.sgml

1.
       <para>
-       True if the relation must be excluded
+       True if the column list or relation must be excluded from publication.
+       If a column list is specified in <literal>prattrs</literal>, then
+       exclude only those columns. If <literal>prattrs</literal> is NULL,
+       then exclude the entire relation.
       </para></entry>

I noticed other fields on this page say "null" instead of "NULL". It
seems like "null" is more conventional.

======
doc/src/sgml/logical-replication.sgml

2.
   <para>
    If no column list is specified, any columns added to the table later are
    automatically replicated. This means that having a column list which names
-   all columns is not the same as having no column list at all.
+   all columns is not the same as having no column list at all.
Similarly, if an
+   column list is specified with EXCEPT, any columns added to the table later
+   are also replicated automatically.
   </para>

2a.
CURRENTLY
If no column list or a column list with EXCEPT is specified, any
columns added to the table later are automatically replicated. This
means that having a column list which names all columns is not the
same as having no column list at all. If an column list is specified,
any columns added to the table later are automatically replicated.

~

That still doesn't quite make sense. I think instead of saying "This
means..." it needs to say something a bit like below:

However, a normal column list (without EXCEPT) only replicates the
specified columns and no more. Therefore, having a column list that
names all columns is not the same as having no column list at all, as
more columns may be added to the table later.

~

2b.
And the final sentence "If an column list..." looks like a cut/paste error (??)

~

2c.
Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

~~~

2.5A.
The description about generated columns still says this:

CURRENT:
Generated columns can also be specified in a column list. This allows
generated columns to be published, regardless of the publication
parameter publish_generated_columns. See Section 29.6 for details.

~

But I don't think it is quite correct. IMO gencols behaviour is much
more subtle...

e.g.

a) Normal collist - these named cols are published REGARDLESS of the
'publish_generated_cols' parameter (same as before)

b) EXCEPT collist - you can specify gencols in the list REGARDLESS of
the 'publish_generated_cols' parameter, because since they are named
as "except" then they will not be published anyhow....

c) BUT for EXCEPT collist case, I think any gencols that are *not*
covered by that EXCEPT collist should follow the rules according to
the 'publish_generated_cols' parameter.

So, it is much more tricky than the docs currently say:

Also

2.5B.
- The text says "See Section 29.6 for details," but there are no
examples of these combinations (e.g. EXCEPT collist and diff parameter
setting)

2.5C,
- The regression tests also need to be more complex to cover these

2.5D.
- You might need to add something in the CREATE PUBLICATION "NOTES"
section after all -- even if it just refers to here.

~~~

3.
    <para>
     Create a publication <literal>p1</literal>. A column list is defined for
-    table <literal>t1</literal> to reduce the number of columns that will be
-    replicated. Notice that the order of column names in the column list does
-    not matter.
+    table <literal>t1</literal>, and another column list is defined for table
+    <literal>t2</literal> using the EXCEPT clause to reduce the number of
+    columns that will be replicated. Note that the order of column names in
+    the column lists does not matter.
 <programlisting>
-/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d);
+/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d), t2 EXCEPT (d, a);
 </programlisting></para>

Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

======
doc/src/sgml/ref/create_publication.sgml

4.
      <para>
-      When a column list is specified, only the named columns are replicated.
-      The column list can contain stored generated columns as well. If the
-      column list is omitted, the publication will replicate all non-generated
-      columns (including any added in the future) by default. Stored generated
-      columns can also be replicated if
<literal>publish_generated_columns</literal>
-      is set to <literal>stored</literal>. Specifying a column list has no
-      effect on <literal>TRUNCATE</literal> commands. See
+      When a column list without EXCEPT is specified, only the named
columns are
+      replicated. The column list can contain stored generated columns as well.
+      If the column list is omitted, the publication will replicate
+      all non-generated columns (including any added in the future) by default.
+      Stored generated columns can also be replicated if
+      <literal>publish_generated_columns</literal> is set to
+      <literal>stored</literal>. Specifying a column list has no effect on
+      <literal>TRUNCATE</literal> commands. See
       <xref linkend="logical-replication-col-lists"/> for details about column
       lists.
      </para>

Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

~~~

5.
+     <para>
+      When a column list is specified with EXCEPT, the named columns are not
+      replicated. Specifying a column list has no effect on
+      <literal>TRUNCATE</literal> commands.
+     </para>

Maybe here EXCEPT should be written as <literal>EXCEPT</literal>.

** Note all the extra subtleties that I mentioned in the review
comment #2.5 above --- e.g. IMO any *un-listed* gencols still should
follow the parameter rules.

~~~

6.
   <para>
    Any column list must include the <literal>REPLICA IDENTITY</literal> columns
-   in order for <command>UPDATE</command> or <command>DELETE</command>
-   operations to be published. There are no column list restrictions if the
-   publication publishes only <command>INSERT</command> operations.
+   and any column list specified with EXCEPT must not include the
+   <literal>REPLICA IDENTITY</literal> columns in order for
+   <command>UPDATE</command> or <command>DELETE</command> operations to be
+   published. There are no column list restrictions if the
publication publishes
+   only <command>INSERT</command> operations.
   </para>

6a.
CURRENT:
Any column list must include the REPLICA IDENTITY columns, and any
column list specified with EXCEPT must not include the REPLICA
IDENTITY columns in order for UPDATE or DELETE operations to be
published.

~

I felt that might be better expressed the other way around. Also, it
might be better to say "not name" instead of "not include" because
EXCEPT + include seemed a bit contrary.


SUGGESTION (maybe like this)
In order for UPDATE or DELETE operations to work, all the REPLICA
IDENTITY columns must be published. So, any column list must name all
REPLICA IDENTITY columns, and any EXCEPT column list must not name any
REPLICA IDENTITY columns.

~~

6b.
Maybe here EXCEPT should be written as <literal>EXCEPT</literal>

======
src/backend/catalog/pg_publication.c

check_and_fetch_column_list:

7.
+ /* Lookup the except attribute */
+ cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple,
+   Anum_pg_publication_rel_prexcept, &isnull);
+
+ if (!isnull)
+ {
+ Assert(!pub->alltables);
+ *except_columns = DatumGetBool(cfdatum);
+ }
+

I felt it would be safer to also assign *except_columns = false;
up-front so the caller could be sure this flag was meaningful on
return.

~~~

pub_form_cols_map:

8.
Maybe use snake case like for other params, so /excepcols/except_cols/

~~~

pg_get_publication_tables:

9.

I felt all the logic in this function maybe can be simpler:

e.g. If you just have "Bitmapset *except_columns = NULL;" then null
nmeans there is no except columns; otherwise there is. This means you
don't need a separate 'bool except_column' variable.

e.g. Assign the Bitmapset *except_columns after you already have the
values[2], instead of doing it later.

e.g. The skip code if (except_columns && bms_is_member(att->attnum,
columns)) could just check the list member, I think, without the
additional bool.

~~~

10.
+ /*
+ * We fetch pubtuple if publication is not FOR ALL TABLES and not
+ * FOR TABLES IN SCHEMA. So if prexcept is true, it indicate that
+ * prattrs contains columns to be excluded for replication.
+ */
+ if (!isnull)
+ except_columns = DatumGetBool(exceptDatum);


/indicate/indicates/

======
src/backend/parser/gram.y

11.
+ | TABLE relation_expr EXCEPT opt_except_column_list OptWhereClause
+ {
+ $$ = makeNode(PublicationObjSpec);
+ $$->pubobjtype = PUBLICATIONOBJ_TABLE;
+ $$->pubtable = makeNode(PublicationTable);
+ $$->pubtable->relation = $2;
+ $$->pubtable->columns = $4;
+ $$->pubtable->whereClause = $5;
+ $$->pubtable->except = true;
+ $$->location = @1;
+ }

I wasn't expecting you would need another 'opt_except_column_list' and
all the code duplication that causes. AFAIK, the syntax is identical
for 'opt_column_list' apart from the preceding EXCEPT so I thought all
you need is to allow the 'opt_column_list' to have an optional EXCEPT
qualifier.

======
src/backend/replication/pgoutput/pgoutput.c

12.
+
+ /*
+ * Indicates whether no columns are published for a given relation. With
+ * the introduction of the EXCEPT clause in column lists, it is now
+ * possible to define a publication that excludes all columns of a table.
+ * However, the 'columns' attribute cannot represent this case, since a
+ * NULL value implies that all columns are published. To distinguish this
+ * scenario, the 'no_cols_published' flag is introduced.
+ */
+ bool no_cols_published;
 } RelationSyncEntry;

But, what about when Bitmapset *columns is not null, but has no bits
set -- doesn't that mean the same as "no columns"?

======
src/include/catalog/pg_publication.h

13.
 extern Bitmapset *pub_form_cols_map(Relation relation,
- PublishGencolsType include_gencols_type);
+ PublishGencolsType include_gencols_type,
+ Bitmapset *exceptcols);

Maybe snake-case like the other params: /exceptcols/except_cols/

======
src/test/regress/sql/publication.sql

14.
+-- Verify that publication is created with EXCEPT
+CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1,
pub_sch1.pub_test_except2 EXCEPT (b, c);
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';
+

I think tests should also use psql \dRp+ commands in places to show
that the "describe" stuff is working correctly.

~~~

15.
+-- Check for invalid cases
+CREATE PUBLICATION testpub_except2 FOR TABLES IN SCHEMA pub_sch1,
TABLE pub_test_except1 EXCEPT (b, c);
+CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;

Should explain more about what you are testing here:
a) cannot use EXCEPT col-lists combined with TABLES IN SCHEMA
b) syntax error EXCEPT without a col-list

~~~

16.
+-- Verify that publication can be altered with EXCEPT
+ALTER PUBLICATION testpub_except SET TABLE pub_test_except1 EXCEPT
(a, b), pub_sch1.pub_test_except2;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';

The comment is a bit misleading because there are many kinds of
"alter". Maybe say more like
Verify ok - ALTER PUBLICATION ... SET ... EXCEPT (col-list)

~~~

17.
+-- Verify ALTER PUBLICATION ... DROP
+ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1 EXCEPT (a, b);
+ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1;

Should explain more:
+-- Verify fails - ALTER PUBLICATION ... DROP ... EXCEPT (col-list)
+-- Verify ok - ALTER PUBLICATION ... DROP ...

~~~

18.
+ALTER PUBLICATION testpub_except ADD TABLE pub_test_except1 EXCEPT (c, d);
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except';

Missing comment:
+-- Verify ok - ALTER PUBLICATION ... ADD ... EXCEPT (col-list)

~~~

19.
+-- Verify excluded columns cannot be part of REPLICA IDENTITY
+ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;

+CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;

+DROP INDEX pub_test_except1_a_idx;
+CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+
+DROP INDEX pub_test_except1_a_idx;

19a.
IIUC, really there are multiple tests here, so I think it should all
be split and commented separately.

a) Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
b) Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
c) Verify that so long as no clash between RI cols and the EXCEPT
col-list, then it is ok

~

19b.
IMO, some index names could be better:

CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c);
How about 'pub_test_except1_ac_idx'?

~~~

20.
+DROP PUBLICATION testpub_except;
+DROP TABLE pub_test_except1;
+DROP TABLE pub_sch1.pub_test_except2;

Add a "cleanup" comment.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления: