Re: Skipping schema changes in publication
От | Peter Smith |
---|---|
Тема | Re: Skipping schema changes in publication |
Дата | |
Msg-id | CAHut+Pv2P6dJ7hZj_fmzN+=xzjvpOpgkAJvDZg3TD2xpvmY1NQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Skipping schema changes in publication (Shlok Kyal <shlok.kyal.oss@gmail.com>) |
Ответы |
Re: Skipping schema changes in publication
|
Список | pgsql-hackers |
Hi Shlok. Some review comments for v15-0003. ====== doc/src/sgml/catalogs.sgml 1. <para> - True if the relation must be excluded + True if the column list or relation must be excluded from publication. + If a column list is specified in <literal>prattrs</literal>, then + exclude only those columns. If <literal>prattrs</literal> is NULL, + then exclude the entire relation. </para></entry> I noticed other fields on this page say "null" instead of "NULL". It seems like "null" is more conventional. ====== doc/src/sgml/logical-replication.sgml 2. <para> If no column list is specified, any columns added to the table later are automatically replicated. This means that having a column list which names - all columns is not the same as having no column list at all. + all columns is not the same as having no column list at all. Similarly, if an + column list is specified with EXCEPT, any columns added to the table later + are also replicated automatically. </para> 2a. CURRENTLY If no column list or a column list with EXCEPT is specified, any columns added to the table later are automatically replicated. This means that having a column list which names all columns is not the same as having no column list at all. If an column list is specified, any columns added to the table later are automatically replicated. ~ That still doesn't quite make sense. I think instead of saying "This means..." it needs to say something a bit like below: However, a normal column list (without EXCEPT) only replicates the specified columns and no more. Therefore, having a column list that names all columns is not the same as having no column list at all, as more columns may be added to the table later. ~ 2b. And the final sentence "If an column list..." looks like a cut/paste error (??) ~ 2c. Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ~~~ 2.5A. The description about generated columns still says this: CURRENT: Generated columns can also be specified in a column list. This allows generated columns to be published, regardless of the publication parameter publish_generated_columns. See Section 29.6 for details. ~ But I don't think it is quite correct. IMO gencols behaviour is much more subtle... e.g. a) Normal collist - these named cols are published REGARDLESS of the 'publish_generated_cols' parameter (same as before) b) EXCEPT collist - you can specify gencols in the list REGARDLESS of the 'publish_generated_cols' parameter, because since they are named as "except" then they will not be published anyhow.... c) BUT for EXCEPT collist case, I think any gencols that are *not* covered by that EXCEPT collist should follow the rules according to the 'publish_generated_cols' parameter. So, it is much more tricky than the docs currently say: Also 2.5B. - The text says "See Section 29.6 for details," but there are no examples of these combinations (e.g. EXCEPT collist and diff parameter setting) 2.5C, - The regression tests also need to be more complex to cover these 2.5D. - You might need to add something in the CREATE PUBLICATION "NOTES" section after all -- even if it just refers to here. ~~~ 3. <para> Create a publication <literal>p1</literal>. A column list is defined for - table <literal>t1</literal> to reduce the number of columns that will be - replicated. Notice that the order of column names in the column list does - not matter. + table <literal>t1</literal>, and another column list is defined for table + <literal>t2</literal> using the EXCEPT clause to reduce the number of + columns that will be replicated. Note that the order of column names in + the column lists does not matter. <programlisting> -/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d); +/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d), t2 EXCEPT (d, a); </programlisting></para> Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ====== doc/src/sgml/ref/create_publication.sgml 4. <para> - When a column list is specified, only the named columns are replicated. - The column list can contain stored generated columns as well. If the - column list is omitted, the publication will replicate all non-generated - columns (including any added in the future) by default. Stored generated - columns can also be replicated if <literal>publish_generated_columns</literal> - is set to <literal>stored</literal>. Specifying a column list has no - effect on <literal>TRUNCATE</literal> commands. See + When a column list without EXCEPT is specified, only the named columns are + replicated. The column list can contain stored generated columns as well. + If the column list is omitted, the publication will replicate + all non-generated columns (including any added in the future) by default. + Stored generated columns can also be replicated if + <literal>publish_generated_columns</literal> is set to + <literal>stored</literal>. Specifying a column list has no effect on + <literal>TRUNCATE</literal> commands. See <xref linkend="logical-replication-col-lists"/> for details about column lists. </para> Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ~~~ 5. + <para> + When a column list is specified with EXCEPT, the named columns are not + replicated. Specifying a column list has no effect on + <literal>TRUNCATE</literal> commands. + </para> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>. ** Note all the extra subtleties that I mentioned in the review comment #2.5 above --- e.g. IMO any *un-listed* gencols still should follow the parameter rules. ~~~ 6. <para> Any column list must include the <literal>REPLICA IDENTITY</literal> columns - in order for <command>UPDATE</command> or <command>DELETE</command> - operations to be published. There are no column list restrictions if the - publication publishes only <command>INSERT</command> operations. + and any column list specified with EXCEPT must not include the + <literal>REPLICA IDENTITY</literal> columns in order for + <command>UPDATE</command> or <command>DELETE</command> operations to be + published. There are no column list restrictions if the publication publishes + only <command>INSERT</command> operations. </para> 6a. CURRENT: Any column list must include the REPLICA IDENTITY columns, and any column list specified with EXCEPT must not include the REPLICA IDENTITY columns in order for UPDATE or DELETE operations to be published. ~ I felt that might be better expressed the other way around. Also, it might be better to say "not name" instead of "not include" because EXCEPT + include seemed a bit contrary. SUGGESTION (maybe like this) In order for UPDATE or DELETE operations to work, all the REPLICA IDENTITY columns must be published. So, any column list must name all REPLICA IDENTITY columns, and any EXCEPT column list must not name any REPLICA IDENTITY columns. ~~ 6b. Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ====== src/backend/catalog/pg_publication.c check_and_fetch_column_list: 7. + /* Lookup the except attribute */ + cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple, + Anum_pg_publication_rel_prexcept, &isnull); + + if (!isnull) + { + Assert(!pub->alltables); + *except_columns = DatumGetBool(cfdatum); + } + I felt it would be safer to also assign *except_columns = false; up-front so the caller could be sure this flag was meaningful on return. ~~~ pub_form_cols_map: 8. Maybe use snake case like for other params, so /excepcols/except_cols/ ~~~ pg_get_publication_tables: 9. I felt all the logic in this function maybe can be simpler: e.g. If you just have "Bitmapset *except_columns = NULL;" then null nmeans there is no except columns; otherwise there is. This means you don't need a separate 'bool except_column' variable. e.g. Assign the Bitmapset *except_columns after you already have the values[2], instead of doing it later. e.g. The skip code if (except_columns && bms_is_member(att->attnum, columns)) could just check the list member, I think, without the additional bool. ~~~ 10. + /* + * We fetch pubtuple if publication is not FOR ALL TABLES and not + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicate that + * prattrs contains columns to be excluded for replication. + */ + if (!isnull) + except_columns = DatumGetBool(exceptDatum); /indicate/indicates/ ====== src/backend/parser/gram.y 11. + | TABLE relation_expr EXCEPT opt_except_column_list OptWhereClause + { + $$ = makeNode(PublicationObjSpec); + $$->pubobjtype = PUBLICATIONOBJ_TABLE; + $$->pubtable = makeNode(PublicationTable); + $$->pubtable->relation = $2; + $$->pubtable->columns = $4; + $$->pubtable->whereClause = $5; + $$->pubtable->except = true; + $$->location = @1; + } I wasn't expecting you would need another 'opt_except_column_list' and all the code duplication that causes. AFAIK, the syntax is identical for 'opt_column_list' apart from the preceding EXCEPT so I thought all you need is to allow the 'opt_column_list' to have an optional EXCEPT qualifier. ====== src/backend/replication/pgoutput/pgoutput.c 12. + + /* + * Indicates whether no columns are published for a given relation. With + * the introduction of the EXCEPT clause in column lists, it is now + * possible to define a publication that excludes all columns of a table. + * However, the 'columns' attribute cannot represent this case, since a + * NULL value implies that all columns are published. To distinguish this + * scenario, the 'no_cols_published' flag is introduced. + */ + bool no_cols_published; } RelationSyncEntry; But, what about when Bitmapset *columns is not null, but has no bits set -- doesn't that mean the same as "no columns"? ====== src/include/catalog/pg_publication.h 13. extern Bitmapset *pub_form_cols_map(Relation relation, - PublishGencolsType include_gencols_type); + PublishGencolsType include_gencols_type, + Bitmapset *exceptcols); Maybe snake-case like the other params: /exceptcols/except_cols/ ====== src/test/regress/sql/publication.sql 14. +-- Verify that publication is created with EXCEPT +CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1, pub_sch1.pub_test_except2 EXCEPT (b, c); +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; + I think tests should also use psql \dRp+ commands in places to show that the "describe" stuff is working correctly. ~~~ 15. +-- Check for invalid cases +CREATE PUBLICATION testpub_except2 FOR TABLES IN SCHEMA pub_sch1, TABLE pub_test_except1 EXCEPT (b, c); +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT; Should explain more about what you are testing here: a) cannot use EXCEPT col-lists combined with TABLES IN SCHEMA b) syntax error EXCEPT without a col-list ~~~ 16. +-- Verify that publication can be altered with EXCEPT +ALTER PUBLICATION testpub_except SET TABLE pub_test_except1 EXCEPT (a, b), pub_sch1.pub_test_except2; +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; The comment is a bit misleading because there are many kinds of "alter". Maybe say more like Verify ok - ALTER PUBLICATION ... SET ... EXCEPT (col-list) ~~~ 17. +-- Verify ALTER PUBLICATION ... DROP +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1 EXCEPT (a, b); +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1; Should explain more: +-- Verify fails - ALTER PUBLICATION ... DROP ... EXCEPT (col-list) +-- Verify ok - ALTER PUBLICATION ... DROP ... ~~~ 18. +ALTER PUBLICATION testpub_except ADD TABLE pub_test_except1 EXCEPT (c, d); +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; Missing comment: +-- Verify ok - ALTER PUBLICATION ... ADD ... EXCEPT (col-list) ~~~ 19. +-- Verify excluded columns cannot be part of REPLICA IDENTITY +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c); +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX pub_test_except1_a_idx; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; +DROP INDEX pub_test_except1_a_idx; +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a); +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX pub_test_except1_a_idx; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; + +DROP INDEX pub_test_except1_a_idx; 19a. IIUC, really there are multiple tests here, so I think it should all be split and commented separately. a) Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL) b) Verify that EXCEPT col-list cannot contain RI cols (when using INDEX) c) Verify that so long as no clash between RI cols and the EXCEPT col-list, then it is ok ~ 19b. IMO, some index names could be better: CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c); How about 'pub_test_except1_ac_idx'? ~~~ 20. +DROP PUBLICATION testpub_except; +DROP TABLE pub_test_except1; +DROP TABLE pub_sch1.pub_test_except2; Add a "cleanup" comment. ====== Kind Regards, Peter Smith. Fujitsu Australia
В списке pgsql-hackers по дате отправления: