Re: row filtering for logical replication

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: row filtering for logical replication
Дата
Msg-id CAHut+PucFM3Bt-gaTT7Pr-Y_x+R0y=L7uqbhjPMUsSPhdLhRpA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: row filtering for logical replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Ответы Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
Here are my review comments for v62-0001

~~~

1. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication

@@ -276,17 +276,46 @@ GetPubPartitionOptionRelations(List *result,
PublicationPartOpt pub_partopt,
 }

 /*
+ * Check if any of the ancestors are published in the publication. If so,
+ * return the relid of the topmost ancestor that is published via this
+ * publication, otherwise InvalidOid.
+ */

The GetTopMostAncestorInPublication function header comment seems to
be saying the same thing twice. I think it can be simplified

Suggested function comment:

"Return the relid of the topmost ancestor that is published via this
publication, otherwise return InvalidOid."

~~~

2. src/backend/commands/publicationcmds.c - AlterPublicationTables

- /* Calculate which relations to drop. */
+ /*
+ * In order to recreate the relation list for the publication, look
+ * for existing relations that need not be dropped.
+ */

Suggested minor rewording of comment:

"... look for existing relations that do not need to be dropped."

~~~

3. src/backend/commands/publicationcmds.c - AlterPublicationTables

+
+ /*
+ * Look if any of the new set of relations match with the
+ * existing relations in the publication. Additionally, if the
+ * relation has an associated where-clause, check the
+ * where-clauses also match. Drop the rest.
+ */
  if (RelationGetRelid(newpubrel->relation) == oldrelid)
Suggested minor rewording of comment:

"Look if any..." --> "Check if any..."

~~~

4. src/backend/executor/execReplication.c - CheckCmdReplicaIdentity

+ /*
+ * It is only safe to execute UPDATE/DELETE when all columns referenced in
+ * the row filters from publications which the relation is in are valid,
+ * which means all referenced columns are part of REPLICA IDENTITY, or the
+ * table does not publish UPDATES or DELETES.
+ */

Suggested minor rewording of comment:

"... in are valid, which means all ..." --> "... in are valid - i.e.
when all ..."

~~~

5. src/backend/parser/gram.y - ColId OptWhereClause

+ /*
+ * The OptWhereClause must be stored here but it is
+ * valid only for tables. If the ColId was mistakenly
+ * not a table this will be detected later in
+ * preprocess_pubobj_list() and an error is thrown.
+ */

Suggested minor rewording of comment:

"... and an error is thrown." --> "... and an error will be thrown."
~~~

6. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ * ALL TABLES IN SCHEMA implies "don't use row filter expression" if
+ * the schema is the same as the table schema.
+ */
+ foreach(lc, data->publications)
+ {
+ Publication *pub = lfirst(lc);
+ HeapTuple rftuple = NULL;
+ Datum rfdatum = 0;
+ bool rfisnull;
+ bool pub_no_filter = false;
+ List    *schemarelids = NIL;

Not all of these variables need to be declared at the top of the loop
like this. Consider moving some of them (e.g. rfisnull, schemarelids)
lower down to declare only in the scope that uses them.

~~~

7. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;

This code can be simplified I think. e.g.

no_filter[idx_ins] |= pub->pubactions.pubinsert;
no_filter[idx_upd] |= pub->pubactions.pubupdate;
no_filter[idx_del] |= pub->pubactions.pubdelete;

~~~

8. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry

@@ -1245,9 +1650,6 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
  entry->pubactions.pubtruncate |= pub->pubactions.pubtruncate;
  }

- if (entry->pubactions.pubinsert && entry->pubactions.pubupdate &&
- entry->pubactions.pubdelete && entry->pubactions.pubtruncate)
- break;
  }

I was not sure why that code was removed. Is it deliberate/correct?

~~~

9. src/backend/utils/cache/relcache.c - rowfilter_column_walker

@@ -5521,39 +5535,98 @@ RelationGetExclusionInfo(Relation indexRelation,
  MemoryContextSwitchTo(oldcxt);
 }

+
+
 /*
- * Get publication actions for the given relation.
+ * Check if any columns used in the row filter WHERE clause are not part of
+ * REPLICA IDENTITY and save the invalid column number in
+ * rf_context::invalid_rfcolnum.
  */

There is an extra blank line before the function comment.

~~~

10. src/backend/utils/cache/relcache.c - GetRelationPublicationInfo

+ if (HeapTupleIsValid(rftuple))
+ {
+ Datum rfdatum;
+ bool rfisnull;
+ Node    *rfnode;
+
+ context.pubviaroot = pubform->pubviaroot;
+ context.parentid = publish_as_relid;
+ context.relid = relid;
+
+ rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
+   Anum_pg_publication_rel_prqual,
+   &rfisnull);
+
+ if (!rfisnull)
+ {
+ rfnode = stringToNode(TextDatumGetCString(rfdatum));
+ rfcol_valid = !rowfilter_column_walker(rfnode, &context);
+ invalid_rfcolnum = context.invalid_rfcolnum;
+ pfree(rfnode);
+ }
+
+ ReleaseSysCache(rftuple);
+ }

Those 3 assignments to the context.pubviaroot/parentid/relid can be
moved to be inside the if (!rfisnull) block, because IIUC they don't
get used otherwise. Or, maybe better to just leave as-is; I am not
sure what is best. Please consider.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jaime Casanova
Дата:
Сообщение: Re: pg_upgrade parallelism
Следующее
От: Julien Rouhaud
Дата:
Сообщение: 2022-01 Commitfest