Re: row filtering for logical replication

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: row filtering for logical replication
Дата
Msg-id CAHut+PtCr6kgW6Z2JpX6FJFbxA+o9Lm6B4Yqja_xMWE1ie5kdw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
On Mon, Nov 15, 2021 at 12:01 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Nov 12, 2021 at 9:19 PM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > Attaching version 39-
>
> Here are some review comments for v39-0006:
>
> 1)
> @@ -261,9 +261,9 @@ rowfilter_expr_replident_walker(Node *node,
> rf_context *context)
>   * Rule 1. Walk the parse-tree and reject anything other than very simple
>   * expressions (See rowfilter_validator for details on what is permitted).
>   *
> - * Rule 2. If the publish operation contains "delete" then only columns that
> - * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
> - * row-filter WHERE clause.
> + * Rule 2. If the publish operation contains "delete" or "delete" then only
> + * columns that are allowed by the REPLICA IDENTITY rules are permitted to
> + * be used in the row-filter WHERE clause.
>   */
>  static void
>  rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node
> *rfnode, Relation rel)
> @@ -276,12 +276,10 @@ rowfilter_expr_checker(Publication *pub,
> ParseState *pstate, Node *rfnode, Relat
>   rowfilter_validator(relname, rfnode);
>
>   /*
> - * Rule 2: For "delete", check that filter cols are also valid replica
> + * Rule 2: For "delete" and "update", check that filter cols are also
> valid replica
>   * identity cols.
> - *
> - * TODO - check later for publish "update" case.
>   */
> - if (pub->pubactions.pubdelete)
>
> 1a)
> Typo - the function comment: "delete" or "delete"; should say:
> "delete" or "update"
>
> 1b)
> I felt it would be better (for the comment in the function body) to
> write it as "or" instead of "and" because then it matches with the
> code "if ||" that follows this comment.
>
> ====
>
> 2)
> @@ -746,6 +780,92 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
>  }
>
>  /*
> + * Write a tuple to the outputstream using cached slot, in the most
> efficient format possible.
> + */
> +static void
> +logicalrep_write_tuple_cached(StringInfo out, Relation rel,
> TupleTableSlot *slot, bool binary)
>
> The function logicalrep_write_tuple_cached seems to have almost all of
> its function body in common with logicalrep_write_tuple. Is there any
> good way to combine these functions to avoid ~80 lines mostly
> duplicated code?
>
> ====
>
> 3)
> + if (!old_matched && !new_matched)
> + return false;
> +
> + if (old_matched && new_matched)
> + *action = REORDER_BUFFER_CHANGE_UPDATE;
> + else if (old_matched && !new_matched)
> + *action = REORDER_BUFFER_CHANGE_DELETE;
> + else if (new_matched && !old_matched)
> + *action = REORDER_BUFFER_CHANGE_INSERT;
> +
> + return true;
>
> I felt it is slightly confusing to have inconsistent ordering of the
> old_matched and new_matched in those above conditions.
>
> I suggest to use the order like:
> * old-row (no match) new-row (no match)
> * old-row (no match) new row (match)
> * old-row (match) new-row (no match)
> * old-row (match) new row (match)
>
> And then be sure to keep consistent ordering in all places it is mentioned:
> * in the code
> * in the function header comment
> * in the commit comment
> * in docs?
>
> ====
>
> 4)
> +/*
> + * Change is checked against the row filter, if any.
> + *
> + * If it returns true, the change is replicated, otherwise, it is not.
> + */
> +static bool
> +pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
> RelationSyncEntry *entry)
> +{
> + EState    *estate;
> + ExprContext *ecxt;
> + bool result = true;
> + Oid         relid = RelationGetRelid(relation);
> +
> + /* Bail out if there is no row filter */
> + if (!entry->exprstate)
> + return true;
> +
> + elog(DEBUG3, "table \"%s.%s\" has row filter",
> + get_namespace_name(get_rel_namespace(relid)),
> + get_rel_name(relid));
>
> It seems like that elog may consume unnecessary CPU most of the time.
> I think it might be better to remove the relid declaration and rewrite
> that elog as:
>
> if (message_level_is_interesting(DEBUG3))
>     elog(DEBUG3, "table \"%s.%s\" has row filter",
>             get_namespace_name(get_rel_namespace(entry->relid)),
>             get_rel_name(entry->relid));
>
> ====
>
> 5)
> diff --git a/src/include/replication/reorderbuffer.h
> b/src/include/replication/reorderbuffer.h
> index 5b40ff7..aec0059 100644
> --- a/src/include/replication/reorderbuffer.h
> +++ b/src/include/replication/reorderbuffer.h
> @@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
>   * respectively.  They're used by INSERT .. ON CONFLICT .. UPDATE.  Users of
>   * logical decoding don't have to care about these.
>   */
> -enum ReorderBufferChangeType
> +typedef enum ReorderBufferChangeType
>  {
>   REORDER_BUFFER_CHANGE_INSERT,
>   REORDER_BUFFER_CHANGE_UPDATE,
> @@ -65,7 +65,7 @@ enum ReorderBufferChangeType
>   REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
>   REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
>   REORDER_BUFFER_CHANGE_TRUNCATE
> -};
> +} ReorderBufferChangeType;
>
> This new typedef can be added to src/tools/pgindent/typedefs.list.
>

All above are fixed by Ajin Cherian in V40-0006 [1].

-----
[1] https://www.postgresql.org/message-id/CAHut%2BPv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Smith
Дата:
Сообщение: Re: row filtering for logical replication
Следующее
От: Peter Smith
Дата:
Сообщение: Re: row filtering for logical replication