Re: row filtering for logical replication

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: row filtering for logical replication
Дата
Msg-id CAD21AoDwgj3t_6muphpT=Yspd785aT99ivzRXar-kNYwmthJFg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: row filtering for logical replication  (Önder Kalacı <onderkalaci@gmail.com>)
Ответы Re: row filtering for logical replication  (Önder Kalacı <onderkalaci@gmail.com>)
Список pgsql-hackers
Hi Önder,

On Thu, Dec 17, 2020 at 3:43 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
>
> Hi all,
>
> I'm also interested in this patch. I rebased the changes to the current master branch and attached. The rebase had
twoissues. First, patch-8 was conflicting, and that seems only helpful for debugging purposes during development. So, I
droppedit for simplicity. Second, the changes have a conflict with `publish_via_partition_root` changes. I tried to fix
theissues, but ended-up having a limitation for now. The limitation is that "cannot create publication with WHERE
clauseon the partitioned table without publish_via_partition_root is set to true". This restriction can be lifted,
thoughI left out for the sake of focusing on the some issues that I observed on this patch. 
>
> Please see my review:
>
> +       if (list_length(relentry->qual) > 0)
> +       {
> +               HeapTuple       old_tuple;
> +               HeapTuple       new_tuple;
> +               TupleDesc       tupdesc;
> +               EState     *estate;
> +               ExprContext *ecxt;
> +               MemoryContext oldcxt;
> +               ListCell   *lc;
> +               bool            matched = true;
> +
> +               old_tuple = change->data.tp.oldtuple ? &change->data.tp.oldtuple->tuple : NULL;
> +               new_tuple = change->data.tp.newtuple ? &change->data.tp.newtuple->tuple : NULL;
> +               tupdesc = RelationGetDescr(relation);
> +               estate = create_estate_for_relation(relation);
> +
> +               /* prepare context per tuple */
> +               ecxt = GetPerTupleExprContext(estate);
> +               oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
> +               ecxt->ecxt_scantuple = ExecInitExtraTupleSlot(estate, tupdesc, &TTSOpsHeapTuple);
> +
> +               ExecStoreHeapTuple(new_tuple ? new_tuple : old_tuple, ecxt->ecxt_scantuple, false);
> +
> +               foreach(lc, relentry->qual)
> +               {
> +                       Node       *qual;
> +                       ExprState  *expr_state;
> +                       Expr       *expr;
> +                       Oid                     expr_type;
> +                       Datum           res;
> +                       bool            isnull;
> +
> +                       qual = (Node *) lfirst(lc);
> +
> +                       /* evaluates row filter */
> +                       expr_type = exprType(qual);
> +                       expr = (Expr *) coerce_to_target_type(NULL, qual, expr_type, BOOLOID, -1,
COERCION_ASSIGNMENT,COERCE_IMPLICIT_CAST, -1); 
> +                       expr = expression_planner(expr);
> +                       expr_state = ExecInitExpr(expr, NULL);
> +                       res = ExecEvalExpr(expr_state, ecxt, &isnull);
> +
> +                       /* if tuple does not match row filter, bail out */
> +                       if (!DatumGetBool(res) || isnull)
> +                       {
> +                               matched = false;
> +                               break;
> +                       }
> +               }
> +
> +               MemoryContextSwitchTo(oldcxt);
> +
>
>
> The above part can be considered the core of the logic, executed per tuple. As far as I can see, it has two
downsides.
>
> First, calling `expression_planner()` for every tuple can be quite expensive. I created a sample table, loaded data
andran a quick benchmark to see its effect. I attached the very simple script that I used to reproduce the issue on my
laptop.I'm pretty sure you can find nicer ways of doing similar perf tests, just sharing as a reference. 
>
> The idea of the test is to add a WHERE clause to a table, but none of the tuples are filtered out. They just go
throughthis code-path and send it to the remote node. 
>
> #rows       Patched    | Master
> 1M  00:00:25.067536    | 00:00:16.633988
> 10M  00:04:50.770791    | 00:02:40.945358
>
>
> So, it seems a significant overhead to me. What do you think?
>
> Secondly, probably more importantly, allowing any operator is as dangerous as allowing any function as users can
create/overloadoperator(s). For example, assume that users create an operator which modifies the table that is being
filteredout: 
>
> ```
> CREATE OR REPLACE FUNCTION function_that_modifies_table(left_art INTEGER, right_arg INTEGER)
> RETURNS BOOL AS
> $$
> BEGIN
>
>   INSERT INTO test SELECT * FROM test;
>
>   return left_art > right_arg;
>  END;
> $$ LANGUAGE PLPGSQL VOLATILE;
>
> CREATE OPERATOR >>= (
>   PROCEDURE = function_that_modifies_table,
>   LEFTARG = INTEGER,
>   RIGHTARG = INTEGER
> );
>
> CREATE PUBLICATION pub FOR TABLE test WHERE (key >>= 0);
> ``
>
> With the above, we seem to be in trouble. Although the above is an extreme example, it felt useful to share to the
extentof the problem. We probably cannot allow any free-form SQL to be on the filters. 
>
> To overcome these issues, one approach could be to rely on known safe operators and functions. I believe the btree
andhash operators should provide a pretty strong coverage across many use cases. As far as I can see, the procs that
thefollowing query returns can be our baseline: 
>
> ```
> select   DISTINCT amproc.amproc::regproc AS opfamily_procedure
> from     pg_am am,
>          pg_opfamily opf,
>          pg_amproc amproc
> where    opf.opfmethod = am.oid
> and      amproc.amprocfamily = opf.oid
> order by
>          opfamily_procedure;
> ```
>
> With that, we aim to prevent users easily shooting themselves by the foot.
>
> The other problematic area was the performance, as calling `expression_planner()` for every tuple can be very
expensive.To avoid that, it might be considered to ask users to provide a function instead of a free form WHERE clause,
suchthat if the function returns true, the tuple is sent. The allowed functions need to be immutable SQL functions with
boolreturn type. As we can parse the  SQL functions, we should be able to allow only functions that rely on the above
mentionedprocs. We can apply as many restrictions (such as no modification query) as possible. For example, see below: 
> ```
>
> CREATE OR REPLACE function filter_tuples_for_test(int) returns bool as
> $body$
>     select $1 > 100;
> $body$
> language sql immutable;
>
> CREATE PUBLICATION pub FOR TABLE test FILTER = filter_tuples_for_tes(key);
> ```
>
> In  terms of performance, calling the function should avoid calling the `expression_planner()` and yield better
performance.Though, this needs to be verified. 
>
> If such an approach makes sense, I'd be happy to work on the patch. Please provide me feedback.
>

You sent in your patch to pgsql-hackers on Dec 17, but you did not
post it to the next CommitFest[1] (I found the old entry of this
patch[2] but it's marked as "Returned with feedback"). If this was
intentional, then you need to take no action.  However, if you want
your patch to be reviewed as part of the upcoming CommitFest, then you
need to add it yourself before 2021-01-01 AoE[3]. Thanks for your
contributions.

Regards,

[1] https://commitfest.postgresql.org/31/
[2] https://commitfest.postgresql.org/20/1862/
[2] https://en.wikipedia.org/wiki/Anywhere_on_Earth

--
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: Double partition lock in bufmgr
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: OpenSSL connection setup debug callback issue