Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Дата
Msg-id CAFjFpRfgbfK9Mfv_4x2UCH6twjdROaV78qQZCWzzyYCG2PqL_g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Ответы Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
Список pgsql-hackers


On Mon, Feb 8, 2016 at 4:15 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/02/05 17:50, Ashutosh Bapat wrote:

    Btw, IIUC, I think the patch fails to adjust the targetlist of the
    top plan created that way, to output the fdw_scan_tlist, as
    discussed in [1] (ie, I think the attached patch is needed, which is
    created on top of your patch pg_fdw_join_v8.patch).

fdw_scan_tlist represents the output fetched from the foreign server and
is not necessarily the output of ForeignScan. ForeignScan node's output
is represented by tlist argument to.

1119     return make_foreignscan(tlist,
1120                             local_exprs,
1121                             scan_relid,
1122                             params_list,
1123                             fdw_private,
1124                             fdw_scan_tlist,
1125                             remote_exprs,
1126                             outer_plan);

This tlist is built using build_path_tlist() for all join plans. IIUC,
all of them output the same targetlist. We don't need to make sure that
targetlist match as long as we are using the targetlist passed in by
create_scan_plan(). Do you have a counter example?

Maybe my explanation was not correct, but I'm saying that the targertlist of the above outer_plan should be set to the fdw_scan_tlist, to avoid misbehavior.  Here is such an example (add() in the example is a user defined function that simply adds two arguments, defined by: create function add(integer, integer) returns integer as '/path/to/func', 'add' language c strict):

postgres=# create foreign table foo (a int) server myserver options (table_name 'foo');
postgres=# create foreign table bar (a int) server myserver options (table_name 'bar');
postgres=# create table tab (a int, b int);
postgres=# insert into foo select a from generate_series(1, 1000) a;
postgres=# insert into bar select a from generate_series(1, 1000) a;
postgres=# insert into tab values (1, 1);
postgres=# analyze foo;
postgres=# analyze bar;
postgres=# analyze tab;

[Terminal 1]
postgres=# begin;
BEGIN
postgres=# update tab set b = b + 1 where a = 1;
UPDATE 1

[Terminal 2]
postgres=# explain verbose select tab.* from tab, foo, bar where foo.a = bar.a and add(foo.a, bar.a) > 0 limit 1 for update;

                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------
 Limit  (cost=100.00..107.70 rows=1 width=70)
   Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
   ->  LockRows  (cost=100.00..2663.48 rows=333 width=70)
         Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
         ->  Nested Loop  (cost=100.00..2660.15 rows=333 width=70)
               Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
               ->  Foreign Scan  (cost=100.00..2654.97 rows=333 width=56)
                     Output: foo.*, bar.*
                     Filter: (add(foo.a, bar.a) > 0)
                     Relations: (public.foo) INNER JOIN (public.bar)
                     Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, r3.a FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE ((r2.a = r3.a)) F
OR UPDATE OF r2 FOR UPDATE OF r3
                     ->  Hash Join  (cost=247.50..301.25 rows=333 width=56)
                           Output: foo.*, bar.*
                           Hash Cond: (foo.a = bar.a)
                           Join Filter: (add(foo.a, bar.a) > 0)
                           ->  Foreign Scan on public.foo (cost=100.00..135.00 rows=1000 width=32)
                                 Output: foo.*, foo.a
                                 Remote SQL: SELECT a FROM public.foo FOR UPDATE
                           ->  Hash  (cost=135.00..135.00 rows=1000 width=32)
                                 Output: bar.*, bar.a
                                 ->  Foreign Scan on public.bar (cost=100.00..135.00 rows=1000 width=32)
                                       Output: bar.*, bar.a
                                       Remote SQL: SELECT a FROM public.bar FOR UPDATE
               ->  Materialize  (cost=0.00..1.01 rows=1 width=14)
                     Output: tab.a, tab.b, tab.ctid
                     ->  Seq Scan on public.tab  (cost=0.00..1.01 rows=1 width=14)
                           Output: tab.a, tab.b, tab.ctid
(27 rows)

postgres=# select tab.* from tab, foo, bar where foo.a = bar.a and add(foo.a, bar.a) > 0 limit 1 for update;

[Terminal 1]
postgres=# commit;
COMMIT

[Terminal 2] (After the commit in Terminal 1, Terminal 2 will show the following.)
 a | b
---+---
(0 rows)

This is wrong.  (Note that since the SELECT FOR UPDATE doesn't impose any condition on a tuple from the local table tab, the EvalPlanQual recheck executed should succeed.)  The reason for that is that the targetlist of the local join plan is the same as for the ForeignScan, which outputs neither foo.a nor bar.a required as an argument of the function add().


I see what you are trying to say now. In ExecScan, ExecScanFetch will execute the outer plan for EvalPlanQual check and then at
 208         if (!qual || ExecQual(qual, econtext, false))
it will try to evaluate the local conditions, where it needs the foo.a and bar.a which are not part of the projected output for ForeignScan and the outer plan.

But then aren't the local conditions being evaluated twice, once by the outer plan and then again by ExecScan? Is this OK? What happens when the local conditions have side effects? We should probably delete them from the outer_plan's quals.

The patch attached fixes the targetlist as per mail from Robert and the quals as explained above.


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: proposal: schema PL session variables
Следующее
От: "Shulgin, Oleksandr"
Дата:
Сообщение: Re: More stable query plans via more predictable column statistics