Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Дата
Msg-id CAFjFpRexnwj6s+3eevVE6Aq3AAkWwPX+Ymm93qkb6V5QGHM=9g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Shigeru HANADA <shigeru.hanada@gmail.com>)
Ответы Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Shigeru HANADA <shigeru.hanada@gmail.com>)
Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Shigeru HANADA <shigeru.hanada@gmail.com>)
Список pgsql-hackers
I reviewed the foreign_join_v13 patch. Here are my comments

Thanks for this work. It's good to see that the The foreign_join patch includes extensive tests for postgres_fdw. Thanks for the same.

Sanity
---------
The patch foreign_join didn't get applied cleanly with "git apply" but got applied using "patch". The patch has "trailing whitespace"s.

The patch compiles cleanly with pgsql-v9.5-custom-join.v11.patch.

make check in regress and postgres_fdw folders passes without any failures.

Tests
-------
1.The postgres_fdw test is re/setting enable_mergejoin at various places. The goal of these tests seems to be to test the sanity of foreign plans generated. So, it might be better to reset enable_mergejoin (and may be all of enable_hashjoin, enable_nestloop_join etc.) to false at the beginning of the testcase and set them again at the end. That way, we will also make sure that foreign plans are chosen irrespective of future planner changes.
2. In the patch, I see that the inheritance testcases have been deleted from postgres_fdw.sql, is that intentional? I do not see those being replaced anywhere else.
3. We need one test for each join type (or at least for INNER and LEFT OUTER) where there are unsafe to push conditions in ON clause along-with safe-to-push conditions. For INNER join, the join should get pushed down with the safe conditions and for OUTER join it shouldn't be. Same goes for WHERE clause, in which case the join will be pushed down but the unsafe-to-push conditions will be applied locally.
4. All the tests have ORDER BY, LIMIT in them, so the setref code is being exercised. But, something like aggregates would test the setref code better. So, we should add at-least one test like select avg(ft1.c1 + ft2.c2) from ft1 join ft2 on (ft1.c1 = ft2.c1).
5. It will be good to add some test which contain join between few foreign and few local tables to see whether we are able to push down the largest possible foreign join tree to the foreign server.

Code
-------
In classifyConditions(), the code is now appending RestrictInfo::clause rather than RestrictInfo itself. But the callers of classifyConditions() have not changed. Is this change intentional? The functions which consume the lists produced by this function handle expressions as well RestrictInfo, so you may not have noticed it. Because of this change, we might be missing some optimizations e.g. in function postgresGetForeignPlan()
 793         if (list_member_ptr(fpinfo->remote_conds, rinfo))
 794             remote_conds = lappend(remote_conds, rinfo->clause);
 795         else if (list_member_ptr(fpinfo->local_conds, rinfo))
 796             local_exprs = lappend(local_exprs, rinfo->clause);
 797         else if (is_foreign_expr(root, baserel, rinfo->clause))
 798             remote_conds = lappend(remote_conds, rinfo->clause);
 799         else
 800             local_exprs = lappend(local_exprs, rinfo->clause);
Finding a RestrictInfo in remote_conds avoids another call to is_foreign_expr(). So with this change, I think we are doing an extra call to is_foreign_expr().

The function get_jointype_name() returns an empty string for unsupported join types. Instead of that it should throw an error, if some code path accidentally calls the function with unsupported join type e.g. SEMI_JOIN.

While deparsing the SQL with rowmarks, the placement of FOR UPDATE/SHARE clause in the original query is not being honored, which means that we will end up locking the rows which are not part of the join result even when the join is pushed to the foreign server. E.g take the following query (it uses the tables created in postgres_fdw.sql tests)
contrib_regression=# explain verbose select * from ft1 join ft2 on (ft1.c1 = ft2.c1) for update of ft1;
                                                                                                                                                      
                                                                                                                                                      
                                             QUERY PLAN                                                                                               
                                                                                                                                                      
                                                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
 LockRows  (cost=100.00..124.66 rows=822 width=426)
   Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*, ft2.*
   ->  Foreign Scan  (cost=100.00..116.44 rows=822 width=426)
         Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft1.*,
 ft2.*
         Relations: (public.ft1) INNER JOIN (public.ft2)
         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, l.a5, l.a6, l.a7, l.a8, l.a9, r.a1, r.a2, r.a3, r.a4, r.a5, r.a6, r.a7, r.a8, r.a9 FROM (SELECT l.a
10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17) FROM (SELECT "C 1" a10, c2 a11, c3 a12
, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1" FOR UPDATE)
l) l (a1, a2, a3, a4, a5, a6, a7, a8, a9) INNER JOIN (SELECT r.a9, r.a10, r.a12,
r.a13, r.a14, r.a15, r.a16, r.a17, ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17) FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6
 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2, a3, a4, a5, a6, a7, a8, a9) ON ((l.a1 = r.a1))
(6 rows)
It's expected that only the rows which are part of join result will be locked by FOR UPDATE clause. The query sent to the foreign server has attached the FOR UPDATE clause to the sub-query for table ft1 ("S 1"."T 1" on foreign server). As per the postgresql documentation, "When a locking clause appears in a sub-SELECT, the rows locked are those returned to the outer query by the sub-query.". So it's going to lock all rows from "S 1"."T 1", rather than only the rows which are part of join. This is going to increase probability of deadlocks, if the join is between a big table and small table where big table is being used in many queries and the join is going to have only a single row in the result.

Since there is no is_first argument to appendConditions(), we should remove corresponding line from the function prologue.

The name TO_RELATIVE() doesn't convey the full meaning of the macro. May be GET_RELATIVE_ATTNO() or something like that.

In postgresGetForeignJoinPaths(), while separating the conditions into join quals and other quals,
3014     if (IS_OUTER_JOIN(jointype))
3015     {
3016         extract_actual_join_clauses(joinclauses, &joinclauses, &otherclauses);
3017     }
3018     else
3019     {
3020         joinclauses = extract_actual_clauses(joinclauses, false);
3021         otherclauses = NIL;
3022     }
we shouldn't differentiate between outer and inner join. For inner join the join quals can be treated as other clauses and they will be returned as other clauses, which is fine. Also, the following condition
3050     /*
3051      * Other condition for the join must be safe to push down.
3052      */
3053     foreach(lc, otherclauses)
3054     {
3055         Expr *expr = (Expr *) lfirst(lc);
3056
3057         if (!is_foreign_expr(root, joinrel, expr))
3058         {
3059             ereport(DEBUG3, (errmsg("filter contains unsafe conditions")));
3060             return;
3061         }
3062     }
is unnecessary. I there are filter conditions which are unsafe to push down, they can be applied locally after obtaining the join result from the foreign server. The join quals are all needed to be safe to push down, since they decide which rows will contain NULL inner side in an OUTER join.


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

On Fri, Apr 17, 2015 at 10:13 AM, Shigeru HANADA <shigeru.hanada@gmail.com> wrote:
Kaigai-san,

2015/04/17 10:13、Kouhei Kaigai <kaigai@ak.jp.nec.com> のメール:

> Hanada-san,
>
>> I merged explain patch into foreign_join patch.
>>
>> Now v12 is the latest patch.
>>
> It contains many garbage lines... Please ensure the
> patch is correctly based on tOhe latest master +
> custom_join patch.

Oops, sorry.  I’ve re-created the patch as v13, based on Custom/Foreign join v11 patch and latest master.

It contains EXPLAIN enhancement that new subitem “Relations” shows relations and joins, including order and type, processed by the foreign scan.

--
Shigeru HANADA
shigeru.hanada@gmail.com









--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Petr Jelinek
Дата:
Сообщение: Re: Replication identifiers, take 4
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: Improve sleep processing of pg_rewind TAP tests