Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Дата
Msg-id CAFjFpRfMFRiaCwXypW7eRL+H1W7peFEij=Fb5aDBwDeAx7cLsQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Ответы Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Список pgsql-hackers


On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
Hi Ashutosh,

On 2015/12/02 20:45, Ashutosh Bapat wrote:
It's been a long time since last patch on this thread was posted. I have
started
to work on supporting join pushdown for postgres_fdw.

Thanks for the work!

Generating paths
================

A join between two foreign relations is considered safe to push down if
1. The joining sides are pushable
2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and ANTI
    joins are not considered right now, because of difficulties in
constructing
    the queries involving those. The join clauses of SEMI/ANTI joins are
not in a
    form that can be readily converted to IN/EXISTS/NOT EXIST kind of
expression.
    We might consider this as future optimization.
3. Joining sides do not have clauses which can not be pushed down to the
foreign
    server. For an OUTER join this is important since those clauses need
to be
    applied before performing the join and thus join can not be pushed
to the
    foreign server. An example is
    SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2
ON (join clause)
    Here the local_cond on ft2 needs to be executed before performing
LEFT JOIN
    between ft1 and ft2.
    This condition can be relaxed for an INNER join by pulling the local
clauses
    up the join tree. But this needs more investigation and is not
considered in
    this version.
4. The join conditions (e.g. conditions in ON clause) are all safe to
push down.
    This is important for OUTER joins as pushing down join clauses
partially and
    applying rest locally changes the result. There are ways [1] by
which partial
    OUTER join can be completed by applying unpushable clauses locally
and then
    nullifying the nullable side and eliminating duplicate non-nullable side
    rows. But that's again out of scope of first version of postgres_fdw
join
    pushdown.

As for 4, as commented in the patch, we could relax the requirement that all the join conditions (given by JoinPathExtraData's restrictlist) need to be safe to push down to the remote server;
* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be safe, while I think all the "joinclauses" need to be safe to get the right results (where "joinclauses" and "otherclauses" are defined by extract_actual_join_clauses).  And I think we should do this relaxation to some extent for 9.6, to allow more joins to be pushed down. 

agreed. I will work on those.
 
I don't know about [1].  May I see more information about [1]?
 
Generating plan
===============

Rest of this section describes the logic to construct
the SQL
for join; the logic is implemented as function deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for given joinrel (and now for
baserel
asd well) recursively.
For joinrels
1. it constructs SQL representing either side of join, by calling itself
    in recursive fashion.
2. These SQLs are converted into subqueries and become part of the FROM
clause
    with appropriate JOIN type and clauses. The left and right
subqueries are
    given aliases "l" and "r" respectively. The columns in each subquery are
    aliased as "a1", "a2", "a3" and so on. Thus the third column on left
side can
    be referenced as "l.a3" at any recursion level.
3. Targetlist is added representing the columns in the join result
expected at
    that level.
4. The join clauses are added as part of ON clause
5. Any clauses that planner has deemed fit to be evaluated at that level
of join
    are added as part of WHERE clause.

Honestly, I'm not sure that that is a good idea.  One reason for that is that a query string constructed by the procedure is difficult to read especially when the procedure is applied recursively.  So, I'm thinking to revise the procedure so as to construct a query string with a flattened FROM clause, as discussed in eg, [2].


Just to confirm, the hook discussed in [2] is not in place right? I can find only one hook for foreign join
 50 typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
 51                                                           RelOptInfo *joinrel,
 52                                                         RelOptInfo *outerrel,
 53                                                         RelOptInfo *innerrel,
 54                                                           JoinType jointype,
 55                                                    JoinPathExtraData *extra);
This hook takes an inner and outer relation, so can not be used for N-way join as discussed in that thread.

Are you suggesting that we should add that hook before we implement join pushdown in postgres_fdw? Am I missing something?
 
TODOs
=====
This patch is very much WIP patch to show case the approach and invite early
comments. I will continue to improve the patch and some of the areas
that will
be improved are
1. Costing of foreign join paths.
2. Various TODOs in the patch, making it more readable, finishing etc.
3. Tests
4. Any comments/suggestions on approach or the attached patch.

That would be great!

In another thread Robert, Fujita-san and Kaigai-san are discussing about
EvalPlanQual support for foreign joins. Corresponding changes to
postgres_fdw
will need to be added once those changes get committed.

Yeah, we would need those changes including helper functions to create a local join execution plan for that support.  I'd like to add those changes to your updated patch if it's okay.


Right now, we do not have any support for postgres_fdw join pushdown. I was thinking of adding at least minimal support for the same using this patch, may be by preventing join pushdown in case there are row marks for now. That way, we at least have some way to play with postgres_fdw join pushdown. Once we have that, we can work on remaining items listed for 9.6 and also you can add suport for row marks with fix for EvalPlanQual independently. This will keep the first patch smaller. Do you agree or you want to see EvalPlanQual fix to be in the first patch itself?
 
Best regards,
Etsuro Fujita

[2] http://www.postgresql.org/message-id/CA+TgmoZH9PB8BC+Z3rE7wo8CwuxAF7VP3066iSG39QfR1jJ+UQ@mail.gmail.com





--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re:
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: Some bugs in psql_complete of psql