Re: Foreign join pushdown vs EvalPlanQual

From Kouhei Kaigai
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F80110FA61@BPXM15GP.gisp.nec.co.jp
In reply to Re: Foreign join pushdown vs EvalPlanQual  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
> > Let me introduce a few cases we should pay attention.
> >
> > Foreign/CustomScan node may stack; that means a Foreign/CustomScan node
> > may have child node that includes another Foreign/CustomScan node with
> > scanrelid==0.
> > (At this moment, ForeignScan cannot have child node, however, more
> > aggressive push-down [1] will need same feature to fetch tuples from
> > local relation and construct VALUES() clause.)
> > In this case, the highest Foreign/CustomScan node (that is also nearest
> > to LockRows or ModifyTable) runs the alternative sub-plan that includes
> > scan/join plans dominated by fdw_relids or custom_relids.
> >
> > For example:
> >    LockRows
> >     -> HashJoin
> >       -> CustomScan (AliceJoin)
> >         -> SeqScan on t1
> >         -> CustomScan (CarolJoin)
> >           -> SeqScan on t2
> >           -> SeqScan on t3
> >       -> Hash
> >         -> CustomScan (BobJoin)
> >           -> SeqScan on t4
> >           -> ForeignScan (remote join involves ft5, ft6)
> >
> > In this case, AliceJoin will have alternative sub-plan to join t1, t2
> > and t3, then it shall be used on EvalPlanQual(). Also, BobJoin will
> > have alternative sub-plan to join t4, ft5 and ft6. CarolJoin and the
> > ForeignScan will also have alternative sub-plan, however, these are
> > not used in this case.
> > Probably, it works fine.
> 
> Yeah, I think so too.
>
Sorry, I need to adjust my explanation above a bit:

In this case, AliceJoin will have an alternative sub-plan to join t1 and
CarolJoin, and CarolJoin will in turn have an alternative sub-plan to join
t2 and t3. Also, BobJoin will have an alternative sub-plan to join t4 and
the ForeignScan with the remote join, and this ForeignScan node will have
an alternative sub-plan to join ft5 and ft6.

Why is this recursive design better? Because it makes the planner
enhancement much simpler than the all-at-once approach. Please see my
explanation in the section below.
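
The recursive arrangement described above can be modelled with a small
toy tree. This is only a sketch of the idea, not PostgreSQL code:
PlanNode, alt_subplan, pushed_down_join, epq_plan() and the generic
"BuiltinJoin" label are all hypothetical names invented for the
illustration, and the built-in join type chosen for each alternative
sub-plan is left unspecified on purpose.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Toy plan node for illustration; not a PostgreSQL structure."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)
    # True for a Foreign/CustomScan that replaces a join (scanrelid == 0).
    pushed_down_join: bool = False
    # Built-in plan that joins this node's direct inputs locally.
    alt_subplan: Optional["PlanNode"] = None


def epq_plan(node: PlanNode) -> PlanNode:
    """Return the tree this model would run during an EvalPlanQual recheck."""
    if node.pushed_down_join and node.alt_subplan is not None:
        # Switch to the local join, then keep descending: its inputs may be
        # pushed-down joins that carry their own alternatives.
        return epq_plan(node.alt_subplan)
    return PlanNode(node.name, [epq_plan(c) for c in node.children])


def names(node: PlanNode) -> List[str]:
    """Flatten a tree into a pre-order list of node names."""
    return [node.name] + [n for c in node.children for n in names(c)]


def scan(rel: str) -> PlanNode:
    return PlanNode(f"SeqScan on {rel}")


# The example tree from this thread; each pushed-down join only knows how
# to join its *direct* inputs with built-in logic.
carol = PlanNode("CustomScan (CarolJoin)", [scan("t2"), scan("t3")],
                 pushed_down_join=True,
                 alt_subplan=PlanNode("BuiltinJoin", [scan("t2"), scan("t3")]))
alice = PlanNode("CustomScan (AliceJoin)", [scan("t1"), carol],
                 pushed_down_join=True,
                 alt_subplan=PlanNode("BuiltinJoin", [scan("t1"), carol]))
remote = PlanNode("ForeignScan (remote join ft5, ft6)",
                  pushed_down_join=True,
                  alt_subplan=PlanNode("BuiltinJoin",
                                       [PlanNode("ForeignScan on ft5"),
                                        PlanNode("ForeignScan on ft6")]))
bob = PlanNode("CustomScan (BobJoin)", [scan("t4"), remote],
               pushed_down_join=True,
               alt_subplan=PlanNode("BuiltinJoin", [scan("t4"), remote]))
top = PlanNode("LockRows",
               [PlanNode("HashJoin", [alice, PlanNode("Hash", [bob])])])

epq_names = names(epq_plan(top))
print("\n".join(epq_names))
```

In this model no node has to know how to rebuild a whole sub-tree; each
level hands over to the level below, which is the point of the recursive
design.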

> > On the next step, how do we implement this design?
> > I guess that planner needs to keep a path that contains neither
> > foreign-join nor custom-join with scanrelid==0.
> > Probably, "cheapest_builtin_path" of RelOptInfo is needed that
> > never contains these remote/custom join logic, as a seed of
> > alternative sub-plan.
> 
> Yeah, I think so too, but I've not figured out how to implement this yet.
>
> To be honest, ISTM that it's difficult to do that simply and efficiently
> for the foreign/custom-join-pushdown API that we have for 9.5.  It's a
> little late, but what I started thinking is to redesign that API so that
> that API is called at standard_join_search, as discussed in [2]; (1) to
> place that API call *after* the set_cheapest call and (2) to place
> another set_cheapest call after that API call for each joinrel.  By the
> first set_cheapest call, I think we could probably save an alternative
> path that we need in "cheapest_builtin_path".  By the second
> set_cheapest call following that API call, we could consider
> foreign/custom-join-pushdown paths also.  What do you think about this idea?
>
Sorry, but I think the disadvantages outweigh the advantages.
The reason we put the foreign/custom-join hook in add_paths_to_joinrel()
is that, at the standard_join_search() level, the source relations
(inner/outer) are not obvious, so we cannot reproduce which relations
are the source of this join.
That is why I had to give up when I tried this approach before.


My idea is that we save the cheapest_total_path of the RelOptInfo into a
new cheapest_builtin_path field just before the GetForeignJoinPaths() hook
is called.
Why? Because of the hook location, that path must be built-in join logic
and can never be a foreign/custom join; only built-in paths have been
added at that point.
Even if either or both of the join sub-trees contain a foreign/custom
join, those paths have their own alternative sub-plans at their own level,
so the current level does not need to care about them. (That is why I
adjusted my explanation above.)
Once this built-in path is kept and a foreign/custom join gets chosen by
set_cheapest(), it is easy to attach this sub-plan to the ForeignScan or
CustomScan node.
I don't see any significant downside to this approach.
What do you think?
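
To make the ordering concrete, here is a minimal sketch of the idea as a
toy model. It is not planner code: Path, JoinRel, alt_path,
finish_joinrel() and the cost numbers are assumptions made up for the
illustration, and the model collapses the planner's repeated
add_paths_to_joinrel() calls for one joinrel into a single call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Illustrative labels for pushed-down joins (hypothetical).
PUSHDOWN_KINDS = {"ForeignJoin", "CustomJoin"}


@dataclass
class Path:
    kind: str                           # e.g. "HashJoin" or "ForeignJoin"
    total_cost: float
    alt_path: Optional["Path"] = None   # hypothetical slot for the fallback


@dataclass
class JoinRel:
    pathlist: List[Path] = field(default_factory=list)
    cheapest_total_path: Optional[Path] = None
    cheapest_builtin_path: Optional[Path] = None   # field proposed above


def cheapest(paths: List[Path]) -> Path:
    return min(paths, key=lambda p: p.total_cost)


def add_paths_to_joinrel(rel: JoinRel, builtin_paths: List[Path],
                         join_hook: Callable[[JoinRel], None]) -> None:
    # 1. Built-in join paths are added first.
    rel.pathlist.extend(builtin_paths)
    # 2. Remember the cheapest path *before* the provider hook runs, so it
    #    cannot be a foreign/custom join at this level.
    rel.cheapest_builtin_path = cheapest(rel.pathlist)
    # 3. Only now may the FDW / custom-scan provider add its own paths.
    join_hook(rel)


def finish_joinrel(rel: JoinRel) -> Path:
    # Stand-in for set_cheapest(): pick the overall winner.
    rel.cheapest_total_path = cheapest(rel.pathlist)
    chosen = rel.cheapest_total_path
    if chosen.kind in PUSHDOWN_KINDS:
        # Attach the saved built-in path as the seed of the alternative
        # sub-plan for the resulting ForeignScan/CustomScan.
        chosen.alt_path = rel.cheapest_builtin_path
    return chosen


def foreign_hook(rel: JoinRel) -> None:
    # Hypothetical provider that offers a cheap remote join.
    rel.pathlist.append(Path("ForeignJoin", 10.0))


rel = JoinRel()
add_paths_to_joinrel(rel,
                     [Path("HashJoin", 40.0), Path("MergeJoin", 55.0)],
                     foreign_hook)
chosen = finish_joinrel(rel)
print(chosen.kind, "->", chosen.alt_path.kind)   # ForeignJoin -> HashJoin
```

The only ordering that matters here is that the snapshot in step 2
happens before step 3; everything else is unchanged from the usual flow.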


Regarding the development timeline, I would prefer to put in some
workaround so that we do not hit the Assert() in ExecScanFetch().
We could add a warning to the documentation not to replace a built-in
join if either or both of the sub-trees are the target of UPDATE/DELETE
or FOR SHARE/UPDATE.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

