Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From Kouhei Kaigai
Subject Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F8010C6E76@BPXM15GP.gisp.nec.co.jp
In reply to Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Shigeru HANADA <shigeru.hanada@gmail.com>)
List pgsql-hackers
> On 2015/03/25 12:59, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> 
> >>> At this moment, I'm not 100% certain about its logic. In particular, I
> >>> haven't tested the SEMI- and ANTI-join cases yet.
> >>> However, time is money: I want people to check the overall design first,
> >>> rather than debug details. Please tell me if I misunderstood the logic
> >>> for breaking down join relations.
> >>
> >> With your patch applied, the “updatable view” regression tests failed.
> >> regression.diff contains some errors like this:
> >> ! ERROR:  could not find RelOptInfo for given relids
> >>
> >> Could you check that?
> >>
> > It is a bug in the logic that finds the two RelOptInfos from which the
> > joinrel's RelOptInfo can be constructed.
> > I'm working on correcting the logic now, but it is not obvious how to
> > identify two relids that together satisfy joinrel->relids.
> > (Yep, law of entropy enhancement…)
> 
> IIUC, this problem occurs only with non-INNER JOINs, because relations joined
> only with INNER JOIN can be treated in arbitrary order.  But supporting OUTER
> JOINs would be necessary even for the first cut.
> 
Yep. When the joinrel consists only of inner-joined relations managed by the
same FDW driver, the job of get_joinrel_broken_down() is quite simple.
However, people want outer joins to be supported as well, don't they?

> > On the other hand, we may have a solution that does not need a complicated
> > reconstruction process. The original concern was that an FDW driver may add
> > paths that replace an entire join subtree with a foreign scan on a remote
> > join multiple times, repeatedly, even though these paths are identical.
> >
> > If we put a hook for FDW/CSP at the bottom of build_join_rel(), we may be
> > able to solve the problem in a simpler and more straightforward way.
> > Because build_join_rel() looks up the cache in root->join_rel_hash and
> > returns immediately if the required joinrelids already have a RelOptInfo,
> > the bottom of this function is never reached twice for a particular set of
> > joinrelids.
> > Once FDW/CSP constructs a path that replaces the entire join subtree for
> > the joinrel just after its construction, that path is kept until cheaper
> > built-in paths (if any) are added.
> >
> > This idea has another positive side effect. We expect a remote join to be
> > cheaper than a local join of two remote scans in most cases. Once a much
> > cheaper path has been added prior to local join consideration,
> > add_path_precheck() stops path consideration earlier.
> >
> > Please comment.
> 
> Or the bottom of make_join_rel().  IMO build_join_rel() is responsible only
> for building (or looking up in a list) a RelOptInfo for the given relids.
> After that, make_join_rel() calls add_paths_to_joinrel() with appropriate
> arguments per join type to generate the actual Paths that implement the join.
> make_join_rel() is called only once for a particular relid combination, and
> SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON and WHERE)
> are available there, so it seems promising for FDW cases.
> 
As long as the caller can know whether build_join_rel() actually constructed a
new RelOptInfo object or not, I think it makes more sense than putting a hook
within make_join_rel().

> Though I’m not sure that it also fits custom join provider’s requirements.
>
There are two scenarios for a join replaced by CSP. The first one just
implements alternative logic for a built-in join and takes the underlying
inner/outer nodes, so its hook is located in add_paths_to_joinrel(), like the
built-in join logic.
The second one tries to replace an entire join subtree with a materialized view
(for example), like the FDW remote join case. So it has to be hooked near the
location of GetForeignJoinPaths().
In the second scenario, CSP does not have a private field in RelOptInfo, so it
may not be obvious how to check whether the given joinrel exactly matches a
particular materialized view or other cache.

At this moment, what I'm interested in is the first scenario, so the second
case is not a high priority, for me at least.

Thanks.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
