Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Дата
Msg-id CAFjFpRe2EvDUpgfBFejo6DyDCSi7dryGQ7hE8EistcO7jwwvMw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Shigeru Hanada <shigeru.hanada@gmail.com>)
Список pgsql-hackers


On Tue, Mar 17, 2015 at 10:28 AM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:
2015-03-14 7:18 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> I think the foreign data wrapper join pushdown case, which also aims
> to substitute a scan for a join, is interesting to think about, even
> though it's likely to be handled by a new FDW method instead of via
> the hook.  Where should the FDW method get called from?  Currently,
> the FDW method in KaiGai's patch is GetForeignJoinPaths, and that gets
> called from add_paths_to_joinrel().  The patch at
> http://www.postgresql.org/message-id/CAEZqfEfy7p=uRpwN-Q-NNgzb8kwHbfqF82YSb9ztFZG7zN64Xw@mail.gmail.com
> uses that to implement join pushdown in postgres_fdw; if you have A
> JOIN B JOIN C all on server X, we'll notice that the join with A and B
> can be turned into a foreign scan on A JOIN B, and similarly for A-C
> and B-C.  Then, if it turns out that the cheapest path for A-B is the
> foreign join, and the cheapest path for C is a foreign scan, we'll
> arrive at the idea of a foreign scan on A-B-C, and we'll realize the
> same thing in each of the other combinations as well.  So, eventually
> the foreign join gets pushed down.

From the viewpoint of postgres_fdw, incremental approach seemed
natural way, although postgres_fdw should consider paths in pathlist
in additon to cheapest one as you mentioned in another thread.  This
approarch allows FDW to use SQL statement generated for underlying
scans as parts of FROM clause, as postgres_fdw does in the join
push-down patch.

> But there's another possible approach: suppose that
> join_search_one_level, after considering left-sided and right-sided
> joins and after considering bushy joins, checks whether every relation
> it's got is from the same foreign server, and if so, asks that foreign
> server whether it would like to contribute any paths. Would that be
> better or worse?  A disadvantage is that if you've got something like
> A LEFT JOIN B LEFT JOIN C LEFT JOIN D LEFT JOIN E LEFT JOIN F LEFT
> JOIN G LEFT JOIN H LEFT JOIN I but none of the joins can be pushed
> down (say, each join clause calls a non-pushdown-safe function) you'll
> end up examining a pile of joinrels - at every level of the join tree
> - and individually rejecting each one.  With the
> build-it-up-incrementally approach, you'll figure that all out at
> level 2, and then after that there's nothing to do but give up
> quickly.  On the other hand, I'm afraid the incremental approach might
> miss a trick: consider small LEFT JOIN (big INNER JOIN huge ON big.x =
> huge.x) ON small.y = big.y AND small.z = huge.z, where all three are
> foreign tables on the same server.  If the output of the big/huge join
> is big, none of those paths are going to survive at level 2, but the
> overall join size might be very small, so we surely want a chance to
> recover at level 3.  (We discussed test cases of this form quite a bit
> in the context of e2fa76d80ba571d4de8992de6386536867250474.)

Interesting, I overlooked that pattern.  As you pointed out, join
between big foregin tables might be dominated, perhaps by a MergeJoin
path.  Leaving dominated ForeignPath in pathlist for more optimization
in the future (in higher join level) is an idea, but it would make
planning time longer (and use more cycle and memory).

Tom's idea sounds good for saving the path b), but I worry that
whether FDW can get enough information at that timing, just before
set_cheapest.  It would not be good I/F if each FDW needs to copy many
code form joinrel.c...


Even I have the same concern. A simple joinrel doesn't contain much information about the individual two way joins involved in it, so FDW may not be able to construct a query (or execution plan) and hence judge whether a join is pushable or not, just by looking at the joinrel. There will be a lot of code duplication to reconstruct that information, within the FDW code.
 
--
Shigeru HANADA


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Следующее
От: Jackson Isaac
Дата:
Сообщение: GSoC 2015 Idea Discussion