Обсуждение: WIP: Join push-down for foreign tables

Поиск
Список
Период
Сортировка

WIP: Join push-down for foreign tables

От
Shigeru Hanada
Дата:
Hi all,

I'd like to propose $SUBJECT for further foreign query optimization.
I've not finished development, but I'd appreciate it if I got someone's
review on my WIP code and its design.

Changes I made
==============

(1) Add foreign server OID to RelOptInfo
I think it would be nice to know whether a join comes from one foreign
server or not without digging into child nodes during considering paths
for a query.  So I added serverid field to RelOptInfo, which defaults to
InvalidOid ,and is set to OID of the server if the node and all of its
children are from same foreign server.  This also avoids looking catalog
up for foreign table entry to determine FDW routine.

(2) Add new planner node, ForeignJoinPath
ForeignJoinPath derives JoinPath, like other Join nodes, and holds
FdwPlan like ForeignPath node.

This node is used to represent a pushed-down join between foreign tables
and/or another foreign join in early planning phase, for all of
combination such as table-table, table-join, join-table and join-join
will be considered.  In addition, though they might generate same
internal (FDW-specific) plan, reversed combination is considered;
planner generates two ForeignJoinPath for both (A & B) and (B & A).

During creation of this node, planner calls new FDW handler function
PlanForeignJoin to get a FdwPlan which includes costs and internal plan
of a foreign join.  If a FDW can't (or doesn't want to) handle this
join, just return NULL is OK, and then planner gives such optimization
up and considers other usual join methods such as nested loop and hash join.

A subtree which has a ForeignJoin on its top is translated into a
ForeignScan node during constructing a plan tree.  This behavior is
different from other join path nodes such as NestPath and MergePath,
because they have child plan nodes correspond to path nodes.

(3) Add EXPALIN support for foreign join (currently just for debug)
ForeignScan might not be a simple foreign table scan, so
ExplainScanTarget() can't be used for it.  An idea I have is adding
ExplainForeignScanTarget() to handle ForeignScan separately from other
scan nodes.

(4) Add new GUC parameter, enable_foreignjoin
If this was off, planner never generates ForeignJoinPath.  In such case,
foreign tables will be joined with one of NestLoop, MergeJoin and HashJoin.

Known issue
===========

I'm sorry but attached patch, join_pushdown_v1.patch, is WIP, so
currently some kind of query fails.  Known failure patterns are:

*) SELECT * FROM A JOIN B (...) doesn't work.  Specifying columns in
SELECT clause explicitly like "SELECT A.col1, A.col2, ..." seems to work.
*) ORDER BY causes error if no column is specified in SELECT clause from
sort key's table.

Probably more problems still are there...

PG-wrapper as sample implementation
===================================

pgsql_fdw-0.1.0.tar.gz is an WIP implementation of PG-wrapper, which can
(hopefully) handle both simple foreign table scan and multiple foreign
joins.  You can build it with placing in contrib/, or using pgxs.  Note
that it has some issues such as memory leak of PGresult.  I'm planning
to propose this wrapper as a contrib module, but it would be after
clearing such issues.

Regards,
--
Shigeru Hanada

Вложения

Re: WIP: Join push-down for foreign tables

От
Michael Paquier
Дата:
Hi,<br /><br />I am interested in the development you are doing regarding join push down and fdw stuff for remote
postgreSQLservers.<br />Is there a way to get the postgres fdw you are providing here for common 9.1?<br />I saw that
thetar you are providing here is adapted only for your patch.<br /><br />Regards,<br /><br />Michael<br /><br /><div
class="gmail_quote">2011/9/14Shigeru Hanada <span dir="ltr"><<a
href="mailto:shigeru.hanada@gmail.com">shigeru.hanada@gmail.com</a>></span><br/><blockquote class="gmail_quote"
style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> Hi all,<br /><br /> I'd like to propose $SUBJECT
forfurther foreign query optimization.<br /> I've not finished development, but I'd appreciate it if I got someone's<br
/>review on my WIP code and its design.<br /><br /> Changes I made<br /> ==============<br /><br /> (1) Add foreign
serverOID to RelOptInfo<br /> I think it would be nice to know whether a join comes from one foreign<br /> server or
notwithout digging into child nodes during considering paths<br /> for a query.  So I added serverid field to
RelOptInfo,which defaults to<br /> InvalidOid ,and is set to OID of the server if the node and all of its<br />
childrenare from same foreign server.  This also avoids looking catalog<br /> up for foreign table entry to determine
FDWroutine.<br /><br /> (2) Add new planner node, ForeignJoinPath<br /> ForeignJoinPath derives JoinPath, like other
Joinnodes, and holds<br /> FdwPlan like ForeignPath node.<br /><br /> This node is used to represent a pushed-down join
betweenforeign tables<br /> and/or another foreign join in early planning phase, for all of<br /> combination such as
table-table,table-join, join-table and join-join<br /> will be considered.  In addition, though they might generate
same<br/> internal (FDW-specific) plan, reversed combination is considered;<br /> planner generates two ForeignJoinPath
forboth (A & B) and (B & A).<br /><br /> During creation of this node, planner calls new FDW handler
function<br/> PlanForeignJoin to get a FdwPlan which includes costs and internal plan<br /> of a foreign join.  If a
FDWcan't (or doesn't want to) handle this<br /> join, just return NULL is OK, and then planner gives such
optimization<br/> up and considers other usual join methods such as nested loop and hash join.<br /><br /> A subtree
whichhas a ForeignJoin on its top is translated into a<br /> ForeignScan node during constructing a plan tree.  This
behavioris<br /> different from other join path nodes such as NestPath and MergePath,<br /> because they have child
plannodes correspond to path nodes.<br /><br /> (3) Add EXPALIN support for foreign join (currently just for debug)<br
/>ForeignScan might not be a simple foreign table scan, so<br /> ExplainScanTarget() can't be used for it.  An idea I
haveis adding<br /> ExplainForeignScanTarget() to handle ForeignScan separately from other<br /> scan nodes.<br /><br
/>(4) Add new GUC parameter, enable_foreignjoin<br /> If this was off, planner never generates ForeignJoinPath.  In
suchcase,<br /> foreign tables will be joined with one of NestLoop, MergeJoin and HashJoin.<br /><br /> Known issue<br
/>===========<br /><br /> I'm sorry but attached patch, join_pushdown_v1.patch, is WIP, so<br /> currently some kind of
queryfails.  Known failure patterns are:<br /><br /> *) SELECT * FROM A JOIN B (...) doesn't work.  Specifying columns
in<br/> SELECT clause explicitly like "SELECT A.col1, A.col2, ..." seems to work.<br /> *) ORDER BY causes error if no
columnis specified in SELECT clause from<br /> sort key's table.<br /><br /> Probably more problems still are
there...<br/><br /> PG-wrapper as sample implementation<br /> ===================================<br /><br />
pgsql_fdw-0.1.0.tar.gzis an WIP implementation of PG-wrapper, which can<br /> (hopefully) handle both simple foreign
tablescan and multiple foreign<br /> joins.  You can build it with placing in contrib/, or using pgxs.  Note<br /> that
ithas some issues such as memory leak of PGresult.  I'm planning<br /> to propose this wrapper as a contrib module, but
itwould be after<br /> clearing such issues.<br /><br /> Regards,<br /><font color="#888888">--<br /> Shigeru
Hanada</font><br/><br /></blockquote></div><br />-- <br />Michael Paquier<br /><a href="http://michael.otacoo.com"
target="_blank">http://michael.otacoo.com</a><br/> 

Re: WIP: Join push-down for foreign tables

От
Shigeru Hanada
Дата:
Hi Michael,

(2011/09/21 12:52), Michael Paquier wrote:
> I am interested in the development you are doing regarding join push down
> and fdw stuff for remote postgreSQL servers.
> Is there a way to get the postgres fdw you are providing here for common
> 9.1?
> I saw that the tar you are providing here is adapted only for your patch.

As you say, the pgsql_fdw I posted requires my join-push-down patch.
But, at least in current revision, using PG_VERSION_NUM would make it
compile-able for both 9.1 and 9.2.  But I'm not sure that changes
required for 9.2 development cycle are enough small for this workaround.

Anyway, I'm going to publish recent pgsql_fdw for 9.1 on PGXN or
elsewhere, though it needs some (hopefully little) time.

Regards,
-- 
Shigeru Hanada


Re: WIP: Join push-down for foreign tables

От
Michael Paquier
Дата:
2011/9/21 Shigeru Hanada <shigeru.hanada@gmail.com>
Hi Michael,

(2011/09/21 12:52), Michael Paquier wrote:
> I am interested in the development you are doing regarding join push down
> and fdw stuff for remote postgreSQL servers.
> Is there a way to get the postgres fdw you are providing here for common
> 9.1?
> I saw that the tar you are providing here is adapted only for your patch.

As you say, the pgsql_fdw I posted requires my join-push-down patch.
But, at least in current revision, using PG_VERSION_NUM would make it
compile-able for both 9.1 and 9.2.  But I'm not sure that changes
required for 9.2 development cycle are enough small for this workaround.
OK, don't worry. I may be able to change that if necessary.
Anyway, I'm going to publish recent pgsql_fdw for 9.1 on PGXN or
elsewhere, though it needs some (hopefully little) time.
Thanks, that will be helpful, I think this work is very valuable for postgresql and keep an eye on it. Btw, please don't feel that you have to rush in doing that. I am not in a hurry at all so of course do it at your rhythm.

Regards,
--
Michael Paquier
http://michael.otacoo.com

Re: WIP: Join push-down for foreign tables

От
Kohei KaiGai
Дата:
Hanada-san,

I applied your patch and run a few test cases. while this test, I
noticed a few points.

At first, I tried to use file_fdw, however, it was crashed of course.
It seems to me this logic should be modified to confirm whether the target FDW
support join push down, or not.

+       if (enable_foreignjoin &&
+           joinrel->serverid != InvalidOid &&
+           (IsA(outerpath, ForeignPath) || IsA(outerpath, ForeignJoinPath)) &&
+           (IsA(inner_cheapest_total, ForeignPath) ||
+            IsA(inner_cheapest_total, ForeignJoinPath)))
+
+       {
+           ForeignJoinPath    *path;
+           path = create_foreignjoin_path(root,
+                                          joinrel,
+                                          jointype,
+                                          sjinfo,
+                                          outerpath,
+                                          inner_cheapest_total,
+                                          restrictlist,
+                                          merge_pathkeys);
+           if (path != NULL)
+               add_path(joinrel, (Path *) path);
+       }
+

In my opinion, FdwRoutine should have an additional API to inform the core its
supported features; such as inner-join, outer-join, order-by,
group-by, aggregate
functions, insert, update, delete, etc... in the future version.

Obviously, it is not hard to implement inner/outer-join feature for
pgsql_fdw module,
but it may be a tough work for memcached_fdw module.

> *) SELECT * FROM A JOIN B (...) doesn't work.  Specifying columns in
> SELECT clause explicitly like "SELECT A.col1, A.col2, ..." seems to work.
> *) ORDER BY causes error if no column is specified in SELECT clause from
> sort key's table.
>
I doubt these issues are in pgsql_fdw side, not the proposed patch itself.

In the case when the table and column names/types are compatible between
local-side and remote-side, the problem was not reproduced in my environment.
I'd like to suggest you to add a functionality to map remote column names to
the local ones in pgsql_fdw.

See below:

* I set up three foreign tables on the local side.
CREATE FOREIGN TABLE ft_1 (a int, b text) SERVER local_db;
CREATE FOREIGN TABLE ft_2 (x int, y text) SERVER local_db;
CREATE FOREIGN TABLE ft_3 (s int, t text) SERVER local_db;

* I also set up related tables on the remote side.
CREATE TABLE ft_1 (a int, b text);
CREATE TABLE ft_2 (x int, y text);
CREATE TABLE ft_3 (ss int, tt text);
Please note that column name of ft_3 is not compatible

* JOIN ft_1 and ft_2 works collectly.
postgres=# SELECT * FROM ft_1 JOIN ft_2 ON a = x;a |  b  | x |  y
---+-----+---+-----2 | bbb | 2 | bbb3 | ccc | 3 | ccc4 | ddd | 4 | ddd
(3 rows)

postgres=# EXPLAIN SELECT * FROM ft_1 JOIN ft_2 ON a = x;                                                    QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------Foreign
Scanon multiple foreign tables  (cost=0.00..0.00 rows=5000 width=72)  Remote SQL: SELECT ft_1.a, ft_1.b, ft_2.x, ft_2.y
FROMpublic.ft_1
 
ft_1, public.ft_2 ft_2 WHERE (ft_1.a = ft_2.x)
(2 rows)

* JOIN ft_1 and ft_3 does not works. Error message says ft_3.s does
not exist. Probably, it means ft_3.s does not exist "on the remote
host".

postgres=# SELECT * FROM ft_1 JOIN ft_3 ON a = s;
ERROR:  could not execute foreign query
DETAIL:  ERROR:  column ft_3.s does not exist
LINE 1: SELECT ft_1.a, ft_1.b, ft_3.s, ft_3.t FROM public.ft_1 ft_1,...                              ^

HINT:  SELECT ft_1.a, ft_1.b, ft_3.s, ft_3.t FROM public.ft_1 ft_1,
public.ft_3 ft_3 WHERE (ft_1.a = ft_3.s)
postgres=# EXPLAIN SELECT * FROM ft_1 JOIN ft_3 ON a = s;                                                    QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------Foreign
Scanon multiple foreign tables  (cost=0.00..0.00 rows=5000 width=72)  Remote SQL: SELECT ft_1.a, ft_1.b, ft_3.s, ft_3.t
FROMpublic.ft_1
 
ft_1, public.ft_3 ft_3 WHERE (ft_1.a = ft_3.s)
(2 rows)

In fact, EXPLAIN shows us the remote SQL tries to reference ft_3.s,
instead of ft_3.ss.

Thanks,

2011年9月14日10:24 Shigeru Hanada <shigeru.hanada@gmail.com>:
> Hi all,
>
> I'd like to propose $SUBJECT for further foreign query optimization.
> I've not finished development, but I'd appreciate it if I got someone's
> review on my WIP code and its design.
>
> Changes I made
> ==============
>
> (1) Add foreign server OID to RelOptInfo
> I think it would be nice to know whether a join comes from one foreign
> server or not without digging into child nodes during considering paths
> for a query.  So I added serverid field to RelOptInfo, which defaults to
> InvalidOid ,and is set to OID of the server if the node and all of its
> children are from same foreign server.  This also avoids looking catalog
> up for foreign table entry to determine FDW routine.
>
> (2) Add new planner node, ForeignJoinPath
> ForeignJoinPath derives JoinPath, like other Join nodes, and holds
> FdwPlan like ForeignPath node.
>
> This node is used to represent a pushed-down join between foreign tables
> and/or another foreign join in early planning phase, for all of
> combination such as table-table, table-join, join-table and join-join
> will be considered.  In addition, though they might generate same
> internal (FDW-specific) plan, reversed combination is considered;
> planner generates two ForeignJoinPath for both (A & B) and (B & A).
>
> During creation of this node, planner calls new FDW handler function
> PlanForeignJoin to get a FdwPlan which includes costs and internal plan
> of a foreign join.  If a FDW can't (or doesn't want to) handle this
> join, just return NULL is OK, and then planner gives such optimization
> up and considers other usual join methods such as nested loop and hash join.
>
> A subtree which has a ForeignJoin on its top is translated into a
> ForeignScan node during constructing a plan tree.  This behavior is
> different from other join path nodes such as NestPath and MergePath,
> because they have child plan nodes correspond to path nodes.
>
> (3) Add EXPALIN support for foreign join (currently just for debug)
> ForeignScan might not be a simple foreign table scan, so
> ExplainScanTarget() can't be used for it.  An idea I have is adding
> ExplainForeignScanTarget() to handle ForeignScan separately from other
> scan nodes.
>
> (4) Add new GUC parameter, enable_foreignjoin
> If this was off, planner never generates ForeignJoinPath.  In such case,
> foreign tables will be joined with one of NestLoop, MergeJoin and HashJoin.
>
> Known issue
> ===========
>
> I'm sorry but attached patch, join_pushdown_v1.patch, is WIP, so
> currently some kind of query fails.  Known failure patterns are:
>
> *) SELECT * FROM A JOIN B (...) doesn't work.  Specifying columns in
> SELECT clause explicitly like "SELECT A.col1, A.col2, ..." seems to work.
> *) ORDER BY causes error if no column is specified in SELECT clause from
> sort key's table.
>
> Probably more problems still are there...
>
> PG-wrapper as sample implementation
> ===================================
>
> pgsql_fdw-0.1.0.tar.gz is an WIP implementation of PG-wrapper, which can
> (hopefully) handle both simple foreign table scan and multiple foreign
> joins.  You can build it with placing in contrib/, or using pgxs.  Note
> that it has some issues such as memory leak of PGresult.  I'm planning
> to propose this wrapper as a contrib module, but it would be after
> clearing such issues.
>
> Regards,
> --
> Shigeru Hanada
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>



-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>


Re: WIP: Join push-down for foreign tables

От
Shigeru Hanada
Дата:
Kaigai-san,

Thanks for the review.

(2011/10/03 17:07), Kohei KaiGai wrote:
> At first, I tried to use file_fdw, however, it was crashed of course.
> It seems to me this logic should be modified to confirm whether the target FDW
> support join push down, or not.
> 
> +       if (enable_foreignjoin&&
> +           joinrel->serverid != InvalidOid&&
> +           (IsA(outerpath, ForeignPath) || IsA(outerpath, ForeignJoinPath))&&
> +           (IsA(inner_cheapest_total, ForeignPath) ||
> +            IsA(inner_cheapest_total, ForeignJoinPath)))
> +
> +       {
> +           ForeignJoinPath    *path;
> +           path = create_foreignjoin_path(root,
> +                                          joinrel,
> +                                          jointype,
> +                                          sjinfo,
> +                                          outerpath,
> +                                          inner_cheapest_total,
> +                                          restrictlist,
> +                                          merge_pathkeys);
> +           if (path != NULL)
> +               add_path(joinrel, (Path *) path);
> +       }
> +
> 
> In my opinion, FdwRoutine should have an additional API to inform the core its
> supported features; such as inner-join, outer-join, order-by,
> group-by, aggregate
> functions, insert, update, delete, etc... in the future version.

Sure, so in my design PlanForeignJoin is optional.

The lack of capability is informed from FDW with setting function
pointer in FdwRoutine to NULL.  If PlanForeignJoin was NULL, core
(planner) will give up to consider join push-down, and use one of local
join methods such as NestLoop and MergeJoin for those foreign tables.
As you say, other push-down-able features would also have optional
handler function for each.

BTW, what is the point of separating inner-join and outer-join in
context of push-down?  I think that providing the type of the join to
FDW as parameter of PlanForeignJoin is enough.  Then they can tell core
planner to give up considering join push-down by returning NULL if the
type of the join was not supported.

> Obviously, it is not hard to implement inner/outer-join feature for
> pgsql_fdw module,
> but it may be a tough work for memcached_fdw module.

Agreed, join push-down might be useful for only RDBMS wrappers. NoSQL
wrapper would not provide handler functions other than required ones for
simple scan.

>> *) SELECT * FROM A JOIN B (...) doesn't work.  Specifying columns in
>> SELECT clause explicitly like "SELECT A.col1, A.col2, ..." seems to work.
>> *) ORDER BY causes error if no column is specified in SELECT clause from
>> sort key's table.
>>
> I doubt these issues are in pgsql_fdw side, not the proposed patch itself.

Yes, this problem is from pgsql_fdw's SQL generator. I'll fix it.

> In the case when the table and column names/types are compatible between
> local-side and remote-side, the problem was not reproduced in my environment.
> I'd like to suggest you to add a functionality to map remote column names to
> the local ones in pgsql_fdw.

Since per-column FDW options patch has been committed in last CF, It's
not hard to implement colname FDW option for pgsql_fdw, and rough
implementation has been already done.  In next post you will be able to
map column names. :)

Regards,
-- 
Shigeru Hanada


Re: WIP: Join push-down for foreign tables

От
Kohei KaiGai
Дата:
2011年10月4日12:08 Shigeru Hanada <shigeru.hanada@gmail.com>:
>> In my opinion, FdwRoutine should have an additional API to inform the core its
>> supported features; such as inner-join, outer-join, order-by,
>> group-by, aggregate
>> functions, insert, update, delete, etc... in the future version.
>
> Sure, so in my design PlanForeignJoin is optional.
>
> The lack of capability is informed from FDW with setting function
> pointer in FdwRoutine to NULL.  If PlanForeignJoin was NULL, core
> (planner) will give up to consider join push-down, and use one of local
> join methods such as NestLoop and MergeJoin for those foreign tables.
> As you say, other push-down-able features would also have optional
> handler function for each.
>
Sorry, I overlooked it was already implemented at create_foreignjoin_path().

I additionally tried several cases using pgsql_fdw.
In some cases, it seems to me the planner don't push down the join tree
as you probably expected.
Please see the following example:

I defined three foreign tables: ft1(a int, b text), ft2(x int, y
text), ft3(s int, t text),
and lt1, lt2, lt3 are regular local tables.

postgres=# explain SELECT * FROM (ft1 join ft2 on a = x join ft3 on a = s);
 QUERY PLAN
---------------------------------------------------------------------------------------------------------------Foreign
Scanon multiple foreign tables  (cost=0.00..0.00 rows=25000 width=108)  Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y,
ft3.s,ft3.t FROM
 
public.ft1 ft1, public.ft2 ft2, public.ft3 ft3 WHERE (ft1.a = ft3.s)
AND (ft1.a = ft2.x)
(2 rows)

It works good.
(P.S. I noticed that pgsql_fdw has incorrect Assert(). Please fix
pgsql_fdw.c:730)

However, an existence of local relation makes planner confused.
It seems to me you expect "ft1 join ft2 on a = x"

postgres=# explain SELECT * FROM (ft1 join ft2 on a = x join lt3 on a = s);
       QUERY PLAN
 

----------------------------------------------------------------------------------------------------------------------Merge
Join (cost=205.08..758.83 rows=30750 width=108)  Merge Cond: (ft1.a = lt3.s)  ->  Merge Join  (cost=119.66..199.66
rows=5000width=72)        Merge Cond: (ft1.a = ft2.x)        ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
  Sort Key: ft1.a              ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
RemoteSQL: DECLARE pgsql_fdw_cursor_2 SCROLL
 
CURSOR FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1        ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
SortKey: ft2.x              ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)                    Remote
SQL:DECLARE pgsql_fdw_cursor_3 SCROLL
 
CURSOR FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2  ->  Sort  (cost=85.43..88.50 rows=1230 width=36)        Sort Key:
lt3.s       ->  Seq Scan on lt3  (cost=0.00..22.30 rows=1230 width=36)
 
(15 rows)

What is the reason why the foreign join is not pushed down?
Maybe, injected Sort plan prevent the planner to consider both side of
relations being foreign scan owned by same server? I'm still
investigating the reason.

I hope comments from committers. :-(

A collateral evidence is below.
If we try to sort the result by a key being not used to join, the both
of foreign scan gets pushed down.

postgres=# explain SELECT * FROM (ft1 join ft2 on a = x) order by y;
QUERYPLAN
 
-----------------------------------------------------------------------------------------------------------------Sort
(cost=307.19..319.69rows=5000 width=72)  Sort Key: ft2.y  ->  Foreign Scan on multiple foreign tables
(cost=0.00..0.00
rows=5000 width=72)        Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y FROM public.ft1
ft1, public.ft2 ft2 WHERE (ft1.a = ft2.x)
(4 rows)

However, when I tried to sort by a key being used to join, the both of
foreign scan was not pushed down.

postgres=# explain SELECT * FROM (ft1 join ft2 on a = x) order by a;
QUERYPLAN
 
----------------------------------------------------------------------------------------------------------------Merge
Join (cost=119.66..199.66 rows=5000 width=72)  Merge Cond: (ft1.a = ft2.x)  ->  Sort  (cost=59.83..62.33 rows=1000
width=36)       Sort Key: ft1.a        ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
RemoteSQL: DECLARE pgsql_fdw_cursor_6 SCROLL CURSOR
 
FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1  ->  Sort  (cost=59.83..62.33 rows=1000 width=36)        Sort Key: ft2.x
   ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)              Remote SQL: DECLARE pgsql_fdw_cursor_7
SCROLLCURSOR
 
FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2
(10 rows)

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>


Re: WIP: Join push-down for foreign tables

От
Kohei KaiGai
Дата:
Hanada-san,

The proposed patch put an invocation of PlanForeignJoin on the
create_foreignjoin_path() being also called by match_unsorted_outer().
Is it a suitable position to make a decision whether a join can be
pushed-down?

I think; it needs an additional functionality to provide higher priority
on the foreign-join plan that other plans, when fdw determind a particular
join can be pushed-down.
(Sorry, I have no idea right now.)

Let's see the following result.

postgres=# EXPLAIN SELECT * FROM ft1 , ft2, lt3 WHERE ft1.a = ft2.x
AND ft1.a = lt3.s;                                                     QUERY PLAN

----------------------------------------------------------------------------------------------------------------------Merge
Join (cost=205.08..758.83 rows=30750 width=108)  Merge Cond: (ft1.a = lt3.s)  ->  Merge Join  (cost=119.66..199.66
rows=5000width=72)        Merge Cond: (ft1.a = ft2.x)        ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
  Sort Key: ft1.a              ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
RemoteSQL: DECLARE pgsql_fdw_cursor_0 SCROLL
 
CURSOR FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1        ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
SortKey: ft2.x              ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)                    Remote
SQL:DECLARE pgsql_fdw_cursor_1 SCROLL
 
CURSOR FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2  ->  Sort  (cost=85.43..88.50 rows=1230 width=36)        Sort Key:
lt3.s       ->  Seq Scan on lt3  (cost=0.00..22.30 rows=1230 width=36)
 
(15 rows)

Then, I turned off the enable_mergejoin.

postgres=# EXPLAIN SELECT * FROM ft1 , ft2, lt3 WHERE ft1.a = ft2.x
AND ft1.a = lt3.s;                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------Hash Join
(cost=37.67..1126.42rows=30750 width=108)  Hash Cond: (ft1.a = lt3.s)  ->  Foreign Scan on multiple foreign tables
(cost=0.00..0.00
rows=5000 width=72)        Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y FROM public.ft1
ft1, public.ft2 ft2 WHERE (ft1.a = ft2.x)  ->  Hash  (cost=22.30..22.30 rows=1230 width=36)        ->  Seq Scan on lt3
(cost=0.00..22.30rows=1230 width=36)
 
(6 rows)

Probably, the basic design is correct. However, the planner gives
higher priority on the join plan between
local and foreign than pushing-down foreign relations.

Does it make sense not to consider any other possible plans when FDW
decided a particular join can be
pushed down?

Thanks,

2011年10月7日18:06 Kohei KaiGai <kaigai@kaigai.gr.jp>:
> 2011年10月4日12:08 Shigeru Hanada <shigeru.hanada@gmail.com>:
>>> In my opinion, FdwRoutine should have an additional API to inform the core its
>>> supported features; such as inner-join, outer-join, order-by,
>>> group-by, aggregate
>>> functions, insert, update, delete, etc... in the future version.
>>
>> Sure, so in my design PlanForeignJoin is optional.
>>
>> The lack of capability is informed from FDW with setting function
>> pointer in FdwRoutine to NULL.  If PlanForeignJoin was NULL, core
>> (planner) will give up to consider join push-down, and use one of local
>> join methods such as NestLoop and MergeJoin for those foreign tables.
>> As you say, other push-down-able features would also have optional
>> handler function for each.
>>
> Sorry, I overlooked it was already implemented at create_foreignjoin_path().
>
> I additionally tried several cases using pgsql_fdw.
> In some cases, it seems to me the planner don't push down the join tree
> as you probably expected.
> Please see the following example:
>
> I defined three foreign tables: ft1(a int, b text), ft2(x int, y
> text), ft3(s int, t text),
> and lt1, lt2, lt3 are regular local tables.
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x join ft3 on a = s);
>
>  QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
>  Foreign Scan on multiple foreign tables  (cost=0.00..0.00 rows=25000 width=108)
>   Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y, ft3.s, ft3.t FROM
> public.ft1 ft1, public.ft2 ft2, public.ft3 ft3 WHERE (ft1.a = ft3.s)
> AND (ft1.a = ft2.x)
> (2 rows)
>
> It works good.
> (P.S. I noticed that pgsql_fdw has incorrect Assert(). Please fix
> pgsql_fdw.c:730)
>
> However, an existence of local relation makes planner confused.
> It seems to me you expect "ft1 join ft2 on a = x"
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x join lt3 on a = s);
>                                                      QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=205.08..758.83 rows=30750 width=108)
>   Merge Cond: (ft1.a = lt3.s)
>   ->  Merge Join  (cost=119.66..199.66 rows=5000 width=72)
>         Merge Cond: (ft1.a = ft2.x)
>         ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>               Sort Key: ft1.a
>               ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
>                     Remote SQL: DECLARE pgsql_fdw_cursor_2 SCROLL
> CURSOR FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1
>         ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>               Sort Key: ft2.x
>               ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)
>                     Remote SQL: DECLARE pgsql_fdw_cursor_3 SCROLL
> CURSOR FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2
>   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
>         Sort Key: lt3.s
>         ->  Seq Scan on lt3  (cost=0.00..22.30 rows=1230 width=36)
> (15 rows)
>
> What is the reason why the foreign join is not pushed down?
> Maybe, injected Sort plan prevent the planner to consider both side of
> relations being foreign scan owned by same server? I'm still
> investigating the reason.
>
> I hope comments from committers. :-(
>
> A collateral evidence is below.
> If we try to sort the result by a key being not used to join, the both
> of foreign scan gets pushed down.
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x) order by y;
>                                                   QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>  Sort  (cost=307.19..319.69 rows=5000 width=72)
>   Sort Key: ft2.y
>   ->  Foreign Scan on multiple foreign tables  (cost=0.00..0.00
> rows=5000 width=72)
>         Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y FROM public.ft1
> ft1, public.ft2 ft2 WHERE (ft1.a = ft2.x)
> (4 rows)
>
> However, when I tried to sort by a key being used to join, the both of
> foreign scan was not pushed down.
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x) order by a;
>                                                   QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=119.66..199.66 rows=5000 width=72)
>   Merge Cond: (ft1.a = ft2.x)
>   ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>         Sort Key: ft1.a
>         ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
>               Remote SQL: DECLARE pgsql_fdw_cursor_6 SCROLL CURSOR
> FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1
>   ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>         Sort Key: ft2.x
>         ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)
>               Remote SQL: DECLARE pgsql_fdw_cursor_7 SCROLL CURSOR
> FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2
> (10 rows)
>
> Thanks,
> --
> KaiGai Kohei <kaigai@kaigai.gr.jp>
>



-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>


Re: WIP: Join push-down for foreign tables

От
Florian Pflug
Дата:
On Oct9, 2011, at 13:35 , Kohei KaiGai wrote:
> I think; it needs an additional functionality to provide higher priority
> on the foreign-join plan that other plans, when fdw determind a particular
> join can be pushed-down.
> (Sorry, I have no idea right now.)
> 
> Probably, the basic design is correct. However, the planner gives
> higher priority on the join plan between
> local and foreign than pushing-down foreign relations.

The textbook approach to that is to factor the cost of transferring the
rows over the network into the plan costs. That, of course, only works
once we have statistics for the foreign tables. But AFAIK we eventually
want to have those, so I'd say punt this until that time.

> Does it make sense not to consider any other possible plans when FDW
> decided a particular join can be
> pushed down?

I think in the long run we're going to want a cost-based decision there.
Pushing down a join is only a win if the join selectivity is low. For a
selectivity close to 1.0, it may very well be many times more efficient
to fetch the tables separately and join them locally. You'll be fetching
only |A| + |B| rows for a local join, instead of |A| * |B| rows for a remote
join (assuming A,B are tables with cardinality |A|, |B|).

best regards,
Florian Pflug



Re: WIP: Join push-down for foreign tables

От
Shigeru Hanada
Дата:
(2011/10/08 1:06), Kohei KaiGai wrote:
> What is the reason why the foreign join is not pushed down?
> Maybe, injected Sort plan prevent the planner to consider both side of
> relations being foreign scan owned by same server? I'm still
> investigating the reason.

Thanks for your testing.

I'm not sure, but I think that Sort plan node would not be the reason
because it's an element of merge join.  Maybe some wrong points would be
in my join method consideration.

In my assumption, ft1 and ft2 should be joined first (because such join
has very low costs) and then that result and lt3 should be joined with
one of local join methods, such as merge join and hash join.

I'll try your example soon.

Regards,
-- 
Shigeru Hanada


Re: WIP: Join push-down for foreign tables

От
Robert Haas
Дата:
2011/10/10 Shigeru Hanada <shigeru.hanada@gmail.com>:
> (2011/10/08 1:06), Kohei KaiGai wrote:
>> What is the reason why the foreign join is not pushed down?
>> Maybe, injected Sort plan prevent the planner to consider both side of
>> relations being foreign scan owned by same server? I'm still
>> investigating the reason.
>
> Thanks for your testing.
>
> I'm not sure, but I think that Sort plan node would not be the reason
> because it's an element of merge join.  Maybe some wrong points would be
> in my join method consideration.
>
> In my assumption, ft1 and ft2 should be joined first (because such join
> has very low costs) and then that result and lt3 should be joined with
> one of local join methods, such as merge join and hash join.

This might be out of left field, but wouldn't it make more sense to
get postgresql_fdw committed first, and then add the join push-down
functionality afterwards?  I mean, otherwise, we're going to be left
with a situation where we have join pushdown in core, but the only FDW
that can actually make use of it elsewhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP: Join push-down for foreign tables

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> This might be out of left field, but wouldn't it make more sense to
> get postgresql_fdw committed first, and then add the join push-down
> functionality afterwards?  I mean, otherwise, we're going to be left
> with a situation where we have join pushdown in core, but the only FDW
> that can actually make use of it elsewhere.

There's likely to be a lot of FDW infrastructure that will not be
exercised by anything in core or contrib ...
        regards, tom lane


Re: WIP: Join push-down for foreign tables

От
Kohei KaiGai
Дата:
How about the current status of this patch, although it is still
"Waiting on author".

If Hanada-san would propose contrib/pgsql_fdw as a basis of join-pushdown
feature, I'll likely volunteer to review the patch.
I'm also interested in this feature. Hopefully, I'd like to try other
kind of pushing
down (such as aggregate, sort, ...) or updatable foreign tables. :-)

Thanks,

2011/10/10 Tom Lane <tgl@sss.pgh.pa.us>:
> Robert Haas <robertmhaas@gmail.com> writes:
>> This might be out of left field, but wouldn't it make more sense to
>> get postgresql_fdw committed first, and then add the join push-down
>> functionality afterwards?  I mean, otherwise, we're going to be left
>> with a situation where we have join pushdown in core, but the only FDW
>> that can actually make use of it elsewhere.
>
> There's likely to be a lot of FDW infrastructure that will not be
> exercised by anything in core or contrib ...
>
>                        regards, tom lane
>



--
KaiGai Kohei <kaigai@kaigai.gr.jp>


Re: WIP: Join push-down for foreign tables

От
Shigeru Hanada
Дата:
(2011/10/21 23:53), Kohei KaiGai wrote:
> How about the current status of this patch, although it is still
> "Waiting on author".

I've found some issues for this CF item, and I would need some more
effort to solve them.  So I'll mark this item as "Returned with
feedback", and I'll propose this idea again for next CF (2011-11).  Then
I'll work with some other items in current CF (2011-09), which are
enough easy for me to work on.

> If Hanada-san would propose contrib/pgsql_fdw as a basis of join-pushdown
> feature, I'll likely volunteer to review the patch.
> I'm also interested in this feature. Hopefully, I'd like to try other
> kind of pushing
> down (such as aggregate, sort, ...) or updatable foreign tables. :-)

Thanks for the offer.  I'd like to propose pgsql_fdw as a contrib module
for 9.2.  I've attached tree patches of revised version of pgsql_fdw
which doesn't have join-push-down capability, but IMO it would be enough
for basis of various push-down enhancement.  Note that I chose the name
"pgsql_fdw" to avoid naming conflict with existing validator
postgresql_fdw_validator, which has been in core since 8.4 and used by
contrib/dblink.

It might be useful and reasonable to integrate pgsql_fdw_validator into
postgresql_fdw_validator and use it as a common validator for
contrib/dblink and new FDW postgresql_fdw, but I didn't do so (at least
not yet) because it would confuse users of contrib/dblink by accepting
non-libpq options.

- fdw_helper_doc.patch
This patch adds new document section describing about FDW helper
functions into "Chapter 50. Writing A Foreign Data Wrapper".  This can
be back-patched to 9.1 because this patch contains only functions which
were introduced at 9.1 release.

- fdw_helper_funcs.patch (requires above patch has been applied)
This patch contains two new FDW helper functions.
GetForeignColumnOptions(), which retrieves all FDW options of a column
of a foreign table.   This would be useful not only for pgsql_fdw but
also any FDW which have per-column FDW options.
This patch also adds GetFdwOptionValue(), which returns finest-grained
FDW option value for an option set on given foreign table or its column.
 This would be useful for FDWs which accept an option on multiple object
level, e.g. such as foreign table and foreign server; this kind of usage
was mentioned by David Fetter in this message.
    http://archives.postgresql.org/pgsql-hackers/2011-10/msg00483.php

- pgsql_fdw.patch (requires above patches have been applied)
This patch provides FDW for external PostgreSQL servers.  Note that this
FDW uses pgsql_fdw_validator for its validator, not a built-in
postgresql_fdw_validator.  Please see SGML document for details of this FDW.

Regards,
--
Shigeru Hanada


Вложения

Re: WIP: Join push-down for foreign tables

От
Shigeru Hanada
Дата:
(2011/10/24 20:40), I wrote:
> I've found some issues for this CF item, and I would need some more
> effort to solve them.  So I'll mark this item as "Returned with
> feedback", and I'll propose this idea again for next CF (2011-11).

This is the second effort for $SUBJECT.  Attached patch requires
pgsql_fdw patches[1] to be applied previously.  This patch provides:

* Changes for backend
  * Add new planner node ForeignJoinPath and related routines.  In
    current design, planner consider all of possible join combinations
    between foreign tables, similar to local joins such as nested loop,
    hash join and merge join.  And if foreign join is cheapest, planner
    produces a ForeignScan plan node for a join.  So executor is not
    modified heavily since 9.1.
  * Add new FDW callback for planning join push-down between foreign
    tables on same server.  This function is optional, and allowed to
    return NULL to tell planner that that join can't be handled by the
    FDW.
  * Add server oid to RelOptInfo.  This is useless at all for relations
    other than foreign scan and foreign join, but it would reduces
    catalog lookup during foreign scan/join planning.
  * Add enable_foreignjoin GUC parameter.  Join between foreign tables
    is considered only when this parameter is on.
* Changes for pgsql_fdw
  * Implemente PlanForeignJoin callback function.

[1]http://archives.postgresql.org/pgsql-hackers/2011-11/msg00904.php

Regards,
--
Shigeru Hanada

Вложения

Re: WIP: Join push-down for foreign tables

От
Heikki Linnakangas
Дата:
On 15.11.2011 19:16, Shigeru Hanada wrote:
> This is the second effort for $SUBJECT.  Attached patch requires
> pgsql_fdw patches[1] to be applied previously.  This patch provides:
>
> * Changes for backend
>    * Add new planner node ForeignJoinPath and related routines.  In
>      current design, planner consider all of possible join combinations
>      between foreign tables, similar to local joins such as nested loop,
>      hash join and merge join.  And if foreign join is cheapest, planner
>      produces a ForeignScan plan node for a join.  So executor is not
>      modified heavily since 9.1.
>    * Add new FDW callback for planning join push-down between foreign
>      tables on same server.  This function is optional, and allowed to
>      return NULL to tell planner that that join can't be handled by the
>      FDW.

So the way a three-way join is planned, is that the planner first asks 
the FDW to plan ForeignPaths of scanning the individual tables. Then it 
asks the FDW to consider pairwise joins of those ForeignPaths. Then it 
asks the FDW to consider joins of the constructed ForeignPaths and 
ForeignJoinPaths. Ie. the plan involving a join of three or more remote 
tables is built bottom-up, just like a join of local tables.

When the FDW recognizes it's being asked to join a ForeignJoinPath and a 
ForeignPath, or two ForeignJoinPaths, it throws away the old SQL it 
constructed to do the two-way join, and builds a new one to join all 
three tables. That seems tedious, when there are a lot of tables 
involved. A FDW like the pgsql_fdw that constructs an SQL query doesn't 
need to consider pairs of joins. It could just as well build the SQL for 
the three-way join directly. I think the API needs to reflect that.

I wonder if we should have a heuristic to not even consider doing a join 
locally, if it can be done remotely. For a query like this:

SELECT * FROM remote1 a, remote2 b, remote3 c WHERE a.id = b.id AND c.id 
= b.id

it's quite obvious that the best plan is to do the join remotely, rather 
than pull all the rows from all tables, and do the join locally. In 
theory, if the remote database is remarkably bad at performing joins, it 
might be faster to pull in all the data and do it locally, but I can't 
really imagine that happening in practice.

> * Changes for pgsql_fdw
>    * Implemente PlanForeignJoin callback function.

A couple of basic bugs I bumped into:

* WHERE-clause building fails on a cartesian product ("SELECT * FROM 
remote1, remote2")

* The join planning in pgsql_fdw seems to get confused and gives up if 
there are any local tables also involved in the query (e.g "explain 
SELECT * FROM remote1, remote2 LEFT OUTER JOIN local1 on (local1.a = 
remote2.a) WHERE remote1.a = remote2.a;")

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: WIP: Join push-down for foreign tables

От
Tom Lane
Дата:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> When the FDW recognizes it's being asked to join a ForeignJoinPath and a 
> ForeignPath, or two ForeignJoinPaths, it throws away the old SQL it 
> constructed to do the two-way join, and builds a new one to join all 
> three tables.

It should certainly not "throw away" the old SQL --- that path could
still be chosen.

> That seems tedious, when there are a lot of tables 
> involved. A FDW like the pgsql_fdw that constructs an SQL query doesn't 
> need to consider pairs of joins. It could just as well build the SQL for 
> the three-way join directly. I think the API needs to reflect that.

> I wonder if we should have a heuristic to not even consider doing a join 
> locally, if it can be done remotely.

I think this is a bad idea.  It will require major restructuring of the
planner, and sometimes it will fail to find the best plan, in return for
not much.  The nature of join planning is that we investigate a lot of
dead ends.
        regards, tom lane


Re: WIP: Join push-down for foreign tables

От
Heikki Linnakangas
Дата:
On 17.11.2011 17:24, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> When the FDW recognizes it's being asked to join a ForeignJoinPath and a
>> ForeignPath, or two ForeignJoinPaths, it throws away the old SQL it
>> constructed to do the two-way join, and builds a new one to join all
>> three tables.
>
> It should certainly not "throw away" the old SQL --- that path could
> still be chosen.

Right, that was loose phrasing from me.

>> That seems tedious, when there are a lot of tables
>> involved. A FDW like the pgsql_fdw that constructs an SQL query doesn't
>> need to consider pairs of joins. It could just as well build the SQL for
>> the three-way join directly. I think the API needs to reflect that.

Tom, what do you think of this part? I think it would be a lot more 
natural API if the planner could directly ask the FDW to construct a 
plan for a three (or more)-way join, instead of asking it to join a join 
relation into another relation.

The proposed API is this:

+ FdwPlan *
+ PlanForeignJoin (Oid serverid,
+                  PlannerInfo *root,
+                  RelOptInfo *joinrel,
+                  JoinType jointype,
+                  SpecialJoinInfo *sjinfo,
+                  Path *outer_path,
+                  Path *inner_path,
+                  List *restrict_clauses,
+                  List *pathkeys);

The problem I have with this is that the FDW shouldn't need outer_path 
and inner_path. All the information it needs is in 'joinrel'. Except for 
outer-joins, I guess; is there convenient way to get the join types 
involved in a join rel? It's there in SpecialJoinInfo, but if the FDW is 
only passed the RelOptInfo representing the three-way join, it's not there.

Does the planner expect the result from the foreign server to be 
correctly sorted, if it passes pathkeys to that function?

>> I wonder if we should have a heuristic to not even consider doing a join
>> locally, if it can be done remotely.
>
> I think this is a bad idea.  It will require major restructuring of the
> planner, and sometimes it will fail to find the best plan, in return for
> not much.  The nature of join planning is that we investigate a lot of
> dead ends.

Ok.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: WIP: Join push-down for foreign tables

От
Tom Lane
Дата:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Tom, what do you think of this part? I think it would be a lot more 
> natural API if the planner could directly ask the FDW to construct a 
> plan for a three (or more)-way join, instead of asking it to join a join 
> relation into another relation.

I think this is fundamentally not going to work, at least not without
major and IMO unwise surgery on the planner.  Building up joins pairwise
is what it does.

Furthermore, you seem to be imagining that there is only one best path
for any join, which isn't the case.  We'll typically have several paths
under consideration because of cheapest-startup versus cheapest-total
and/or different resulting sort orders.  If we do what you're
suggesting, that's going to either break entirely or require a much more
complicated API for PlanForeignJoin.

> Does the planner expect the result from the foreign server to be 
> correctly sorted, if it passes pathkeys to that function?

Well, the result path should be marked with pathkeys if it is known to
be sorted a certain way, or with NIL if not.  There's no prejudgment as
to what a particular join method will produce.  That does raise
interesting questions though as to how to interact with the remote-end
planner --- if we've reported that a path has certain pathkeys, that
probably means that the SQL sent to the remote had better say ORDER BY,
which would be kind of annoying if in the end we weren't depending on
the path to be sorted.  I'm not sure what it would take to pass that
information back down, though.  What we might have to do to make this
work conveniently is generate two versions of every foreign path: one
marked with pathkeys, and one not.  And make sure the former has a
somewhat-higher cost.  Then we'd know from which path gets picked
whether the plan is actually depending on sorted results.
        regards, tom lane


Re: WIP: Join push-down for foreign tables

От
Heikki Linnakangas
Дата:
On 02.12.2011 18:55, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> Tom, what do you think of this part? I think it would be a lot more
>> natural API if the planner could directly ask the FDW to construct a
>> plan for a three (or more)-way join, instead of asking it to join a join
>> relation into another relation.
>
> I think this is fundamentally not going to work, at least not without
> major and IMO unwise surgery on the planner.  Building up joins pairwise
> is what it does.
>
> Furthermore, you seem to be imagining that there is only one best path
> for any join, which isn't the case.

No, I understand that the planner considers many alternatives, even at 
the same time, because of different output sort orders and startup vs. 
total cost. I'm imagining that the planner would ask the FDW to 
construct the two-way joins, and consider joining the results of those 
locally to the third table, and also ask the FDW to construct the 
three-way join as whole. And then choose the cheapest alternative.

>  We'll typically have several paths
> under consideration because of cheapest-startup versus cheapest-total
> and/or different resulting sort orders.  If we do what you're
> suggesting, that's going to either break entirely or require a much more
> complicated API for PlanForeignJoin.

I don't understand why the FDW should care about the order the joins are 
constructed in in the planner. From the FDW's point of view, there's no 
difference between joining ((A B) C) and (A (B C)). Unless you also want 
to consider doing a remote join between (A B) and C, where C is a 
foreign table but A and B are local tables. That would in theory be 
possible to execute in the remote server, by shipping the result of (A 
B) to the remote server, but we'd also need a quite different executor 
API to handle that.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: WIP: Join push-down for foreign tables

От
Tom Lane
Дата:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 02.12.2011 18:55, Tom Lane wrote:
>> Furthermore, you seem to be imagining that there is only one best path
>> for any join, which isn't the case.

> No, I understand that the planner considers many alternatives, even at 
> the same time, because of different output sort orders and startup vs. 
> total cost. I'm imagining that the planner would ask the FDW to 
> construct the two-way joins, and consider joining the results of those 
> locally to the third table, and also ask the FDW to construct the 
> three-way join as whole. And then choose the cheapest alternative.

It probably makes sense to turn control over to the FDW just once to
consider all possible foreign join types for a given join pair, ie
we don't want to ask it separately about nestloop, hash, merge joins.
But then we had better be able to let it generate multiple paths within
the one call, and dump them all to add_path.  You're still assuming that
there is one unique best path for any join, and *that is not the case*,
or at least we don't know which will be the best at the time we're
generating join paths.  We don't know whether fast-start is better than
cheapest-total, nor which sort order might be the best, until we get up
to the highest join level.

So rather than returning a Path struct, it would have to just be given
the joinrel struct and be expected to do add_path call(s) for itself.

> I don't understand why the FDW should care about the order the joins are 
> constructed in in the planner. From the FDW's point of view, there's no 
> difference between joining ((A B) C) and (A (B C)).

Maybe there is, maybe there isn't.  You're assuming too much about how
the FDW does its join planning, I think --- in particular, FDWs that are
backed by less than a Postgres-equivalent remote planner might well
appreciate being walked through all the feasible join pairs.

If we do it as I suggest above, the FDW could remember that it had
already planned this joinrel and just drop out immediately if called
again, if it's going to do it the way you're thinking.
        regards, tom lane


Re: WIP: Join push-down for foreign tables

От
Heikki Linnakangas
Дата:
On 03.12.2011 00:24, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> On 02.12.2011 18:55, Tom Lane wrote:
>>> Furthermore, you seem to be imagining that there is only one best path
>>> for any join, which isn't the case.
>
>> No, I understand that the planner considers many alternatives, even at
>> the same time, because of different output sort orders and startup vs.
>> total cost. I'm imagining that the planner would ask the FDW to
>> construct the two-way joins, and consider joining the results of those
>> locally to the third table, and also ask the FDW to construct the
>> three-way join as whole. And then choose the cheapest alternative.
>
> It probably makes sense to turn control over to the FDW just once to
> consider all possible foreign join types for a given join pair, ie
> we don't want to ask it separately about nestloop, hash, merge joins.
> But then we had better be able to let it generate multiple paths within
> the one call, and dump them all to add_path.  You're still assuming that
> there is one unique best path for any join, and *that is not the case*,
> or at least we don't know which will be the best at the time we're
> generating join paths.  We don't know whether fast-start is better than
> cheapest-total, nor which sort order might be the best, until we get up
> to the highest join level.

Hmm, so you're saying that the FDW function needs to be able to return 
multiple paths for a single joinrel. Fair enough, and that's not 
specific to remote joins. Even a single-table foreign scan could be 
implemented differently depending on whether you prefer fast-start or 
cheapest total.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: WIP: Join push-down for foreign tables

От
Tom Lane
Дата:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Hmm, so you're saying that the FDW function needs to be able to return 
> multiple paths for a single joinrel. Fair enough, and that's not 
> specific to remote joins. Even a single-table foreign scan could be 
> implemented differently depending on whether you prefer fast-start or 
> cheapest total.

... or ordered vs unordered, etc.  Yeah, good point, we already got this
wrong with the PlanForeignScan API.  Good thing we didn't promise that
would be stable.
        regards, tom lane


Re: WIP: Join push-down for foreign tables

От
Heikki Linnakangas
Дата:
On 03.12.2011 01:05, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> Hmm, so you're saying that the FDW function needs to be able to return
>> multiple paths for a single joinrel. Fair enough, and that's not
>> specific to remote joins. Even a single-table foreign scan could be
>> implemented differently depending on whether you prefer fast-start or
>> cheapest total.
>
> ... or ordered vs unordered, etc.  Yeah, good point, we already got this
> wrong with the PlanForeignScan API.  Good thing we didn't promise that
> would be stable.

This discussion withered down here...

I think the advice to Shigeru-san is to work on the API. We didn't reach 
a consensus on what exactly it should look like, but at least you need 
to be able to return multiple paths for a single joinrel, and should 
look at fixing the PlanForeignScan API to allow that too.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com