Обсуждение: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Поиск

Список

Период

Сортировка

postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

01 мая 2015 г., 16:35:15

On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
> 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> Hanada-san, could you adjust your postgres_fdw patch according to
>> the above new (previous?) definition.
>
> The attached v14 patch is the revised version for your v13 patch.  It also contains changed for Ashutosh’s comments.

We should probably move this discussion to a new thread now that the
other patch is committed.  Changing subject line accordingly.

Generally, there's an awful lot of changes in this patch - it is over
2000 insertions and more than 450 deletions - and it's not awfully
obvious why all of those changes are there.  I think this patch needs
a detailed README to accompany it explaining what the various changes
in the patch are and why those things got changed; or maybe there is a
way to break it up into multiple patches so that we can take a more
incremental approach.  I am really suspicious of the amount of
wholesale reorganization of code that this patch is doing.  It's
really hard to validate that a reorganization like that is necessary,
or that it's correct, and it's gonna make back-patching noticeably
harder in the future.  If we really need this much code churn it needs
careful justification; if we don't, we shouldn't do it.

+SET enable_mergejoin = off; -- planner choose MergeJoin even it has
higher costs, so disable it for testing.

This seems awfully strange.  Why would the planner choose a plan if it
had a higher cost?

-        * If the table or the server is configured to use remote estimates,
-        * identify which user to do remote access as during planning.  This
+        * Identify which user to do remote access as during planning.  This        * should match what
ExecCheckRTEPerms()does.  If we fail due 
to lack of        * permissions, the query would have failed at runtime anyway.        */
-       if (fpinfo->use_remote_estimate)
-       {
-               RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
-               Oid                     userid = rte->checkAsUser ?
rte->checkAsUser : GetUserId();
-
-               fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
-       }
-       else
-               fpinfo->user = NULL;
+       rte = planner_rt_fetch(baserel->relid, root);
+       fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();

So, wait a minute, remote estimates aren't optional any more?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Shigeru HANADA

Дата:

03 мая 2015 г., 04:51:24

Thanks for the comments.

2015/05/01 22:35、Robert Haas <robertmhaas@gmail.com> のメール：
> On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada
> <shigeru.hanada@gmail.com> wrote:
>> 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>>> Hanada-san, could you adjust your postgres_fdw patch according to
>>> the above new (previous?) definition.
>>
>> The attached v14 patch is the revised version for your v13 patch.  It also contains changed for Ashutosh’s comments.
>
> We should probably move this discussion to a new thread now that the
> other patch is committed.  Changing subject line accordingly.
>
> Generally, there's an awful lot of changes in this patch - it is over
> 2000 insertions and more than 450 deletions - and it's not awfully
> obvious why all of those changes are there.  I think this patch needs
> a detailed README to accompany it explaining what the various changes
> in the patch are and why those things got changed; or maybe there is a
> way to break it up into multiple patches so that we can take a more
> incremental approach.  I am really suspicious of the amount of
> wholesale reorganization of code that this patch is doing.  It's
> really hard to validate that a reorganization like that is necessary,
> or that it's correct, and it's gonna make back-patching noticeably
> harder in the future.  If we really need this much code churn it needs
> careful justification; if we don't, we shouldn't do it.
>

I agree.  I’ll write detailed description for the patch and repost the new one with rebasing onto current HEAD.  I’m
sorrybut it will take a day or so... 

> +SET enable_mergejoin = off; -- planner choose MergeJoin even it has
> higher costs, so disable it for testing.
>
> This seems awfully strange.  Why would the planner choose a plan if it
> had a higher cost?

I thought so but I couldn’t reproduce such situation now.  I’ll investigate it more.  If the issue has gone, I’ll
removethat SET statement for straightforward tests. 


>
> -        * If the table or the server is configured to use remote estimates,
> -        * identify which user to do remote access as during planning.  This
> +        * Identify which user to do remote access as during planning.  This
>         * should match what ExecCheckRTEPerms() does.  If we fail due
> to lack of
>         * permissions, the query would have failed at runtime anyway.
>         */
> -       if (fpinfo->use_remote_estimate)
> -       {
> -               RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
> -               Oid                     userid = rte->checkAsUser ?
> rte->checkAsUser : GetUserId();
> -
> -               fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
> -       }
> -       else
> -               fpinfo->user = NULL;
> +       rte = planner_rt_fetch(baserel->relid, root);
> +       fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
>
> So, wait a minute, remote estimates aren't optional any more?

No, it seems to be removed accidentally.  I’ll check the reason again though, but I’ll revert the change unless I find
anyproblem. 

--
Shigeru HANADA
shigeru.hanada@gmail.com

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Shigeru Hanada

Дата:

16 мая 2015 г., 16:04:57

Attached is the v15 patch of foreign join support for postgres_fdw.

This patch is based on current master, and having being removed some
hunks which are not essential.

And I wrote description of changes done by the patch.  It is little
bit long but I hope it would help understanding what the patch does.

The total LOC of the patch is 3.7k, 1.8k for code and 2.0k for
regression tests.  This is not a small patch, as Robert says, so I'd
like to summarize changed done by this patch and explain why they are
necessary.

Outline of join push-down support for postgres_fdw
==================================================

This patch provides new capability to join between foriegn tables
managed by same foreign server on remote side, by constructing a
remote query containing join clause, and executing it as source of a
pseudo foreign scan.  This patch is based on Custom/Foreign join patch
written by Kohei KaiGai.

PostgreSQL's planning for a query containing join is done with these steps:

1. generate possible scan paths for each base relations
2. generate join paths with bottom-up approach
3. generate plan nodes required for the cheapest path
4. execute the plan nodes to obtain result tuples

Generating path node
--------------------
As of now, postgres_fdw generates a ForeignPath which represents a
result of a join for each RelOptInfo, and planner can determine which
path is cheapest from its cost values.

GetForeignJoinPaths is called once for each join combination, i.e. A
JOIN B and B JOIN A are considered separately.  So GetForeignJoinPath
should return immediately to skip its job when the call is the
reversed combination of already considered one.  For this purpose, I
added update_safe flag to PgFdwRelationInfo.  This flag is always set
for simple foriegn scans, but for join relation it is set only when
the join can be pushed down.  The reason of adding this flag is that
checking RelOptInfo#fdw_private is MULL can't prevent useless
processing for a join combination which is reversed one of already
considered join which can't be pushed down.

postgres_fdw's GetForeignJoinPaths() does various checks, to ensure
that the result has same semantics as local joins.  Now postgres_fdw
have these criteria:

a) join type must be one of INNER/LEFT OUTER/RIGHT OUTER/FULL OUTER join
This check is done with given jointype argument.  IOW, CROSS joins and
SEMI/ANTI joins are not pushed down.  This is because 1) CROSS joins
would produe more result than separeted join sources, and 2) ANTI/SEMI
joins need to be deparsed as sub-query and it seems to take some more
time to implement.
b) Both outer and inner must have RelOptInfo#fdw_private
Having fdw_private means that the RelOptInfo is safe to push down, so
having no fdw_private means that portion is not safe to push down and
thus the whole join is not safe to push down.
c) All relations in the join must belong to the same server
This check is done with serverid stored in RelOptInfo#fdw_private as
PgFdwRelationInfo.  Joining relations belong to different servers is
not leagal.  Even they finally have completely same connection
information, they should accessed via different libpq sessions.
Responsibility of checking server matching is under discussion in the
Custom/Foreign join thread, and I'd vote for checking it in core.  If
it is decided, I remove this criterion soon.
d) All relations must have the same effective user id
This check is done with userid stored in PgFdwRelationInfo, which is
valid only when underlying relations have the same effective user id.
Here "effective user id" means id of the user executing the query, or
the owner of the view when the foreign table is accessed via view.
Tom suggested that it is necessary to check that user mapping matches
for security reason, and now I think it's better than checking
effective user as current postgres_fdw does.
e) Each source relation must not have any local filter
Evaluating conditions of join source talbe potentially produces
different result in OUTER join cases.  This can be relaxed for the
cases when the join is INNER and local filters don't contain any
volatile function/operator, but it is left as future enhancement.
f) All join conditions of non-inner join must be safe to push down
This is similar to e).

A join which passes all criteria above is safe to push-down, so
postgres_fdw create a ForeignPath for the join and add it to
RelOptInfo.  Currently postgres_fdw doesn't set pathkeys (ordering
information) nor require_outer (information for parameterized path).

PgFdwRelationInfo is used to store various planning information
specific to postgres_fdw.  To support join push-down, I added some
fields which are necessary to deparse join query recursively in
deparseSelectSql.

- outer RelOptInfo, to generate source relatoin as subquery
- inner RelOptInfo, to generate source relation as subquery
- jointype, to deparse JOIN keyword string (e.g. INNER JOIN)
- joinclauses, to deprase ON part of JOIN clause
- otherclauses, to deparse WHERE clause of join query

I also moved it from postgres_fdw.c to postgres_fdw.h, because
deparse.c needs to refer the definition during deparsing query.

Generating plan node
--------------------
Once planner find the cheapest path, it generates plan tree from the
path tree.  During the steps, planner calls GetForeignPlan for the
ForeignPath in the top of a join tree.  IOW, GetForeignPlan is not
called for underlying joins and scans, so postgres_fdw needs a way to
do its task (mainly generating query string) recursively.

GetForeignPlan generates remote query and store it in ForeignScan node
to be returned.  The construction of remote query is done by calling
deparseSelectSql for given RelOptInfo.  This function was modified to
accept both base relations and join relations to support join
push-down.  Main part of generating join query is implemented in
deparseJoinSql, which is a newly added function, and deparseSelectSql
calls it if given relation was a join relation.

One big change about deparsing base relation is aliasing.  This patch
adds column alias to SELECT clause even original query is a simple
single table SELECT.

fdw=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM pgbench_branches b;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Foreign Scan on public.pgbench_branches b
   Output: bid, bbalance, filler
   Remote SQL: SELECT bid a9, bbalance a10, filler a11 FROM
public.pgbench_branches
(3 rows)

As you see, every column has alias in format "a%d" with index value
derived from pg_attribute.attnum.  Index value is attnum + 8, and the
magic number "8" comes from FirstLowInvalidHeapAttributeNumber for the
adjustment that makes attribute number of system attributes positive.
To make the code more readable, I introduced local macro
GET_RELATIVE_ATTNO(), but I'm not sure that this name is very nice.
This column alias system allows to refer a particular column pointed
by Var node in upper join level without consideration about column
position in SELECT clause.  To achieve this, deparseVar is expanded to
handle variables used in join query, and deparseJoinVar is added to
deparse column reference with table alias "l" or "r" for outer or
inner respectively.

As mentioned at the beginning of this section, GetForeignPlan is
called only for the top node of the join tree, so we need to do
something recursively.  So postgres_fdw has information about join
tree structure in PgFdwRelationInfo and pass it via
RelOptInfo#fdw_private.  This link works down to the base relation at
the leaf of the join tree.

When deparseSelectSql is called for a join relation, it calls
deparseSelectSql for each outer and inner RelOptInfo to generate query
strings, and pass them to deparseJoinSql as parts for FROM clause
subquery.  For example of two way join:

[table definition]
CREATE FOREIGN TABLE table1 (
id int NOT NULL,
name text,
...
) SERVER server;
CREATE FOREIGN TABLE table2 (
id int NOT NULL,
name text,
...
) SERVER server;

[original query]
SELECT t1.name, t2.name FROM table1 t1 INNER JOIN table2 t2 ON t1.id = t2.id;

[remote query]
SELECT l.a2, r.a2
 FROM (SELECT id, name FROM table1) l (a1, a2)
  INNER JOIN
  (SELECT id, name FROM table2) r (a1, a2)
  ON (l.a1 = r.a1)
;

During deparsing join query, deparseJoinSql maintains fdw_scan_tlist
list to hold TargetEntry for columns used in SELECT clause of every
query.

One thing tricky is "peusdo projection" which is done by
deparseProjectionSql for whole-row reference.  This is done by put the
query string in FROM subquery and add whole-row reference in outer
SELECT clause.  This is done by ExecProjection in 9.4 and older, but
we need to do this in postgres_fdw because ExecProjection is not
called any more.

For various conditions, such as join conditions in JOIN clauses and
filter conditions in WHERE clauses, appendCondition is used to deparse
condition strings.  This was expanded version of appendWhereClause.
Note that appendConditions accepts list of RestrictInfo or list of
Expr as source, and downcasts them properly.  As of 9.4
appendWhereClause accepted only RestrictInfo, but join conditions are
Expr, so I made it little flexible.

Finally deparseJoinSql construct whole query by putting parts into the
right places.  Note that column aliases are not written in SELECT
clause but in FROM clause, after table alias.  This simpifies SELECT
clause construction simpler.

debug stuffs
------------
bms_to_str is a function which prints contents of a bitmapset in
human-readable format.  I added this for debug purpose, but IMO it's
ok to have such function as public bitmapset API.

2015-05-03 10:51 GMT+09:00 Shigeru HANADA <shigeru.hanada@gmail.com>:
> Thanks for the comments.
>
> 2015/05/01 22:35、Robert Haas <robertmhaas@gmail.com> のメール：
>> On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada
>> <shigeru.hanada@gmail.com> wrote:
>>> 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>>>> Hanada-san, could you adjust your postgres_fdw patch according to
>>>> the above new (previous?) definition.
>>>
>>> The attached v14 patch is the revised version for your v13 patch.  It also contains changed for Ashutosh’s
comments.
>>
>> We should probably move this discussion to a new thread now that the
>> other patch is committed.  Changing subject line accordingly.
>>
>> Generally, there's an awful lot of changes in this patch - it is over
>> 2000 insertions and more than 450 deletions - and it's not awfully
>> obvious why all of those changes are there.  I think this patch needs
>> a detailed README to accompany it explaining what the various changes
>> in the patch are and why those things got changed; or maybe there is a
>> way to break it up into multiple patches so that we can take a more
>> incremental approach.  I am really suspicious of the amount of
>> wholesale reorganization of code that this patch is doing.  It's
>> really hard to validate that a reorganization like that is necessary,
>> or that it's correct, and it's gonna make back-patching noticeably
>> harder in the future.  If we really need this much code churn it needs
>> careful justification; if we don't, we shouldn't do it.
>>
>
> I agree.  I’ll write detailed description for the patch and repost the new one with rebasing onto current HEAD.  I’m
sorrybut it will take a day or so... 
>
>> +SET enable_mergejoin = off; -- planner choose MergeJoin even it has
>> higher costs, so disable it for testing.
>>
>> This seems awfully strange.  Why would the planner choose a plan if it
>> had a higher cost?
>
> I thought so but I couldn’t reproduce such situation now.  I’ll investigate it more.  If the issue has gone, I’ll
removethat SET statement for straightforward tests. 
>
>
>>
>> -        * If the table or the server is configured to use remote estimates,
>> -        * identify which user to do remote access as during planning.  This
>> +        * Identify which user to do remote access as during planning.  This
>>         * should match what ExecCheckRTEPerms() does.  If we fail due
>> to lack of
>>         * permissions, the query would have failed at runtime anyway.
>>         */
>> -       if (fpinfo->use_remote_estimate)
>> -       {
>> -               RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
>> -               Oid                     userid = rte->checkAsUser ?
>> rte->checkAsUser : GetUserId();
>> -
>> -               fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
>> -       }
>> -       else
>> -               fpinfo->user = NULL;
>> +       rte = planner_rt_fetch(baserel->relid, root);
>> +       fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
>>
>> So, wait a minute, remote estimates aren't optional any more?
>
> No, it seems to be removed accidentally.  I’ll check the reason again though, but I’ll revert the change unless I
findany problem. 
>
> --
> Shigeru HANADA
> shigeru.hanada@gmail.com
>
>
>
>

--
Shigeru HANADA

Вложения

foreign_join_v15.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

18 мая 2015 г., 08:03:51

On Sat, May 16, 2015 at 6:34 PM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:

Attached is the v15 patch of foreign join support for postgres_fdw.

This patch is based on current master, and having being removed some
hunks which are not essential.

And I wrote description of changes done by the patch. It is little
bit long but I hope it would help understanding what the patch does.

The total LOC of the patch is 3.7k, 1.8k for code and 2.0k for
regression tests. This is not a small patch, as Robert says, so I'd
like to summarize changed done by this patch and explain why they are
necessary.

Outline of join push-down support for postgres_fdw
==================================================

This patch provides new capability to join between foriegn tables
managed by same foreign server on remote side, by constructing a
remote query containing join clause, and executing it as source of a
pseudo foreign scan. This patch is based on Custom/Foreign join patch
written by Kohei KaiGai.

PostgreSQL's planning for a query containing join is done with these steps:

1. generate possible scan paths for each base relations
2. generate join paths with bottom-up approach
3. generate plan nodes required for the cheapest path
4. execute the plan nodes to obtain result tuples

Generating path node
--------------------
As of now, postgres_fdw generates a ForeignPath which represents a
result of a join for each RelOptInfo, and planner can determine which
path is cheapest from its cost values.

GetForeignJoinPaths is called once for each join combination, i.e. A
JOIN B and B JOIN A are considered separately. So GetForeignJoinPath
should return immediately to skip its job when the call is the
reversed combination of already considered one. For this purpose, I
added update_safe flag to PgFdwRelationInfo. This flag is always set
for simple foriegn scans, but for join relation it is set only when
the join can be pushed down. The reason of adding this flag is that
checking RelOptInfo#fdw_private is MULL can't prevent useless
processing for a join combination which is reversed one of already
considered join which can't be pushed down.

This is just a suggestion, but you may actually get rid of the flag by restricting the path generation only when say outer relation's pointer or OID or relid is greater/lesser than inner relation's corresponding property.

postgres_fdw's GetForeignJoinPaths() does various checks, to ensure
that the result has same semantics as local joins. Now postgres_fdw
have these criteria:

a) join type must be one of INNER/LEFT OUTER/RIGHT OUTER/FULL OUTER join
This check is done with given jointype argument. IOW, CROSS joins and
SEMI/ANTI joins are not pushed down. This is because 1) CROSS joins
would produe more result than separeted join sources,

We might loose on some optimizations in aggregate push-down by not creating paths altogether for CROSS joins. If there is a count(*) on CROSS join result, we will push count(*) since there doesn't exist a foreign path for the join. OR it might be possible that pushing down A INNER JOIN B CROSS JOIN C is cheaper than performing some or all of the joins locally. I think we should create a path and let it stay in the paths list. If there is no path which can use CROSS join path, it will discarded eventually. Sorry for bringing this so late in the discussion.

and 2) ANTI/SEMI
joins need to be deparsed as sub-query and it seems to take some more
time to implement.
b) Both outer and inner must have RelOptInfo#fdw_private
Having fdw_private means that the RelOptInfo is safe to push down, so
having no fdw_private means that portion is not safe to push down and
thus the whole join is not safe to push down.
c) All relations in the join must belong to the same server
This check is done with serverid stored in RelOptInfo#fdw_private as
PgFdwRelationInfo. Joining relations belong to different servers is
not leagal. Even they finally have completely same connection
information, they should accessed via different libpq sessions.
Responsibility of checking server matching is under discussion in the
Custom/Foreign join thread, and I'd vote for checking it in core. If
it is decided, I remove this criterion soon.
d) All relations must have the same effective user id
This check is done with userid stored in PgFdwRelationInfo, which is
valid only when underlying relations have the same effective user id.
Here "effective user id" means id of the user executing the query, or
the owner of the view when the foreign table is accessed via view.
Tom suggested that it is necessary to check that user mapping matches
for security reason, and now I think it's better than checking
effective user as current postgres_fdw does.
e) Each source relation must not have any local filter
Evaluating conditions of join source talbe potentially produces
different result in OUTER join cases. This can be relaxed for the
cases when the join is INNER and local filters don't contain any
volatile function/operator, but it is left as future enhancement.
f) All join conditions of non-inner join must be safe to push down
This is similar to e).

A join which passes all criteria above is safe to push-down, so
postgres_fdw create a ForeignPath for the join and add it to
RelOptInfo. Currently postgres_fdw doesn't set pathkeys (ordering
information) nor require_outer (information for parameterized path).

PgFdwRelationInfo is used to store various planning information
specific to postgres_fdw. To support join push-down, I added some
fields which are necessary to deparse join query recursively in
deparseSelectSql.

- outer RelOptInfo, to generate source relatoin as subquery
- inner RelOptInfo, to generate source relation as subquery
- jointype, to deparse JOIN keyword string (e.g. INNER JOIN)
- joinclauses, to deprase ON part of JOIN clause
- otherclauses, to deparse WHERE clause of join query

I also moved it from postgres_fdw.c to postgres_fdw.h, because
deparse.c needs to refer the definition during deparsing query.

Generating plan node
--------------------
Once planner find the cheapest path, it generates plan tree from the
path tree. During the steps, planner calls GetForeignPlan for the
ForeignPath in the top of a join tree. IOW, GetForeignPlan is not
called for underlying joins and scans, so postgres_fdw needs a way to
do its task (mainly generating query string) recursively.

GetForeignPlan generates remote query and store it in ForeignScan node
to be returned. The construction of remote query is done by calling
deparseSelectSql for given RelOptInfo. This function was modified to
accept both base relations and join relations to support join
push-down. Main part of generating join query is implemented in
deparseJoinSql, which is a newly added function, and deparseSelectSql
calls it if given relation was a join relation.

One big change about deparsing base relation is aliasing. This patch
adds column alias to SELECT clause even original query is a simple
single table SELECT.

fdw=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM pgbench_branches b;
QUERY PLAN
------------------------------------------------------------------------------------
Foreign Scan on public.pgbench_branches b
Output: bid, bbalance, filler
Remote SQL: SELECT bid a9, bbalance a10, filler a11 FROM
public.pgbench_branches
(3 rows)

As you see, every column has alias in format "a%d" with index value
derived from pg_attribute.attnum. Index value is attnum + 8, and the
magic number "8" comes from FirstLowInvalidHeapAttributeNumber for the
adjustment that makes attribute number of system attributes positive.
To make the code more readable, I introduced local macro
GET_RELATIVE_ATTNO(), but I'm not sure that this name is very nice.
This column alias system allows to refer a particular column pointed
by Var node in upper join level without consideration about column
position in SELECT clause. To achieve this, deparseVar is expanded to
handle variables used in join query, and deparseJoinVar is added to
deparse column reference with table alias "l" or "r" for outer or
inner respectively.

As mentioned at the beginning of this section, GetForeignPlan is
called only for the top node of the join tree, so we need to do
something recursively. So postgres_fdw has information about join
tree structure in PgFdwRelationInfo and pass it via
RelOptInfo#fdw_private. This link works down to the base relation at
the leaf of the join tree.

When deparseSelectSql is called for a join relation, it calls
deparseSelectSql for each outer and inner RelOptInfo to generate query
strings, and pass them to deparseJoinSql as parts for FROM clause
subquery. For example of two way join:

[table definition]
CREATE FOREIGN TABLE table1 (
id int NOT NULL,
name text,
...
) SERVER server;
CREATE FOREIGN TABLE table2 (
id int NOT NULL,
name text,
...
) SERVER server;

[original query]
SELECT t1.name, t2.name FROM table1 t1 INNER JOIN table2 t2 ON t1.id = t2.id;

[remote query]
SELECT l.a2, r.a2
FROM (SELECT id, name FROM table1) l (a1, a2)
INNER JOIN
(SELECT id, name FROM table2) r (a1, a2)
ON (l.a1 = r.a1)
;

During deparsing join query, deparseJoinSql maintains fdw_scan_tlist
list to hold TargetEntry for columns used in SELECT clause of every
query.

One thing tricky is "peusdo projection" which is done by
deparseProjectionSql for whole-row reference. This is done by put the
query string in FROM subquery and add whole-row reference in outer
SELECT clause. This is done by ExecProjection in 9.4 and older, but
we need to do this in postgres_fdw because ExecProjection is not
called any more.

For various conditions, such as join conditions in JOIN clauses and
filter conditions in WHERE clauses, appendCondition is used to deparse
condition strings. This was expanded version of appendWhereClause.
Note that appendConditions accepts list of RestrictInfo or list of
Expr as source, and downcasts them properly. As of 9.4
appendWhereClause accepted only RestrictInfo, but join conditions are
Expr, so I made it little flexible.

Finally deparseJoinSql construct whole query by putting parts into the
right places. Note that column aliases are not written in SELECT
clause but in FROM clause, after table alias. This simpifies SELECT
clause construction simpler.

About deparsing stuff, it looks like we are duplicating the functionality in ruleutils.c and more push-down stuff we add, more will be the duplication. Can we reuse that functionality?

debug stuffs
------------
bms_to_str is a function which prints contents of a bitmapset in
human-readable format. I added this for debug purpose, but IMO it's
ok to have such function as public bitmapset API.

2015-05-03 10:51 GMT+09:00 Shigeru HANADA <shigeru.hanada@gmail.com>:
> Thanks for the comments.
>
> 2015/05/01 22:35、Robert Haas <robertmhaas@gmail.com> のメール：
>> On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada
>> <shigeru.hanada@gmail.com> wrote:
>>> 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>>>> Hanada-san, could you adjust your postgres_fdw patch according to
>>>> the above new (previous?) definition.
>>>
>>> The attached v14 patch is the revised version for your v13 patch. It also contains changed for Ashutosh’s comments.
>>
>> We should probably move this discussion to a new thread now that the
>> other patch is committed. Changing subject line accordingly.
>>
>> Generally, there's an awful lot of changes in this patch - it is over
>> 2000 insertions and more than 450 deletions - and it's not awfully
>> obvious why all of those changes are there. I think this patch needs
>> a detailed README to accompany it explaining what the various changes
>> in the patch are and why those things got changed; or maybe there is a
>> way to break it up into multiple patches so that we can take a more
>> incremental approach. I am really suspicious of the amount of
>> wholesale reorganization of code that this patch is doing. It's
>> really hard to validate that a reorganization like that is necessary,
>> or that it's correct, and it's gonna make back-patching noticeably
>> harder in the future. If we really need this much code churn it needs
>> careful justification; if we don't, we shouldn't do it.
>>
>
> I agree. I’ll write detailed description for the patch and repost the new one with rebasing onto current HEAD. I’m sorry but it will take a day or so...
>
>> +SET enable_mergejoin = off; -- planner choose MergeJoin even it has
>> higher costs, so disable it for testing.
>>
>> This seems awfully strange. Why would the planner choose a plan if it
>> had a higher cost?
>
> I thought so but I couldn’t reproduce such situation now. I’ll investigate it more. If the issue has gone, I’ll remove that SET statement for straightforward tests.
>
>
>>
>> - * If the table or the server is configured to use remote estimates,
>> - * identify which user to do remote access as during planning. This
>> + * Identify which user to do remote access as during planning. This
>> * should match what ExecCheckRTEPerms() does. If we fail due
>> to lack of
>> * permissions, the query would have failed at runtime anyway.
>> */
>> - if (fpinfo->use_remote_estimate)
>> - {
>> - RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
>> - Oid userid = rte->checkAsUser ?
>> rte->checkAsUser : GetUserId();
>> -
>> - fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid);
>> - }
>> - else
>> - fpinfo->user = NULL;
>> + rte = planner_rt_fetch(baserel->relid, root);
>> + fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
>>
>> So, wait a minute, remote estimates aren't optional any more?
>
> No, it seems to be removed accidentally. I’ll check the reason again though, but I’ll revert the change unless I find any problem.
>
> --
> Shigeru HANADA
> shigeru.hanada@gmail.com
>
>
>
>

--
Shigeru HANADA

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

21 мая 2015 г., 17:11:25

On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
> d) All relations must have the same effective user id
> This check is done with userid stored in PgFdwRelationInfo, which is
> valid only when underlying relations have the same effective user id.
> Here "effective user id" means id of the user executing the query, or
> the owner of the view when the foreign table is accessed via view.
> Tom suggested that it is necessary to check that user mapping matches
> for security reason, and now I think it's better than checking
> effective user as current postgres_fdw does.

So, should this be a separate patch?

One of my concerns about this patch is that it's got a lot of stuff in
it that isn't obviously related to the patch.  Anything that is a
separate change should be separated out into its own patch.  Perhaps
you can post a set of patches that apply one on top of the next, with
the changes for each one clearly separated.

> e) Each source relation must not have any local filter
> Evaluating conditions of join source talbe potentially produces
> different result in OUTER join cases.  This can be relaxed for the
> cases when the join is INNER and local filters don't contain any
> volatile function/operator, but it is left as future enhancement.

I think this restriction is a sign that you're not really doing this
right. Consider:

(1) SELECT * FROM a LEFT JOIN b ON a.x = b.x AND b.x = 3;
(2) SELECT * FROM a LEFT JOIN b ON a.x = b.x WHERE b.x = 3;

If you push down the scan of b, you can include the b.x = 3 qual in
case (1) but not in case (2).  If you push down the join, you can
include the qual in either case, but you must attach it in the same
place where it was before.

> One big change about deparsing base relation is aliasing.  This patch
> adds column alias to SELECT clause even original query is a simple
> single table SELECT.
>
> fdw=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM pgbench_branches b;
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Foreign Scan on public.pgbench_branches b
>    Output: bid, bbalance, filler
>    Remote SQL: SELECT bid a9, bbalance a10, filler a11 FROM
> public.pgbench_branches
> (3 rows)
>
> As you see, every column has alias in format "a%d" with index value
> derived from pg_attribute.attnum.  Index value is attnum + 8, and the
> magic number "8" comes from FirstLowInvalidHeapAttributeNumber for the
> adjustment that makes attribute number of system attributes positive.

Yeah.  I'm not sure this is a good idea.  The column labels are
pointless at the outermost level.

I'm not sure it isn't a good idea, either, but I have some doubts.

> One thing tricky is "peusdo projection" which is done by
> deparseProjectionSql for whole-row reference.  This is done by put the
> query string in FROM subquery and add whole-row reference in outer
> SELECT clause.  This is done by ExecProjection in 9.4 and older, but
> we need to do this in postgres_fdw because ExecProjection is not
> called any more.

What commit changed this?

Thanks for your work on this.  Although I know progress has been slow,
I think this work is really important to the project.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Kouhei Kaigai

Дата:

22 мая 2015 г., 05:26:39

> > One thing tricky is "peusdo projection" which is done by
> > deparseProjectionSql for whole-row reference.  This is done by put the
> > query string in FROM subquery and add whole-row reference in outer
> > SELECT clause.  This is done by ExecProjection in 9.4 and older, but
> > we need to do this in postgres_fdw because ExecProjection is not
> > called any more.
> 
> What commit changed this?
>
It seems to me the nature of whole-row reference, not a flaw of
ExecProject().

In case of base relation scan, whole-row reference is transformed
to ExecEvalWholeRowVar, then executed during ExecProject().
It constructs a record datum according to the TupleDesc of the
relation being in scan.
On the other hands, foreign-join also looks like a scan on relation
that is result set of remote join, its record type is defined in
the fdw_scan_tlist.
However, it may contain values come from multiple relations,
so it is not intended behavior if whole-row reference constructs
a record datum that contains all the attributes in the result-
set. In this context, whole-row reference shall contain all the
attributes of the "base" relation.
Only FDW driver can know, and ensure all the attributes to
construct whole-row reference are fetched from remote side.

Hanada-san, could you correct me, if I misunderstood above your
explanation?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Shigeru Hanada

Дата:

22 мая 2015 г., 12:38:20

Thank for your comments.

2015-05-21 23:11 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
> <shigeru.hanada@gmail.com> wrote:
>> d) All relations must have the same effective user id
>> This check is done with userid stored in PgFdwRelationInfo, which is
>> valid only when underlying relations have the same effective user id.
>> Here "effective user id" means id of the user executing the query, or
>> the owner of the view when the foreign table is accessed via view.
>> Tom suggested that it is necessary to check that user mapping matches
>> for security reason, and now I think it's better than checking
>> effective user as current postgres_fdw does.
>
> So, should this be a separate patch?

Yes, it should be an additional patch for Custom/Foreign join API which is already committed.
The patch would contain these changes:
* add new field usermappingid to RelOptInfo, it is InvalidOid for non-foreign tables
* obtain oid of user mapping for a foreign table, and store it in the RelOptInfo (we already have serverid in
RelOptInfo,so userid is enough to identify user mapping though) 
* propagate usermappingid up to the join relation when outer and inner relations have same valid value
* check matching of user mapping before calling GetForeignJoinPaths, rather than serverid

> One of my concerns about this patch is that it's got a lot of stuff in
> it that isn't obviously related to the patch.  Anything that is a
> separate change should be separated out into its own patch.  Perhaps
> you can post a set of patches that apply one on top of the next, with
> the changes for each one clearly separated.

IIUC, each patch should not break compilation, and should contain only one complete logical change which can't be
separatedinto pieces.  I think whole of the patch is necessary to implement  

>
>> e) Each source relation must not have any local filter
>> Evaluating conditions of join source talbe potentially produces
>> different result in OUTER join cases.  This can be relaxed for the
>> cases when the join is INNER and local filters don't contain any
>> volatile function/operator, but it is left as future enhancement.
>
> I think this restriction is a sign that you're not really doing this
> right. Consider:
>
> (1) SELECT * FROM a LEFT JOIN b ON a.x = b.x AND b.x = 3;
> (2) SELECT * FROM a LEFT JOIN b ON a.x = b.x WHERE b.x = 3;
>
> If you push down the scan of b, you can include the b.x = 3 qual in
> case (1) but not in case (2).  If you push down the join, you can
> include the qual in either case, but you must attach it in the same
> place where it was before.
>
>> One big change about deparsing base relation is aliasing.  This patch
>> adds column alias to SELECT clause even original query is a simple
>> single table SELECT.
>>
>> fdw=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM pgbench_branches b;
>>                                     QUERY PLAN
>> ------------------------------------------------------------------------------------
>> Foreign Scan on public.pgbench_branches b
>>   Output: bid, bbalance, filler
>>   Remote SQL: SELECT bid a9, bbalance a10, filler a11 FROM
>> public.pgbench_branches
>> (3 rows)
>>
>> As you see, every column has alias in format "a%d" with index value
>> derived from pg_attribute.attnum.  Index value is attnum + 8, and the
>> magic number "8" comes from FirstLowInvalidHeapAttributeNumber for the
>> adjustment that makes attribute number of system attributes positive.
>
> Yeah.  I'm not sure this is a good idea.  The column labels are
> pointless at the outermost level.
>
> I'm not sure it isn't a good idea, either, but I have some doubts.

I fixed the patch to not add column alias to remote queries for single table.  This change also reduces amount of
differencesfrom master branch slightly. 


>
>> One thing tricky is "peusdo projection" which is done by
>> deparseProjectionSql for whole-row reference.  This is done by put the
>> query string in FROM subquery and add whole-row reference in outer
>> SELECT clause.  This is done by ExecProjection in 9.4 and older, but
>> we need to do this in postgres_fdw because ExecProjection is not
>> called any more.
>
> What commit changed this?

No commit changed this behavior, as Kaigai-san says.  If you still have comments, please refer my response to
Kaigai-san.

>
> Thanks for your work on this.  Although I know progress has been slow,
> I think this work is really important to the project.

I agree.  I’ll take more time for this work.

--
Shigeru HANADA

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Shigeru Hanada

Дата:

28 мая 2015 г., 08:20:40

2015-05-22 18:37 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> 2015-05-21 23:11 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
>> On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
>> <shigeru.hanada@gmail.com> wrote:
>>> d) All relations must have the same effective user id
>>> This check is done with userid stored in PgFdwRelationInfo, which is
>>> valid only when underlying relations have the same effective user id.
>>> Here "effective user id" means id of the user executing the query, or
>>> the owner of the view when the foreign table is accessed via view.
>>> Tom suggested that it is necessary to check that user mapping matches
>>> for security reason, and now I think it's better than checking
>>> effective user as current postgres_fdw does.
>>
>> So, should this be a separate patch?

I wrote patches for this issue.  Let me describe their design.

To require the matching of user mapping between two relations (base or
join) which involves foreign tables, it would require these stuffs:

a) Add new field umid to RelOptInfo to hold OID of user mapping used
for the relation and children
b) Propagate umid up to join relation only when both outer and inner
have the save valid values
c) Check matching of umid between two relations to be joined before
calling GetForeignJoinPaths

For a), adding an OID field would not be a serious burden.  Obtaining
the OID of user mapping can be accomplished by calling GetUserMapping
in get_relation_info, but it allocates an unnecessary UserMapping
object, so I added GetUserMappingId which just returns OID.

One concern about getting user mapping is checkAsUser.  Currently FDW
authors are required to consider which user is valid as argument of
GetUserMapping, because it is different from the result of GetUserId
when the target relation is accessed via a view which is owned by
another user.  This requirement would be easily ignored by FDW authors
and the ignorance causes terrible security issues.  So IMO it should
be hidden in the core code, and FDW authors should use user mappings
determined by the core.  This would break FDW I/F, so we can't change
it right now, but making GetUserMapping obsolete and mention it in the
documents would guide FDW authors to the right direction.

For b), such check should be done in build_join_rel, similarly to serverid.

For c), we can reuse the check about RelOptInfo#fdwroutine in
add_paths_to_joinrel, because all of serverid, umid and fdwroutine are
valid only when both the servers and the user mappings match between
outer and inner relations.

Attached are the patches which implement the idea above except
checkAsUser issue.
usermapping_matching.patch: check matching of user mapping OIDs
add_GetUserMappingById.patch: add helper function which is handy for
FDWs to obtain UserMapping
foreign_join_v16.patch: postgres_fdw which assumes user mapping
matching is done in core

Another idea about a) is to have an entire UserMapping object for each
RelOptInfo, but it would require UserMapping to have extra field of
its Oid, say umid, to compare them by OID.  But IMO creating
UserMapping for every RelOptInfo seems waste of space and time.

Thoughts?
--
Shigeru HANADA

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

20 июля 2015 г., 10:10:04

Hi Hanada-san,

I have reviewed usermapping patch and here are some comments.

The patch applies cleanly on Head and compiles without problem. The make check in regression folder does not show any failure.

In find_user_mapping(), if the first cache search returns a valid tuple, it is checked twice for validity, un-necessarily. Instead if the first search returns a valid tuple, it should be returned immediately. I see that this code was copied from GetUserMapping(), where as well it had the same problem.

In build_join_rel(), we check whether user mappings of the two joining relations are same. If the user mappings are same, doesn't that imply that the foreign server and local user are same too?

Rest of the patch looks fine.

On Thu, May 28, 2015 at 10:50 AM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:

2015-05-22 18:37 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> 2015-05-21 23:11 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
>> On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
>> <shigeru.hanada@gmail.com> wrote:
>>> d) All relations must have the same effective user id
>>> This check is done with userid stored in PgFdwRelationInfo, which is
>>> valid only when underlying relations have the same effective user id.
>>> Here "effective user id" means id of the user executing the query, or
>>> the owner of the view when the foreign table is accessed via view.
>>> Tom suggested that it is necessary to check that user mapping matches
>>> for security reason, and now I think it's better than checking
>>> effective user as current postgres_fdw does.
>>
>> So, should this be a separate patch?

I wrote patches for this issue. Let me describe their design.

To require the matching of user mapping between two relations (base or
join) which involves foreign tables, it would require these stuffs:

a) Add new field umid to RelOptInfo to hold OID of user mapping used
for the relation and children
b) Propagate umid up to join relation only when both outer and inner
have the save valid values
c) Check matching of umid between two relations to be joined before
calling GetForeignJoinPaths

For a), adding an OID field would not be a serious burden. Obtaining
the OID of user mapping can be accomplished by calling GetUserMapping
in get_relation_info, but it allocates an unnecessary UserMapping
object, so I added GetUserMappingId which just returns OID.

One concern about getting user mapping is checkAsUser. Currently FDW
authors are required to consider which user is valid as argument of
GetUserMapping, because it is different from the result of GetUserId
when the target relation is accessed via a view which is owned by
another user. This requirement would be easily ignored by FDW authors
and the ignorance causes terrible security issues. So IMO it should
be hidden in the core code, and FDW authors should use user mappings
determined by the core. This would break FDW I/F, so we can't change
it right now, but making GetUserMapping obsolete and mention it in the
documents would guide FDW authors to the right direction.

For b), such check should be done in build_join_rel, similarly to serverid.

For c), we can reuse the check about RelOptInfo#fdwroutine in
add_paths_to_joinrel, because all of serverid, umid and fdwroutine are
valid only when both the servers and the user mappings match between
outer and inner relations.

Attached are the patches which implement the idea above except
checkAsUser issue.
usermapping_matching.patch: check matching of user mapping OIDs
add_GetUserMappingById.patch: add helper function which is handy for
FDWs to obtain UserMapping
foreign_join_v16.patch: postgres_fdw which assumes user mapping
matching is done in core

Another idea about a) is to have an entire UserMapping object for each
RelOptInfo, but it would require UserMapping to have extra field of
its Oid, say umid, to compare them by OID. But IMO creating
UserMapping for every RelOptInfo seems waste of space and time.

Thoughts?
--
Shigeru HANADA

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Shigeru Hanada

Дата:

04 августа 2015 г., 11:50:56

Hi Ashutosh,

Sorry for leaving the thread.

2015-07-20 16:09 GMT+09:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
> In find_user_mapping(), if the first cache search returns a valid tuple, it
> is checked twice for validity, un-necessarily. Instead if the first search
> returns a valid tuple, it should be returned immediately. I see that this
> code was copied from GetUserMapping(), where as well it had the same
> problem.

Oops.  I changed find_user_mapping to exit immediately when any valid
cache was found.

> In build_join_rel(), we check whether user mappings of the two joining
> relations are same. If the user mappings are same, doesn't that imply that
> the foreign server and local user are same too?

Yes, validity of umid is identical to serverid.  We can remove the
check for serverid for some cycles.  One idea is to put Assert for
serverid inside the if-statement block.

> Rest of the patch looks fine.

Thanks

-- 
Shigeru HANADA

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

19 августа 2015 г., 15:40:54

On Tue, Aug 4, 2015 at 2:20 PM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:

Hi Ashutosh,

Sorry for leaving the thread.

2015-07-20 16:09 GMT+09:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
> In find_user_mapping(), if the first cache search returns a valid tuple, it
> is checked twice for validity, un-necessarily. Instead if the first search
> returns a valid tuple, it should be returned immediately. I see that this
> code was copied from GetUserMapping(), where as well it had the same
> problem.

Oops. I changed find_user_mapping to exit immediately when any valid
cache was found.

Thanks.

> In build_join_rel(), we check whether user mappings of the two joining
> relations are same. If the user mappings are same, doesn't that imply that
> the foreign server and local user are same too?

Yes, validity of umid is identical to serverid. We can remove the
check for serverid for some cycles. One idea is to put Assert for
serverid inside the if-statement block.

> Rest of the patch looks fine.

Thanks

I started reviewing the other patches.

In patch foreign_join_v16.patch, the user mapping structure being passed to GetConnection() is the one obtained from GetUserMappingById(). GetUserMappingById() constructs the user mapping structure from the user mapping catalog. For public user mappings, catalog entries have InvalidOid as userid. Thus, with this patch there is a chance that userid in UserMapping being passed to GetConnection(), contains InvalidOid as userid. This is not the case today. The UserMapping structure constructed using GetUserMapping(Oid userid, Oid serverid) (which ultimately gets passed to GetConnection()), has the passed in userid and not the one in the catalog. Is this change intentional?

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

02 декабря 2015 г., 14:45:47

Hi All,
It's been a long time since last patch on this thread was posted. I have started
to work on supporting join pushdown for postgres_fdw. Attached please find three
patches
1. pg_fdw_core.patch: changes in core related to user mapping handling, GUC
                    enable_foreignjoin
2. pg_fdw_join.patch: postgres_fdw changes for supporting join pushdown
3. pg_join_pd.patch: patch which combines both of these for easy testing.

This mail describes the high level changes included in the patch.

GUC enable_foreignjoin
======================
Like enable_*join GUCs, this GUC when ON allows planner to consider pushing down
joins to the foreign server. If OFF, foreign join paths will not be considered.

Building joinrel for joins involving foreign relations
======================================================
RelOptInfo gets a new field umid for user mapping id used for the given foreign
relation. For base relations it's set to the user mapping of effective user and
foreign server for that relation. While building RelOptInfo for a join between
two foreign relations, if server and user mapping of joining relations are same,
they are copied to the RelOptInfo of the join relation being built. Also,
fdwroutine is set in joinrel to indicate that the core code considers this join
as pushable. It should suffice just to check equality of the user mapping oid
since same user mapping implies same server.

Since user mapping oid is available in RelOptInfo, FDWs can get user
mapping information by using this oid, new function GetUserMappingById()
facilitates that.

Generating paths
================
If fdwroutine is set in joinrel, add_paths_to_joinrel() calls
GetForeignJoinPaths hook of corresponding FDW. For postgres_fdw the hook is
implemented by function postgresGetForeignJoinPaths(). Follows the description
of this function.

In order to avoid considering same joinrel again, fdw_private member of
RelOptInfo is set by this function and populated with FDW specific information
about joinrel. When this member is set, it indicates that the pushable paths
have been considered for the given joinrel.

A join between two foreign relations is considered safe to push down if
1. The joining sides are pushable
2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and ANTI
   joins are not considered right now, because of difficulties in constructing
   the queries involving those. The join clauses of SEMI/ANTI joins are not in a
   form that can be readily converted to IN/EXISTS/NOT EXIST kind of expression.
   We might consider this as future optimization.
3. Joining sides do not have clauses which can not be pushed down to the foreign
   server. For an OUTER join this is important since those clauses need to be
   applied before performing the join and thus join can not be pushed to the
   foreign server. An example is
   SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2 ON (join clause)
   Here the local_cond on ft2 needs to be executed before performing LEFT JOIN
   between ft1 and ft2.
   This condition can be relaxed for an INNER join by pulling the local clauses
   up the join tree. But this needs more investigation and is not considered in
   this version.
4. The join conditions (e.g. conditions in ON clause) are all safe to push down.
   This is important for OUTER joins as pushing down join clauses partially and
   applying rest locally changes the result. There are ways [1] by which partial
   OUTER join can be completed by applying unpushable clauses locally and then
   nullifying the nullable side and eliminating duplicate non-nullable side
   rows. But that's again out of scope of first version of postgres_fdw join
   pushdown.

A ForeignPath is created for a safe-to-push-down join. Recursively applying this
procedure ends in having a single ForeignPath node for whole pushable join tree,
which may represent a join between more than two foreign tables.

Generating plan
===============
postgresGetForeignPlan() is used to create plan from path chosen by the
optimizer for both joins and base relation scans. This function constructs the SQL to
be sent to the remote node and create ForeignPlan node.

The function first separates given scan_clauses into pushable and safe-to-push
clauses. For joinrel baserestrictlist is NIL and we are not considering
parameterized paths, so there are no scan clauses expected. Next the function
constructs the SQL to be sent to the foreign server. Then it constructs the
ForeignScan node. Rest of this section describes the logic to construct the SQL
for join; the logic is implemented as function deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for given joinrel (and now for baserel
asd well) recursively.
For joinrels
1. it constructs SQL representing either side of join, by calling itself
   in recursive fashion.
2. These SQLs are converted into subqueries and become part of the FROM clause
   with appropriate JOIN type and clauses. The left and right subqueries are
   given aliases "l" and "r" respectively. The columns in each subquery are
   aliased as "a1", "a2", "a3" and so on. Thus the third column on left side can
   be referenced as "l.a3" at any recursion level.
3. Targetlist is added representing the columns in the join result expected at
   that level.
4. The join clauses are added as part of ON clause
5. Any clauses that planner has deemed fit to be evaluated at that level of join
   are added as part of WHERE clause.

It uses the same old logic for deparsing SQL for base relations, except for the
deparsing the targetlist. When plan is being constructed only for a base
relation, the targetlist (SELECT clause) is constructed by including all the
columns by looking at attrs_used. This does not work when the base relation is
part of the join being pushed down, since the join targetlist depend upon the
targetlist in RelOptInfo of base relation and not necessarily the targetlist
obtained from attrs_used.

Row marks
---------
Because of recursive nature of SQL, the names of relations referenced in row
marks are not available at the top level in SQL built for a given top level join
relation. Hence we have to add FOR SHARE/UPDATE clauses to the subqueries built
for foreign base relations. This causes all the rows participating in join (not
the join result) from the base relations to get locked (on foreign server).
Ideally for a top level row mark clause, we should be locking only those rows
(on foreign server) which are part of the top level join result. But that
requires flattening of the FROM clause constructed, which would require some
signficant intelligence in the deparser code. I have left this out of scope for
at least this version of the patch.

Examples of remote SQL can be found in the expected output of regression test postgres_fdw.sql.

Foreign plan execution
======================
For pushed down joins, the tuple descriptor for the result is obtained from the
targetlist of the plan. Accordingly the error callback and result handling has
been modified.

Explain output for a pushed down join provides the join expression that it
represents.

TODOs
=====
This patch is very much WIP patch to show case the approach and invite early
comments. I will continue to improve the patch and some of the areas that will
be improved are
1. Costing of foreign join paths.
2. Various TODOs in the patch, making it more readable, finishing etc.
3. Tests
4. Any comments/suggestions on approach or the attached patch.

In another thread Robert, Fujita-san and Kaigai-san are discussing about
EvalPlanQual support for foreign joins. Corresponding changes to postgres_fdw
will need to be added once those changes get committed.

Items that will be considered in subsequent patches for 9.6
===========================================================
1. Parameterized paths for join: For regular joins, the parameterized paths for
   joins consider the parameterization of the joining paths. In case of foreign
   join, we do not consider any specific paths for the joining foreign
   relations and it acts as a scan. It should be treated as if parameterizing a
   scan and not like regular join. This requires some work and even without that
   the functionality supported by this patch is quite useful. So, I have left it
   out of the scope for this patch and will be considered in subsequent patch.

2. Foreign join paths with pathkeys: There's my patch pending for considering
   pathkeys for foreign base relation scan. The support for foreign join paths
   with pathkeys will added once that patch gets committed.

Suggestions/comments are welcome.
--

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

03 декабря 2015 г., 10:08:30

Hi Ashutosh,

On 2015/12/02 20:45, Ashutosh Bapat wrote:
> It's been a long time since last patch on this thread was posted. I have
> started
> to work on supporting join pushdown for postgres_fdw.

Thanks for the work!

> Generating paths
> ================

> A join between two foreign relations is considered safe to push down if
> 1. The joining sides are pushable
> 2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and ANTI
>     joins are not considered right now, because of difficulties in
> constructing
>     the queries involving those. The join clauses of SEMI/ANTI joins are
> not in a
>     form that can be readily converted to IN/EXISTS/NOT EXIST kind of
> expression.
>     We might consider this as future optimization.
> 3. Joining sides do not have clauses which can not be pushed down to the
> foreign
>     server. For an OUTER join this is important since those clauses need
> to be
>     applied before performing the join and thus join can not be pushed
> to the
>     foreign server. An example is
>     SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2
> ON (join clause)
>     Here the local_cond on ft2 needs to be executed before performing
> LEFT JOIN
>     between ft1 and ft2.
>     This condition can be relaxed for an INNER join by pulling the local
> clauses
>     up the join tree. But this needs more investigation and is not
> considered in
>     this version.
> 4. The join conditions (e.g. conditions in ON clause) are all safe to
> push down.
>     This is important for OUTER joins as pushing down join clauses
> partially and
>     applying rest locally changes the result. There are ways [1] by
> which partial
>     OUTER join can be completed by applying unpushable clauses locally
> and then
>     nullifying the nullable side and eliminating duplicate non-nullable side
>     rows. But that's again out of scope of first version of postgres_fdw
> join
>     pushdown.

As for 4, as commented in the patch, we could relax the requirement that 
all the join conditions (given by JoinPathExtraData's restrictlist) need 
to be safe to push down to the remote server;
* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be 
safe, while I think all the "joinclauses" need to be safe to get the 
right results (where "joinclauses" and "otherclauses" are defined by 
extract_actual_join_clauses).  And I think we should do this relaxation 
to some extent for 9.6, to allow more joins to be pushed down.  I don't 
know about [1].  May I see more information about [1]?

> Generating plan
> ===============

> Rest of this section describes the logic to construct
> the SQL
> for join; the logic is implemented as function deparseSelectSqlForRel().
>
> deparseSelectSqlForRel() builds the SQL for given joinrel (and now for
> baserel
> asd well) recursively.
> For joinrels
> 1. it constructs SQL representing either side of join, by calling itself
>     in recursive fashion.
> 2. These SQLs are converted into subqueries and become part of the FROM
> clause
>     with appropriate JOIN type and clauses. The left and right
> subqueries are
>     given aliases "l" and "r" respectively. The columns in each subquery are
>     aliased as "a1", "a2", "a3" and so on. Thus the third column on left
> side can
>     be referenced as "l.a3" at any recursion level.
> 3. Targetlist is added representing the columns in the join result
> expected at
>     that level.
> 4. The join clauses are added as part of ON clause
> 5. Any clauses that planner has deemed fit to be evaluated at that level
> of join
>     are added as part of WHERE clause.

Honestly, I'm not sure that that is a good idea.  One reason for that is 
that a query string constructed by the procedure is difficult to read 
especially when the procedure is applied recursively.  So, I'm thinking 
to revise the procedure so as to construct a query string with a 
flattened FROM clause, as discussed in eg, [2].

> TODOs
> =====
> This patch is very much WIP patch to show case the approach and invite early
> comments. I will continue to improve the patch and some of the areas
> that will
> be improved are
> 1. Costing of foreign join paths.
> 2. Various TODOs in the patch, making it more readable, finishing etc.
> 3. Tests
> 4. Any comments/suggestions on approach or the attached patch.

That would be great!

> In another thread Robert, Fujita-san and Kaigai-san are discussing about
> EvalPlanQual support for foreign joins. Corresponding changes to
> postgres_fdw
> will need to be added once those changes get committed.

Yeah, we would need those changes including helper functions to create a 
local join execution plan for that support.  I'd like to add those 
changes to your updated patch if it's okay.

Best regards,
Etsuro Fujita

[2] 
http://www.postgresql.org/message-id/CA+TgmoZH9PB8BC+Z3rE7wo8CwuxAF7VP3066iSG39QfR1jJ+UQ@mail.gmail.com

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

08 декабря 2015 г., 11:28:00

On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

Hi Ashutosh,

On 2015/12/02 20:45, Ashutosh Bapat wrote:
It's been a long time since last patch on this thread was posted. I have
started
to work on supporting join pushdown for postgres_fdw.

Thanks for the work!

Generating paths
================

A join between two foreign relations is considered safe to push down if
1. The joining sides are pushable
2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and ANTI
joins are not considered right now, because of difficulties in
constructing
the queries involving those. The join clauses of SEMI/ANTI joins are
not in a
form that can be readily converted to IN/EXISTS/NOT EXIST kind of
expression.
We might consider this as future optimization.
3. Joining sides do not have clauses which can not be pushed down to the
foreign
server. For an OUTER join this is important since those clauses need
to be
applied before performing the join and thus join can not be pushed
to the
foreign server. An example is
SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2
ON (join clause)
Here the local_cond on ft2 needs to be executed before performing
LEFT JOIN
between ft1 and ft2.
This condition can be relaxed for an INNER join by pulling the local
clauses
up the join tree. But this needs more investigation and is not
considered in
this version.
4. The join conditions (e.g. conditions in ON clause) are all safe to
push down.
This is important for OUTER joins as pushing down join clauses
partially and
applying rest locally changes the result. There are ways [1] by
which partial
OUTER join can be completed by applying unpushable clauses locally
and then
nullifying the nullable side and eliminating duplicate non-nullable side
rows. But that's again out of scope of first version of postgres_fdw
join
pushdown.

As for 4, as commented in the patch, we could relax the requirement that all the join conditions (given by JoinPathExtraData's restrictlist) need to be safe to push down to the remote server;
* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be safe, while I think all the "joinclauses" need to be safe to get the right results (where "joinclauses" and "otherclauses" are defined by extract_actual_join_clauses). And I think we should do this relaxation to some extent for 9.6, to allow more joins to be pushed down.

agreed. I will work on those.

I don't know about [1]. May I see more information about [1]?

Generating plan
===============

Rest of this section describes the logic to construct
the SQL
for join; the logic is implemented as function deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for given joinrel (and now for
baserel
asd well) recursively.
For joinrels
1. it constructs SQL representing either side of join, by calling itself
in recursive fashion.
2. These SQLs are converted into subqueries and become part of the FROM
clause
with appropriate JOIN type and clauses. The left and right
subqueries are
given aliases "l" and "r" respectively. The columns in each subquery are
aliased as "a1", "a2", "a3" and so on. Thus the third column on left
side can
be referenced as "l.a3" at any recursion level.
3. Targetlist is added representing the columns in the join result
expected at
that level.
4. The join clauses are added as part of ON clause
5. Any clauses that planner has deemed fit to be evaluated at that level
of join
are added as part of WHERE clause.

Honestly, I'm not sure that that is a good idea. One reason for that is that a query string constructed by the procedure is difficult to read especially when the procedure is applied recursively. So, I'm thinking to revise the procedure so as to construct a query string with a flattened FROM clause, as discussed in eg, [2].

Just to confirm, the hook discussed in [2] is not in place right? I can find only one hook for foreign join
50 typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
51                                                           RelOptInfo *joinrel,
52                                                         RelOptInfo *outerrel,
53                                                         RelOptInfo *innerrel,
54                                                           JoinType jointype,
55                                                    JoinPathExtraData *extra);

This hook takes an inner and outer relation, so can not be used for N-way join as discussed in that thread.

Are you suggesting that we should add that hook before we implement join pushdown in postgres_fdw? Am I missing something?

TODOs
=====
This patch is very much WIP patch to show case the approach and invite early
comments. I will continue to improve the patch and some of the areas
that will
be improved are
1. Costing of foreign join paths.
2. Various TODOs in the patch, making it more readable, finishing etc.
3. Tests
4. Any comments/suggestions on approach or the attached patch.

That would be great!

In another thread Robert, Fujita-san and Kaigai-san are discussing about
EvalPlanQual support for foreign joins. Corresponding changes to
postgres_fdw
will need to be added once those changes get committed.

Yeah, we would need those changes including helper functions to create a local join execution plan for that support. I'd like to add those changes to your updated patch if it's okay.

Right now, we do not have any support for postgres_fdw join pushdown. I was thinking of adding at least minimal support for the same using this patch, may be by preventing join pushdown in case there are row marks for now. That way, we at least have some way to play with postgres_fdw join pushdown. Once we have that, we can work on remaining items listed for 9.6 and also you can add suport for row marks with fix for EvalPlanQual independently. This will keep the first patch smaller. Do you agree or you want to see EvalPlanQual fix to be in the first patch itself?

Best regards,
Etsuro Fujita

[2] http://www.postgresql.org/message-id/CA+TgmoZH9PB8BC+Z3rE7wo8CwuxAF7VP3066iSG39QfR1jJ+UQ@mail.gmail.com

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

08 декабря 2015 г., 14:43:31

On 2015/12/08 17:27, Ashutosh Bapat wrote:
> On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>         Generating paths
>         ================

>         A join between two foreign relations is considered safe to push
>         down if

>         4. The join conditions (e.g. conditions in ON clause) are all
>         safe to
>         push down.
>              This is important for OUTER joins as pushing down join clauses
>         partially and
>              applying rest locally changes the result. There are ways [1] by
>         which partial
>              OUTER join can be completed by applying unpushable clauses
>         locally
>         and then
>              nullifying the nullable side and eliminating duplicate
>         non-nullable side
>              rows. But that's again out of scope of first version of
>         postgres_fdw
>         join
>              pushdown.

>     As for 4, as commented in the patch, we could relax the requirement
>     that all the join conditions (given by JoinPathExtraData's
>     restrictlist) need to be safe to push down to the remote server;
>     * In case of inner join, all the conditions would not need to be safe.
>     * In case of outer join, all the "otherclauses" would not need to be
>     safe, while I think all the "joinclauses" need to be safe to get the
>     right results (where "joinclauses" and "otherclauses" are defined by
>     extract_actual_join_clauses).  And I think we should do this
>     relaxation to some extent for 9.6, to allow more joins to be pushed
>     down.

> agreed. I will work on those.

Great!

>         Generating plan
>         ===============

>         Rest of this section describes the logic to construct
>         the SQL
>         for join; the logic is implemented as function
>         deparseSelectSqlForRel().
>
>         deparseSelectSqlForRel() builds the SQL for given joinrel (and
>         now for
>         baserel
>         asd well) recursively.
>         For joinrels
>         1. it constructs SQL representing either side of join, by
>         calling itself
>              in recursive fashion.
>         2. These SQLs are converted into subqueries and become part of
>         the FROM
>         clause
>              with appropriate JOIN type and clauses. The left and right
>         subqueries are
>              given aliases "l" and "r" respectively. The columns in each
>         subquery are
>              aliased as "a1", "a2", "a3" and so on. Thus the third
>         column on left
>         side can
>              be referenced as "l.a3" at any recursion level.
>         3. Targetlist is added representing the columns in the join result
>         expected at
>              that level.
>         4. The join clauses are added as part of ON clause
>         5. Any clauses that planner has deemed fit to be evaluated at
>         that level
>         of join
>              are added as part of WHERE clause.

>     Honestly, I'm not sure that that is a good idea.  One reason for
>     that is that a query string constructed by the procedure is
>     difficult to read especially when the procedure is applied
>     recursively.  So, I'm thinking to revise the procedure so as to
>     construct a query string with a flattened FROM clause, as discussed
>     in eg, [2].

> Just to confirm, the hook discussed in [2] is not in place right? I can
> find only one hook for foreign join
>   50 typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
>   51
> RelOptInfo *joinrel,
>   52                                                         RelOptInfo
> *outerrel,
>   53                                                         RelOptInfo
> *innerrel,
>   54                                                           JoinType
> jointype,
>   55
> JoinPathExtraData *extra);
> This hook takes an inner and outer relation, so can not be used for
> N-way join as discussed in that thread.
>
> Are you suggesting that we should add that hook before we implement join
> pushdown in postgres_fdw? Am I missing something?

I don't mean it.  I'm thinking that I'll just revise the procedure so as 
to generate a FROM clause that is something like "from c left join d on 
(...) full join e on (...)" based on the existing hook you mentioned.

>         TODOs
>         =====

>         In another thread Robert, Fujita-san and Kaigai-san are
>         discussing about
>         EvalPlanQual support for foreign joins. Corresponding changes to
>         postgres_fdw
>         will need to be added once those changes get committed.

>     Yeah, we would need those changes including helper functions to
>     create a local join execution plan for that support.  I'd like to
>     add those changes to your updated patch if it's okay.

> Right now, we do not have any support for postgres_fdw join pushdown. I
> was thinking of adding at least minimal support for the same using this
> patch, may be by preventing join pushdown in case there are row marks
> for now. That way, we at least have some way to play with postgres_fdw
> join pushdown. Once we have that, we can work on remaining items listed
> for 9.6 and also you can add suport for row marks with fix for
> EvalPlanQual independently. This will keep the first patch smaller. Do
> you agree or you want to see EvalPlanQual fix to be in the first patch
> itself?

IMO I want to see the EvalPlanQual fix in the first version for 9.6.

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

10 декабря 2015 г., 20:50:12

On Tue, Dec 8, 2015 at 6:40 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> IMO I want to see the EvalPlanQual fix in the first version for 9.6.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

11 декабря 2015 г., 01:11:49

On Wed, Dec 2, 2015 at 6:45 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> It's been a long time since last patch on this thread was posted. I have
> started
> to work on supporting join pushdown for postgres_fdw. Attached please find
> three
> patches
> 1. pg_fdw_core.patch: changes in core related to user mapping handling, GUC
>                       enable_foreignjoin
> 2. pg_fdw_join.patch: postgres_fdw changes for supporting join pushdown

It seems useful to break things up this way.  However, I'm not sure we
want an enable_foreignjoin GUC; in fact, I think we probably don't.
If we want to have a way to control this, postgres_fdw can provide a
custom GUC or FDW option for that.

And to be honest, I haven't really been able to understand why join
pushdown needs changes to user mapping handling.  Just hypothetically
speaking, if I put my foot down and said we're not committing any of
that stuff, how and why would that impact our ability to have join
pushdown in 9.6?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

11 декабря 2015 г., 08:16:36

On Thu, Dec 10, 2015 at 11:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Dec 8, 2015 at 6:40 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> IMO I want to see the EvalPlanQual fix in the first version for 9.6.

+1.

I think there is still a lot functionality that is offered without EvalPlanQual fix. As long as we do not push joins when there are RowMarks involved, implementation of that hook is not required. We won't be able to push down joins for DMLs and when there are FOR SHARE/UPDATE clauses in the query. And there are huge number of queries, which will be benefitted by the push down even without that support. There's nothing in this patch, which comes in way of implementing the EvalPlanQual fix. It can be easily added after committing the first version. On the other hand, getting minimal (it's not really minimal, it's much more than that) support for postgres_fdw support committed opens up possibility to work on multiple items (as listed in my mail) in parallel.

I am not saying that we do not need EvalPlanQual fix in 9.6. But it's not needed in the first cut. If we get the first cut in first couple of months of 2016, there's plenty of room for the fix to go in 9.6. It would be really bad situation if we could not get postgres_fdw join pushdown supported in 9.6 because EvalPlanQual hook could not be committed while the rest of the code is ready. EvalPlanQual fix in core was being discussed since April 2015. It took 8 months to get that fixed. Hopefully we won't need that long to implement the hook in postgres_fdw, but that number says something about the complexity of the feature.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

11 декабря 2015 г., 09:34:18

On Fri, Dec 11, 2015 at 3:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Dec 2, 2015 at 6:45 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> It's been a long time since last patch on this thread was posted. I have
> started
> to work on supporting join pushdown for postgres_fdw. Attached please find
> three
> patches
> 1. pg_fdw_core.patch: changes in core related to user mapping handling, GUC
> enable_foreignjoin
> 2. pg_fdw_join.patch: postgres_fdw changes for supporting join pushdown

It seems useful to break things up this way. However, I'm not sure we
want an enable_foreignjoin GUC; in fact, I think we probably don't.
If we want to have a way to control this, postgres_fdw can provide a
custom GUC or FDW option for that.

enable_foreignjoin or its FDW counterpart would be useful for debugging purposes just like enable_hashjoin/enable_mergejoin etc. Foreign join push down can be viewed as a join strategy just like merge/nest loop/hash join etc. Having enable_foreignjoin complements that picture. Users find more usage of the same. A user running multiple FDWs and needing to disable join pushdown across FDWs for any purpose would find enable_foreignjoin very useful. Needing to turn on/off multiple GUCs would be cumbersome.

And to be honest, I haven't really been able to understand why join
pushdown needs changes to user mapping handling.

Current join pushdown infrastructure in core allows join to be pushed down if both the sides of join come from the same server. Those sides may have different user mappings and thus different user properties/access permissions/visibility on the foreign server. If FDW chooses either of these different user mappings to push down the join, it will get wrong results. So, a join between two foreign relations can not be pushed down if the user mappings on both sides do not match. This has been already discussed in this thread. I am pasting your response to Hanada-san back in May 2015 as hint to the discussion
--
2015-05-21 23:11 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> On Sat, May 16, 2015 at 9:04 AM, Shigeru Hanada
> <shigeru.hanada@gmail.com> wrote:
>> d) All relations must have the same effective user id
>> This check is done with userid stored in PgFdwRelationInfo, which is
>> valid only when underlying relations have the same effective user id.
>> Here "effective user id" means id of the user executing the query, or
>> the owner of the view when the foreign table is accessed via view.
>> Tom suggested that it is necessary to check that user mapping matches
>> for security reason, and now I think it's better than checking
>> effective user as current postgres_fdw does.
>
> So, should this be a separate patch?
--

To add to that, checking user mapping is better than checking the effective user id for multiple reasons. Multiple local users can share same public user mapping, which implies that they share same permissions/visibility for objects on the foreign server. Join involving two sides with different effective local user but same user mapping can be pushed down to the foreign server as same objects/data is going to visible where or not we push the join down.

Just hypothetically
speaking, if I put my foot down and said we're not committing any of
that stuff, how and why would that impact our ability to have join

pushdown in 9.6?

In fact, the question would be what user mapping should be used by the connection on which we are firing join query. Unless we answer that question we won't have join pushdown in 9.6. If we push down joins without taking into consideration the user mapping, we will have all sorts of security/data visibility problems because of wrong user properties used for connection.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

11 декабря 2015 г., 12:51:11

On 2015/12/11 14:16, Ashutosh Bapat wrote:
> On Thu, Dec 10, 2015 at 11:20 PM, Robert Haas <robertmhaas@gmail.com
> <mailto:robertmhaas@gmail.com>> wrote:

>     On Tue, Dec 8, 2015 at 6:40 AM, Etsuro Fujita
>     <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>>
>     wrote:
>     > IMO I want to see the EvalPlanQual fix in the first version for 9.6.
>
>     +1.

> I think there is still a lot functionality that is offered without
> EvalPlanQual fix. As long as we do not push joins when there are
> RowMarks involved, implementation of that hook is not required. We won't
> be able to push down joins for DMLs and when there are FOR SHARE/UPDATE
> clauses in the query. And there are huge number of queries, which will
> be benefitted by the push down even without that support. There's
> nothing in this patch, which comes in way of implementing the
> EvalPlanQual fix. It can be easily added after committing the first
> version. On the other hand, getting minimal (it's not really minimal,
> it's much more than that) support for postgres_fdw support committed
> opens up possibility to work on multiple items (as listed in my mail) in
> parallel.

> I am not saying that we do not need EvalPlanQual fix in 9.6. But it's
> not needed in the first cut. If we get the first cut in first couple of
> months of 2016, there's plenty of room for the fix to go in 9.6. It
> would be really bad situation if we could not get postgres_fdw join
> pushdown supported in 9.6 because EvalPlanQual hook could not be
> committed while the rest of the code is ready. EvalPlanQual fix in core
> was being discussed since April 2015. It took 8 months to get that
> fixed. Hopefully we won't need that long to implement the hook in
> postgres_fdw, but that number says something about the complexity of the
> feature.

ISTM that further enhancements are of secondary importance. Let's do the 
EvalPlanQual fix first. I'll add the RecheckForeignScan callback routine 
to your version of the postgres_fdw patch as soon as possible.

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

11 декабря 2015 г., 16:56:09

On Fri, Dec 11, 2015 at 12:16 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> +1.
>>
> I think there is still a lot functionality that is offered without
> EvalPlanQual fix.

Sure.  But I think that the EvalPlanQual-related fixes might have some
impact on the overall design, and I don't want to commit this with one
design and then have to revise it because we didn't examine the
EvalPlanQual requirements carefully enough.  We've already been down
that path once, and I don't want to go back.  It's not always possible
to get the design right the first time, but it's definitely nicer when
you do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

18 января 2016 г., 13:46:44

Hi All,

PFA patches for postgres_fdw join pushdown, taken care of all TODOs in my last mail.

Here is the list of things that have been improved/added new as compared to Hanada-san's previous patch at [1].

1. Condition handling for join
Patch in [1] allowed a foreign join to be pushed down if only all the conditions were safe to push down to the foreign server. This patch differentiates these conditions into 1. conditions to be applied while joining (ON clause) 2. conditions to be applied after joining (WHERE clause). For a join to be safe to pushdown, only conditions in 1 need to be all safe to pushdown. The conditions in second category, which are not safe to be pushed down can be applied locally. This allows more avenue for join pushdown. For an INNER join all the conditions can be applied on the cross product. Hence we can push down an INNER join even if one or more of the conditions are not safe to be pushed down. This patch includes the optimization as well.

2. Targetlist handling:
The columns required to evaluate the non-pushable conditions on a join relation need to be fetched from the foreign server. In previous patch the SELECT clauses were built from rel->reltargetlist, which doesn't contain these columns. This patch includes those columns as well.

3. Column projection:
Earlier patch required another layer of SQL to project whole-row attribute from a base relation. This patch takes care of that while constructing and deparsing
targetlist. This reduces the complexity and length of the query to be sent to the foreign server e.g.

With the projection in previous patch the query looked like
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;

QUERY PLAN

... explain output clipped
               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))

With this patch it looks like
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
                                               QUERY PLAN
... explain output clipped
               Remote SQL: SELECT l.a3, l.a4, l.a1, l.a2, r.a2 FROM (SELECT "C 1", c3, ctid, ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") l (a1, a2, a3, a4) INNER JOIN (SELECT "C 1", ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") r (a1, a2) ON (TRUE) WHERE ((l.a1 = r.a1))
(9 rows)

4. Local cost estimation
Previous patch had a TODO left for estimating join cost locally, when use_remote_estimate is false. This patch adds support for the same. The relevant
discussion in mail thread [2], [3].

5. This patch adds a GUC enable_foreignjoin to enable or disable join pushdown through core.

6. Added more tests to test lateral references, unsafe to push conditions at various places in the query,

Many cosmetic improvements like adding static function declarations, comment improvements and making code readable.

[1] http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_waGh2rgZPg0o4pqgD+iauYaj8wTze+CYJUHg@mail.gmail.com
[2] http://www.postgresql.org/message-id/CAFjFpRcqSwUs+tb5iyp1M3c-w0k3xaB6H5mw4+N2q2iuAfSzKA@mail.gmail.com
[3] http://www.postgresql.org/message-id/CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyKSB6wGk5g@mail.gmail.com

I will be working next on (in that order)

1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)

2. Pushing down ORDER BY clause along with join pushdown

3. Parameterization of foreign join paths (Given the complexity of the feature this may not make it into 9.6)

--

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Thom Brown

Дата:

18 января 2016 г., 14:41:21

On 18 January 2016 at 10:46, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Hi All,
> PFA patches for postgres_fdw join pushdown, taken care of all TODOs in my
> last mail.
>
> Here is the list of things that have been improved/added new as compared to
> Hanada-san's previous patch at [1].
>
> 1. Condition handling for join
> Patch in [1] allowed a foreign join to be pushed down if only all the
> conditions were safe to push down to the foreign server. This patch
> differentiates these conditions into 1. conditions to be applied while
> joining (ON clause) 2. conditions to be applied after joining (WHERE
> clause). For a join to be safe to pushdown, only conditions in 1 need to be
> all safe to pushdown. The conditions in second category, which are not safe
> to be pushed down can be applied locally. This allows more avenue for join
> pushdown. For an INNER join all the conditions can be applied on the cross
> product. Hence we can push down an INNER join even if one or more of the
> conditions are not safe to be pushed down. This patch includes the
> optimization as well.
>
> 2. Targetlist handling:
> The columns required to evaluate the non-pushable conditions on a join
> relation need to be fetched from the foreign server. In previous patch the
> SELECT clauses were built from rel->reltargetlist, which doesn't contain
> these columns. This patch includes those columns as well.
>
> 3. Column projection:
> Earlier patch required another layer of SQL to project whole-row attribute
> from a base relation. This patch takes care of that while constructing and
> deparsing
> targetlist. This reduces the complexity and length of the query to be sent
> to the foreign server e.g.
>
> With the projection in previous patch the query looked like
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
> ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
>                                               QUERY PLAN
> ... explain output clipped
>                Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT
> l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10,
> l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7
> a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN
> (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9
> FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8
> a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
>
> With this patch it looks like
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
> ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
>                                                QUERY PLAN
> ... explain output clipped
>                Remote SQL: SELECT l.a3, l.a4, l.a1, l.a2, r.a2 FROM (SELECT
> "C 1", c3, ctid, ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") l
> (a1, a2, a3, a4) INNER JOIN (SELECT "C 1", ROW("C 1", c2, c3, c4, c5, c6,
> c7, c8) FROM "S 1"."T 1") r (a1, a2) ON (TRUE) WHERE ((l.a1 = r.a1))
> (9 rows)
>
> 4. Local cost estimation
> Previous patch had a TODO left for estimating join cost locally, when
> use_remote_estimate is false. This patch adds support for the same. The
> relevant
> discussion in mail thread [2], [3].
>
> 5. This patch adds a GUC enable_foreignjoin to enable or disable join
> pushdown through core.
>
> 6. Added more tests to test lateral references, unsafe to push conditions at
> various places in the query,
>
> Many cosmetic improvements like adding static function declarations, comment
> improvements and making code readable.
>
> [1]
> http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_waGh2rgZPg0o4pqgD+iauYaj8wTze+CYJUHg@mail.gmail.com
> [2]
> http://www.postgresql.org/message-id/CAFjFpRcqSwUs+tb5iyp1M3c-w0k3xaB6H5mw4+N2q2iuAfSzKA@mail.gmail.com
> [3]
> http://www.postgresql.org/message-id/CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyKSB6wGk5g@mail.gmail.com
>
> I will be working next on (in that order)
> 1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)
> 2. Pushing down ORDER BY clause along with join pushdown
> 3. Parameterization of foreign join paths (Given the complexity of the
> feature this may not make it into 9.6)

It seems you forgot to attach the patch.

Thom

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

18 января 2016 г., 14:48:09

Thanks Thom for bringing it to my notice quickly. Sorry for the same.

Here are the patches.

1. pg_fdw_core_v2.patch: changes in core related to user mapping handling, GUC
enable_foreignjoin
2. pg_fdw_join_v2.patch: postgres_fdw changes for supporting join pushdown
3. pg_join_pd_v2.patch: patch which combines both of these for easy testing.

On Mon, Jan 18, 2016 at 5:10 PM, Thom Brown <thom@linux.com> wrote:

On 18 January 2016 at 10:46, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Hi All,
> PFA patches for postgres_fdw join pushdown, taken care of all TODOs in my
> last mail.
>
> Here is the list of things that have been improved/added new as compared to
> Hanada-san's previous patch at [1].
>
> 1. Condition handling for join
> Patch in [1] allowed a foreign join to be pushed down if only all the
> conditions were safe to push down to the foreign server. This patch
> differentiates these conditions into 1. conditions to be applied while
> joining (ON clause) 2. conditions to be applied after joining (WHERE
> clause). For a join to be safe to pushdown, only conditions in 1 need to be
> all safe to pushdown. The conditions in second category, which are not safe
> to be pushed down can be applied locally. This allows more avenue for join
> pushdown. For an INNER join all the conditions can be applied on the cross
> product. Hence we can push down an INNER join even if one or more of the
> conditions are not safe to be pushed down. This patch includes the
> optimization as well.
>
> 2. Targetlist handling:
> The columns required to evaluate the non-pushable conditions on a join
> relation need to be fetched from the foreign server. In previous patch the
> SELECT clauses were built from rel->reltargetlist, which doesn't contain
> these columns. This patch includes those columns as well.
>
> 3. Column projection:
> Earlier patch required another layer of SQL to project whole-row attribute
> from a base relation. This patch takes care of that while constructing and
> deparsing
> targetlist. This reduces the complexity and length of the query to be sent
> to the foreign server e.g.
>
> With the projection in previous patch the query looked like
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
> ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
> QUERY PLAN
> ... explain output clipped
> Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT
> l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10,
> l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7
> a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN
> (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9
> FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8
> a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))
>
> With this patch it looks like
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)
> ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
> QUERY PLAN
> ... explain output clipped
> Remote SQL: SELECT l.a3, l.a4, l.a1, l.a2, r.a2 FROM (SELECT
> "C 1", c3, ctid, ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") l
> (a1, a2, a3, a4) INNER JOIN (SELECT "C 1", ROW("C 1", c2, c3, c4, c5, c6,
> c7, c8) FROM "S 1"."T 1") r (a1, a2) ON (TRUE) WHERE ((l.a1 = r.a1))
> (9 rows)
>
> 4. Local cost estimation
> Previous patch had a TODO left for estimating join cost locally, when
> use_remote_estimate is false. This patch adds support for the same. The
> relevant
> discussion in mail thread [2], [3].
>
> 5. This patch adds a GUC enable_foreignjoin to enable or disable join
> pushdown through core.
>
> 6. Added more tests to test lateral references, unsafe to push conditions at
> various places in the query,
>
> Many cosmetic improvements like adding static function declarations, comment
> improvements and making code readable.
>
> [1]
> http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_waGh2rgZPg0o4pqgD+iauYaj8wTze+CYJUHg@mail.gmail.com
> [2]
> http://www.postgresql.org/message-id/CAFjFpRcqSwUs+tb5iyp1M3c-w0k3xaB6H5mw4+N2q2iuAfSzKA@mail.gmail.com
> [3]
> http://www.postgresql.org/message-id/CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyKSB6wGk5g@mail.gmail.com
>
> I will be working next on (in that order)
> 1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)
> 2. Pushing down ORDER BY clause along with join pushdown
> 3. Parameterization of foreign join paths (Given the complexity of the
> feature this may not make it into 9.6)

It seems you forgot to attach the patch.

Thom

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

19 января 2016 г., 10:36:07

On 2016/01/18 19:46, Ashutosh Bapat wrote:
> PFA patches for postgres_fdw join pushdown, taken care of all TODOs in
> my last mail.
>
> Here is the list of things that have been improved/added new as compared
> to Hanada-san's previous patch at [1].

Great!  Thank you for working on that!  I'll review the patch.

> I will be working next on (in that order)
> 1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)
> 2. Pushing down ORDER BY clause along with join pushdown
> 3. Parameterization of foreign join paths (Given the complexity of the
> feature this may not make it into 9.6)

As discussed internally, I think #3 might have some impact on the 
overall design of the EvalPlanQual fix, especially the design of a 
helper function that creates a local join execution path for a foreign 
join path for EvalPlanQual.  So, IMO, I think the order is #1, #3, and 
#2 (or #3, #1, #2).

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

20 января 2016 г., 00:39:28

On Wed, Aug 19, 2015 at 8:40 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I started reviewing the other patches.
>
> In patch foreign_join_v16.patch, the user mapping structure being passed to
> GetConnection() is the one obtained from GetUserMappingById().
> GetUserMappingById() constructs the user mapping structure from the user
> mapping catalog. For public user mappings, catalog entries have InvalidOid
> as userid. Thus, with this patch there is a chance that userid in
> UserMapping being passed to GetConnection(), contains InvalidOid as userid.
> This is not the case today. The UserMapping structure constructed using
> GetUserMapping(Oid userid, Oid serverid) (which ultimately gets passed to
> GetConnection()), has the passed in userid and not the one in the catalog.
> Is this change intentional?

This point seems not to have been addressed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

20 января 2016 г., 02:28:23

On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Thanks Thom for bringing it to my notice quickly. Sorry for the same.
>
> Here are the patches.
>
> 1. pg_fdw_core_v2.patch: changes in core related to user mapping handling,
> GUC
>                       enable_foreignjoin

I tried to whittle this patch down to something that I'd be
comfortable committing and ended up with nothing left.

First, I removed the enable_foreignjoin GUC.  I still think an
FDW-specific GUC is better, and I haven't heard anybody make a strong
argument the other way. Your argument that this might be inconvenient
if somebody is using a bunch of join-pushdown-enabled FDWs sounds like
a strictly theoretical problem, considering how much difficulty we're
having getting even one FDW to support join pushdown.  And if it does
happen, the user can script it.  I'm willing to reconsider this point
if there is a massive chorus of support for having this be a core
option rather than an FDW option, but to me it seems that we've gone
to a lot of trouble to make the system extensible and might as well
get some benefit from it.

Second, I removed the documentation for GetForeignTable().  That
function is already documented and doesn't need re-documenting.

Third, I removed GetUserMappingById().  As mentioned in the email to
which I replied earlier, that doesn't actually produce the same result
as the way we're doing it now, and might leave the user ID invalid.
Even if that were no issue, it doesn't seem to add anything.  The only
caller of the new function is  postgresBeginForeignScan(), and that
function already has a way of getting the user mapping.  The new way
doesn't seem to be better or faster, so why bother changing it?

At this point, I was down to just the changes to store the user
mapping ID (umid) in the RelOptInfo, and to consider join pushdown
only if the user mapping IDs match.  One observation I made is that if
the code to initialize the FDW-related fields were lifted from
get_relation_info() up to build_simple_rel(), we would not need to use
planner_rt_fetch(), because the caller already has that information.
That seems like it might be worth considering.  But then I realized a
more fundamental problem: making the plan depend on the user ID is a
problem, because the user ID can be changed, and the plan might be
cached.  The same issue arises for RLS, but there is provision for
that in RevalidateCachedQuery.  This patch makes no similar provision.

I think there are two ways forward here.  One is to figure out a way
for the plancache to invalidate queries using FDW join pushdown when
the user ID changes.  The other is to recheck at execution time
whether the user mapping IDs still match, and if not, fall back to
using the "backup" plan that we need anyway for EvalPlanQual rechecks.
This would of course mean that the backup plan would need to be
something decently efficient, not just whatever we had nearest to
hand.  But that might not be too hard to manage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

20 января 2016 г., 16:50:37

On Wed, Jan 20, 2016 at 4:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Thanks Thom for bringing it to my notice quickly. Sorry for the same.
>
> Here are the patches.
>
> 1. pg_fdw_core_v2.patch: changes in core related to user mapping handling,
> GUC
> enable_foreignjoin

I tried to whittle this patch down to something that I'd be
comfortable committing and ended up with nothing left.

First, I removed the enable_foreignjoin GUC. I still think an
FDW-specific GUC is better, and I haven't heard anybody make a strong
argument the other way. Your argument that this might be inconvenient
if somebody is using a bunch of join-pushdown-enabled FDWs sounds like
a strictly theoretical problem, considering how much difficulty we're
having getting even one FDW to support join pushdown. And if it does
happen, the user can script it. I'm willing to reconsider this point
if there is a massive chorus of support for having this be a core
option rather than an FDW option, but to me it seems that we've gone
to a lot of trouble to make the system extensible and might as well
get some benefit from it.

Ok. Removed.

Second, I removed the documentation for GetForeignTable(). That
function is already documented and doesn't need re-documenting.

Removed.

Third, I removed GetUserMappingById(). As mentioned in the email to
which I replied earlier, that doesn't actually produce the same result
as the way we're doing it now, and might leave the user ID invalid.

The comment you quoted was my comment :). I never got a reply from Hanada-san on that comment. A bit of investigation revealed this: A pushed down foreign join which involves N foreign tables, might have different effective userid for each of them.
userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId()

In such a case, AFAIU, the join will be pushed down only if none of those have user mapping and there is public user mapping. Is that right? In such a case, which userid should be stored in UserMapping structure?It might look like setting GetUserId() as userid in UserMapping is wise, but not really. All the foreign tables might have different effective userids, each different from GetUserId() and what GetUserId() would return might have a user mapping different from the effective userids. What userid should UserMapping structure have in that case? I thought, that's why Hanada-san used invalid userid there, so left as it is.

But that has an undesirable effect of setting it to invalid, even in case of base relation or when all the joining relations have same effective userid. We may cache "any" of the effective userids of joining relations if the user mapping matches in build_join_rel() (we will need to add another field userid in RelOptInfo along with umid). While creating the plan use this userid to get user mapping. Does that sound good? This clubbed with the plan cache invalidation should be full proof. We need a way to get user mapping whether from umid (in which case public user mapping will have -1) or serverid and some userid.

Even if that were no issue, it doesn't seem to add anything. The only
caller of the new function is postgresBeginForeignScan(), and that
function already has a way of getting the user mapping. The new way
doesn't seem to be better or faster, so why bother changing it?

At this point, I was down to just the changes to store the user
mapping ID (umid) in the RelOptInfo, and to consider join pushdown
only if the user mapping IDs match. One observation I made is that if
the code to initialize the FDW-related fields were lifted from
get_relation_info() up to build_simple_rel(), we would not need to use
planner_rt_fetch(), because the caller already has that information.
That seems like it might be worth considering. But then I realized a
more fundamental problem: making the plan depend on the user ID is a
problem, because the user ID can be changed, and the plan might be
cached. The same issue arises for RLS, but there is provision for
that in RevalidateCachedQuery. This patch makes no similar provision.

Thanks for the catch. Please see attached patch for a quick fix in RevalidateCachedQuery(). Are you thinking on similar lines? However, I am not sure of planUserId - that field actually puzzles me. It's set when the first time we create a plan and it never changes then. This seems to be a problem, even for RLS, in following scene

prepare statement using RLS feature

execute statement -- now planUserId is set to say user1

set session role user2;

execute statement - RevalidateCachedQuery() detects that the user has changed and invalidates the plan and recreates it (including possibly the query analyze and rewrite). Note planUserId is unchanged here; it's still user1

reset role; -- now GetUserId() returns user1

execute statement - RevalidateCachedQuery() detects that the planUserId and GetUserId is same and doesn't invalidate the plan. Looks like it will execute wrong plan. I am not very familiar with RLS feature, so this analysis can be completely wrong.

If planUserId is not good for our purpose we can always have a different field there, say foreignJoinUserId.

There might be another issue here. A user mapping can change and the plan is cached. Public user mapping was used while planning but before the (cached) plan was executed again, the user mapping for one of the effective users involved was added with different credentials. We should expect the new user mapping to be used for fetching data from foreign tables and thus join becomes unsafe to be pushed down. If this issue exists we should add code to invalidate the plan cache, at least the plans which have pushed down joins.

I think there are two ways forward here. One is to figure out a way
for the plancache to invalidate queries using FDW join pushdown when
the user ID changes.

I think this is better since it provides avenue for join pushdown even in the changed conditions, thus giving better performance if permitted under the changed conditions.

The other is to recheck at execution time
whether the user mapping IDs still match, and if not, fall back to
using the "backup" plan that we need anyway for EvalPlanQual rechecks.
This would of course mean that the backup plan would need to be
something decently efficient, not just whatever we had nearest to
hand. But that might not be too hard to manage.

In this case, we will need to create the backup plan even in those cases where there is no EvalPlanQual involved. I am hesitant to do it that way, unless we are left with no other option.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

20 января 2016 г., 16:53:11

I missed the example plan cache revalidation patch in the previous mail. Sorry. Here it is.

On Wed, Jan 20, 2016 at 7:20 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

On Wed, Jan 20, 2016 at 4:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Thanks Thom for bringing it to my notice quickly. Sorry for the same.
>
> Here are the patches.
>
> 1. pg_fdw_core_v2.patch: changes in core related to user mapping handling,
> GUC
> enable_foreignjoin

I tried to whittle this patch down to something that I'd be
comfortable committing and ended up with nothing left.

First, I removed the enable_foreignjoin GUC. I still think an
FDW-specific GUC is better, and I haven't heard anybody make a strong
argument the other way. Your argument that this might be inconvenient
if somebody is using a bunch of join-pushdown-enabled FDWs sounds like
a strictly theoretical problem, considering how much difficulty we're
having getting even one FDW to support join pushdown. And if it does
happen, the user can script it. I'm willing to reconsider this point
if there is a massive chorus of support for having this be a core
option rather than an FDW option, but to me it seems that we've gone
to a lot of trouble to make the system extensible and might as well
get some benefit from it.

Ok. Removed.

Second, I removed the documentation for GetForeignTable(). That
function is already documented and doesn't need re-documenting.

Removed.

Third, I removed GetUserMappingById(). As mentioned in the email to
which I replied earlier, that doesn't actually produce the same result
as the way we're doing it now, and might leave the user ID invalid.

The comment you quoted was my comment :). I never got a reply from Hanada-san on that comment. A bit of investigation revealed this: A pushed down foreign join which involves N foreign tables, might have different effective userid for each of them.
userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId()
In such a case, AFAIU, the join will be pushed down only if none of those have user mapping and there is public user mapping. Is that right? In such a case, which userid should be stored in UserMapping structure?It might look like setting GetUserId() as userid in UserMapping is wise, but not really. All the foreign tables might have different effective userids, each different from GetUserId() and what GetUserId() would return might have a user mapping different from the effective userids. What userid should UserMapping structure have in that case? I thought, that's why Hanada-san used invalid userid there, so left as it is.

But that has an undesirable effect of setting it to invalid, even in case of base relation or when all the joining relations have same effective userid. We may cache "any" of the effective userids of joining relations if the user mapping matches in build_join_rel() (we will need to add another field userid in RelOptInfo along with umid). While creating the plan use this userid to get user mapping. Does that sound good? This clubbed with the plan cache invalidation should be full proof. We need a way to get user mapping whether from umid (in which case public user mapping will have -1) or serverid and some userid.

Even if that were no issue, it doesn't seem to add anything. The only
caller of the new function is postgresBeginForeignScan(), and that
function already has a way of getting the user mapping. The new way
doesn't seem to be better or faster, so why bother changing it?

At this point, I was down to just the changes to store the user
mapping ID (umid) in the RelOptInfo, and to consider join pushdown
only if the user mapping IDs match. One observation I made is that if
the code to initialize the FDW-related fields were lifted from
get_relation_info() up to build_simple_rel(), we would not need to use
planner_rt_fetch(), because the caller already has that information.
That seems like it might be worth considering. But then I realized a
more fundamental problem: making the plan depend on the user ID is a
problem, because the user ID can be changed, and the plan might be
cached. The same issue arises for RLS, but there is provision for
that in RevalidateCachedQuery. This patch makes no similar provision.

Thanks for the catch. Please see attached patch for a quick fix in RevalidateCachedQuery(). Are you thinking on similar lines? However, I am not sure of planUserId - that field actually puzzles me. It's set when the first time we create a plan and it never changes then. This seems to be a problem, even for RLS, in following scene

prepare statement using RLS feature
execute statement -- now planUserId is set to say user1
set session role user2;
execute statement - RevalidateCachedQuery() detects that the user has changed and invalidates the plan and recreates it (including possibly the query analyze and rewrite). Note planUserId is unchanged here; it's still user1
reset role; -- now GetUserId() returns user1
execute statement - RevalidateCachedQuery() detects that the planUserId and GetUserId is same and doesn't invalidate the plan. Looks like it will execute wrong plan. I am not very familiar with RLS feature, so this analysis can be completely wrong.

If planUserId is not good for our purpose we can always have a different field there, say foreignJoinUserId.

There might be another issue here. A user mapping can change and the plan is cached. Public user mapping was used while planning but before the (cached) plan was executed again, the user mapping for one of the effective users involved was added with different credentials. We should expect the new user mapping to be used for fetching data from foreign tables and thus join becomes unsafe to be pushed down. If this issue exists we should add code to invalidate the plan cache, at least the plans which have pushed down joins.

I think there are two ways forward here. One is to figure out a way
for the plancache to invalidate queries using FDW join pushdown when
the user ID changes.

I think this is better since it provides avenue for join pushdown even in the changed conditions, thus giving better performance if permitted under the changed conditions.

The other is to recheck at execution time
whether the user mapping IDs still match, and if not, fall back to
using the "backup" plan that we need anyway for EvalPlanQual rechecks.
This would of course mean that the backup plan would need to be
something decently efficient, not just whatever we had nearest to
hand. But that might not be too hard to manage.

In this case, we will need to create the backup plan even in those cases where there is no EvalPlanQual involved. I am hesitant to do it that way, unless we are left with no other option.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

revalidation.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

20 января 2016 г., 23:13:42

On Wed, Jan 20, 2016 at 8:53 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I missed the example plan cache revalidation patch in the previous mail.
> Sorry. Here it is.

I see.  Yeah, I don't see a reason offhand why that wouldn't work.
But you need to update src/backend/nodes for the new field in each
place where PlannedStmt or PlannerGlobal is mentioned.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

20 января 2016 г., 23:37:41

On Wed, Jan 20, 2016 at 8:50 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> Third, I removed GetUserMappingById().  As mentioned in the email to
>> which I replied earlier, that doesn't actually produce the same result
>> as the way we're doing it now, and might leave the user ID invalid.
>
> The comment you quoted was my comment :). I never got a reply from
> Hanada-san on that comment. A bit of investigation revealed this: A pushed
> down foreign join which involves N foreign tables, might have different
> effective userid for each of them.
> userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId()
> In such a case, AFAIU, the join will be pushed down only if none of those
> have user mapping and there is public user mapping. Is that right?

Yes, I think that is right.

> In such a
> case, which userid should be stored in UserMapping structure?It might look
> like setting GetUserId() as userid in UserMapping is wise, but not really.
> All the foreign tables might have different effective userids, each
> different from GetUserId() and what GetUserId() would return might have a
> user mapping different from the effective userids. What userid should
> UserMapping structure have in that case? I thought, that's why Hanada-san
> used invalid userid there, so left as it is.

Well, we kind of need to get it right, not just be guessing.

It looks to me as though GetConnection() only uses the user ID as a
cache lookup key.  What it's trying to do is ensure that if user A and
user B need different user mapping options, we don't use the same
connection for both.  So, actually, passing InvalidOid seems like it's
not really a problem here.  It's arguably more correct than what we've
been doing up until now, since it means we cache the connection under
user OID whose options we used, rather than the user OID that asked
about the options.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

21 января 2016 г., 00:33:19

On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> 2. pg_fdw_join_v2.patch: postgres_fdw changes for supporting join pushdown

The very first hunk of this patch contains annoying whitespace
changes.  Even if the result is what pgindent is going to do anyway,
such changes should be stripped out of patches for ease of review.  In
this case, though, I'm pretty sure it isn't what pgindent is going to
do, so it's just useless churn.  Please remove all such changes from
the patch.

find_var_pos() looks like it should just be inlined into its only caller.

Node *node = (Node *) var;
TargetListEntry *tle = tlist_member(node, context->outerlist);
if (tle)
{   side = OUTER_ALIAS;   pos = tle->resno;
}
else
{   side = INNER_ALIAS;   tle = tlist_member(node, context->innertlist);   pos = tle->resno;
}

Why are we calling the return value "pos" instead of "resno" as we
typically do elsewhere?

get_jointype_name() would probably be better written as a switch.  On
the flip side, deparseColumnRef() would have a lot less code churn if
it *weren't* written as a switch.

What this patch does to the naming and calling conventions in
deparse.c does not good.  Previously, we had deparseTargetList().
Now, we sometimes use that and sometimes deparseAttrsUsed() for almost
the same thing.  Previously, we had deparseColumnRef(); now we have
both that and deparseJoinVar() doing very similar things.  But in each
case, the function names are not parallel and the calling conventions
are totally different.  Like here:

+               if (context->foreignrel->reloptkind == RELOPT_JOINREL)
+                       deparseJoinVar(node, context);
+               else
+                       deparseColumnRef(buf, node->varno,
node->varattno, context->root);

We pass the buf separately to deparseColumnRef(), but for some reason
not to deparseJoinVar().  I wonder if these functions need to be two
separate things or if the work done by deparseJoinVar() should
actually be part of deparseColumnRef().  But even if it needs to be
separate, I wonder why we can't arrange things so that they get the
same arguments, more or less.

Generally, I think this patch is on the right track, but I think
there's a good bit of work to be done to make it clearer and more
understandable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

21 января 2016 г., 08:47:33

On Thu, Jan 21, 2016 at 2:07 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Well, we kind of need to get it right, not just be guessing.

It looks to me as though GetConnection() only uses the user ID as a
cache lookup key. What it's trying to do is ensure that if user A and
user B need different user mapping options, we don't use the same
connection for both. So, actually, passing InvalidOid seems like it's
not really a problem here. It's arguably more correct than what we've
been doing up until now, since it means we cache the connection under
user OID whose options we used, rather than the user OID that asked
about the options.

In that case, do we need GetUserMappingById changes in that patch and not pull it out. If we are keeping those changes, I need some clarification about your comment

Even if that were no issue, it doesn't seem to add anything. The only
caller of the new function is postgresBeginForeignScan(), and that
function already has a way of getting the user mapping. The new way
doesn't seem to be better or faster, so why bother changing it?

In pg_fdw_join_v2.patch, postgresBeginForeignScan() obtained user mapping using GetUserMappingById() instead of the earlier way of fetching it by userid and serverid. So even that change will remain, right?

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

21 января 2016 г., 16:36:29

On Thu, Jan 21, 2016 at 3:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jan 18, 2016 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> 2. pg_fdw_join_v2.patch: postgres_fdw changes for supporting join pushdown

The very first hunk of this patch contains annoying whitespace
changes. Even if the result is what pgindent is going to do anyway,
such changes should be stripped out of patches for ease of review. In
this case, though, I'm pretty sure it isn't what pgindent is going to
do, so it's just useless churn. Please remove all such changes from
the patch.

Done.

find_var_pos() looks like it should just be inlined into its only caller.

Node *node = (Node *) var;
TargetListEntry *tle = tlist_member(node, context->outerlist);
if (tle)
{
side = OUTER_ALIAS;
pos = tle->resno;
}
else
{
side = INNER_ALIAS;
tle = tlist_member(node, context->innertlist);
pos = tle->resno;
}

Why are we calling the return value "pos" instead of "resno" as we
typically do elsewhere?

I have rewritten deparseJoinVar as
/*
* Deparse given Var required for a joinrel into buf.
*/
static void
deparseJoinVar(Var *node, deparse_expr_cxt *context)
{
    char        *side;
    TargetEntry *tle;

    /* Lookup outer side */
    tle = tlist_member((Node *)node, context->outertlist);
    if (tle)
        side = OUTER_ALIAS;
    else
    {
        /* Not found on outer side; lookup inner */
        side = INNER_ALIAS;
        tle = tlist_member((Node *)node, context->innertlist);
    }

    /* The input var should be either on left or right side */
    Assert(tle && side);

    appendStringInfo(context->buf, "%s.%s%d", side, COL_ALIAS_PREFIX, tle->resno);
}

get_jointype_name() would probably be better written as a switch.

Done.

On
the flip side, deparseColumnRef() would have a lot less code churn if
it *weren't* written as a switch.

Done.

What this patch does to the naming and calling conventions in
deparse.c does not good. Previously, we had deparseTargetList().
Now, we sometimes use that and sometimes deparseAttrsUsed() for almost
the same thing.

Previously deparseTargetList deparsed the SELECT or RETURNING clause by including list of name of attributes provided by attrs_used. That's now done by deparseAttrsUsed(). In current path deparseTargetList() deparses the targetlist i.e. list of TargetEntry nodes (right now only Vars). Although these two functions return comma separated string of column names, their inputs are different. deparseAttrsUsed() can never be called for more than one base relation. deparseTargetList() on the other hand can deparse a targetlist with Var nodes from multiple relations. We need those two functionalities separately. We might convert attrs_used into a list of TargetEntry nodes using build_tlist_to_deparse() and use deparseTargetList everywhere. A side effect of that would be separating retrieved_attrs collection from deparsing code. I didn't do that change in this version to avoid large code changes. But I am fine doing that, if that makes code readable.

If we have to keep old deparseTargetList as is (and don't rename it as deparseAttrsUsed), we can rename the new deparseTargetList as something different may be deparseSelectList. I am fine with that too. But we need the later functionality, whatever its name be.

Previously, we had deparseColumnRef(); now we have
both that and deparseJoinVar() doing very similar things. But in each
case, the function names are not parallel and the calling conventions
are totally different. Like here:

+ if (context->foreignrel->reloptkind == RELOPT_JOINREL)
+ deparseJoinVar(node, context);
+ else
+ deparseColumnRef(buf, node->varno,
node->varattno, context->root);

We pass the buf separately to deparseColumnRef(), but for some reason
not to deparseJoinVar().I wonder if these functions need to be two
separate things or if the work done by deparseJoinVar() should
actually be part of deparseColumnRef(). But even if it needs to be
separate, I wonder why we can't arrange things so that they get the
same arguments, more or less.

deparseVar() is called for any Var node that's encountered. deparseJoinVar is called to deparse a Var from join relation, which is supposed to output INNER or OUTER var's alias as used in INNER or OUTER subqueries. It does not output the real column names. deparseColumnRef() however is the same old thing; it deparses column of given base relation. They are not similar things.

deparseJoinVar gets its buf from context, which we do not pass to deparseColumnRef(). Not all callers of deparseColumnRef have a deparse_expr_cxt with them. Also, outertlist and innertlist required by deparseJoinVar are passed through deparse_expr_cxt. It doesn't look worth to create a context just for the sake of making function definitions look similar. So, we need to have these two functions separate,

Generally, I think this patch is on the right track, but I think
there's a good bit of work to be done to make it clearer and more
understandable.

I agree that the code is complex for a reader. One of the reasons is recursive nature of deparsing. I will try to make it more cleaner and easier to understand. Would adding a call tree for deparsing routines help here? Or improving function prologues of even the existing functions?

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

21 января 2016 г., 16:48:45

> In such a
> case, which userid should be stored in UserMapping structure?It might look
> like setting GetUserId() as userid in UserMapping is wise, but not really.
> All the foreign tables might have different effective userids, each
> different from GetUserId() and what GetUserId() would return might have a
> user mapping different from the effective userids. What userid should
> UserMapping structure have in that case? I thought, that's why Hanada-san
> used invalid userid there, so left as it is.

Well, we kind of need to get it right, not just be guessing.

It looks to me as though GetConnection() only uses the user ID as a
cache lookup key. What it's trying to do is ensure that if user A and
user B need different user mapping options, we don't use the same
connection for both. So, actually, passing InvalidOid seems like it's
not really a problem here. It's arguably more correct than what we've
been doing up until now, since it means we cache the connection under
user OID whose options we used, rather than the user OID that asked
about the options.

That means that, right now, for two different local users which use public user mapping we will be creating two different connections to the foreign server with the same credentials. That doesn't look good. If we obtained user mapping using user mapping oid (which will have invalid user id for public user mapping) and used userid from that structure, we will get rid of this problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

25 января 2016 г., 18:52:18

On Thu, Jan 21, 2016 at 12:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> Well, we kind of need to get it right, not just be guessing.
>>
>> It looks to me as though GetConnection() only uses the user ID as a
>> cache lookup key.  What it's trying to do is ensure that if user A and
>> user B need different user mapping options, we don't use the same
>> connection for both.  So, actually, passing InvalidOid seems like it's
>> not really a problem here.  It's arguably more correct than what we've
>> been doing up until now, since it means we cache the connection under
>> user OID whose options we used, rather than the user OID that asked
>> about the options.
>
> In that case, do we need GetUserMappingById changes in that patch and not
> pull it out. If we are keeping those changes, I need some clarification
> about your comment
>
>> Even if that were no issue, it doesn't seem to add anything.  The only
>> caller of the new function is  postgresBeginForeignScan(), and that
>> function already has a way of getting the user mapping.  The new way
>> doesn't seem to be better or faster, so why bother changing it?
>
> In pg_fdw_join_v2.patch, postgresBeginForeignScan() obtained user mapping
> using GetUserMappingById() instead of the earlier way of fetching it by
> userid and serverid. So even that change will remain, right?

In light of the investigation which led to the first comment you
quoted, I withdraw the second one (which I think I actually made
chronologically prior to the first one).  I'm not exactly sure what
needs to happen here, but it seems to me that maybe the connection
cache should be changed to be keyed by user mapping OID.  That seems
like it would address the problem you mention here:

http://www.postgresql.org/message-id/CAFjFpRf-LiD5bai4D6cSUseJh=xxJqipo_vN8mTnZG16TMWJ-w@mail.gmail.com

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

25 января 2016 г., 20:13:59

On Thu, Jan 21, 2016 at 8:36 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> What this patch does to the naming and calling conventions in
>> deparse.c does not good.  Previously, we had deparseTargetList().
>> Now, we sometimes use that and sometimes deparseAttrsUsed() for almost
>> the same thing.
>
> Previously deparseTargetList deparsed the SELECT or RETURNING clause by
> including list of name of attributes provided by attrs_used. That's now done
> by deparseAttrsUsed(). In current path deparseTargetList() deparses the
> targetlist i.e. list of TargetEntry nodes (right now only Vars). Although
> these two functions return comma separated string of column names, their
> inputs are different. deparseAttrsUsed() can never be called for more than
> one base relation. deparseTargetList() on the other hand can deparse a
> targetlist with Var nodes from multiple relations. We need those two
> functionalities separately. We might convert attrs_used into a list of
> TargetEntry nodes using build_tlist_to_deparse() and use deparseTargetList
> everywhere. A side effect of that would be separating retrieved_attrs
> collection from deparsing code. I didn't do that change in this version to
> avoid large code changes. But I am fine doing that, if that makes code
> readable.
>
> If we have to keep old deparseTargetList as is (and don't rename it as
> deparseAttrsUsed), we can rename the new deparseTargetList as something
> different may be deparseSelectList. I am fine with that too. But we need the
> later functionality, whatever its name be.

I'm not arguing that we don't need the functionality.  I'm arguing
that if we've got a set of existing functions that are named one way,
we shouldn't get a whole bunch of new functions that invent an
entirely new naming convention.  I'm not sure exactly how to clean
this up, but I think we need to find a way.

>> Previously, we had deparseColumnRef(); now we have
>> both that and deparseJoinVar() doing very similar things.  But in each
>> case, the function names are not parallel and the calling conventions
>> are totally different.  Like here:
>>
>> +               if (context->foreignrel->reloptkind == RELOPT_JOINREL)
>> +                       deparseJoinVar(node, context);
>> +               else
>> +                       deparseColumnRef(buf, node->varno,
>> node->varattno, context->root);
>>
>> We pass the buf separately to deparseColumnRef(), but for some reason
>> not to deparseJoinVar().I wonder if these functions need to be two
>> separate things or if the work done by deparseJoinVar() should
>> actually be part of deparseColumnRef().  But even if it needs to be
>> separate, I wonder why we can't arrange things so that they get the
>> same arguments, more or less.
>
> deparseVar() is called for any Var node that's encountered. deparseJoinVar
> is called to deparse a Var from join relation, which is supposed to output
> INNER or OUTER var's alias as used in INNER or OUTER subqueries. It does not
> output the real column names. deparseColumnRef() however is the same old
> thing; it deparses column of given base relation. They are not similar
> things.

deparseColumnRef() emits things like "foo" meaning column foo, or
"foo.bar" meaning column bar of table foo.  deparseJoinVar() emits
things like "r.a7", referring to a column called "a7" in a relation
called "r".  I feel that those *are* similar things.

I also wonder whether they couldn't be made more similar.  It seems to
me this patch is going to realias things potentially multiple times
for its own convenience.  That's not a catastrophe, but it's not
great, either, because it produces queries that are not necessarily
very human readable.  It would be nicer to get
actual_table_name.actual_column_name in more places and r.a7 in fewer.

> I agree that the code is complex for a reader. One of the reasons is
> recursive nature of deparsing. I will try to make it more cleaner and easier
> to understand. Would adding a call tree for deparsing routines help here? Or
> improving function prologues of even the existing functions?

I don't think so.  A README might help, but honestly I think some of
these APIs really just need to be revised.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

28 января 2016 г., 19:27:08

Hi All,

Here's an updated version of the previous patches, broken up like before

1. pg_fdw_core_v3.patch: changes in core - more description below

2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

3. pg_join_pd_v3.patch: combined patch for easy testing

Here is the summary of changes from the last set of patches

1. Removed GUC enable_foreignjoin as well as extra documentation for GetForeignTable().

2. Included fix for EvalPlanQual in postgres_fdw - an alternate local path is obtained from the list of paths linked to the joinrel. Since this is done before adding the ForeignPath, we should be a local path available for given join.

3. Moved code to obtain user mapping id for given relation from get_relation_info() to build_simple_rel() to avoid an extra call to planner_rt_fetch().

4. Plan cache invalidation logic when the user which tries to execute a cached plan is different from a user which created the plan. Also, plan cache invalidation logic in case there are changes to user mapping system cache. An example case is Public user mapping was used while planning but before the (cached) plan was executed again, the user mapping for one of the effective users involved was added with different credentials. We should expect the new user mapping to be used for fetching data from corresponding foreign table and thus join becomes unsafe to be pushed down. Added tests in postgres_fdw.sql for these two issues.

5. removed find_var_pos() and instead inlined the logic into its caller as suggested by Robert. get_jointype_name() uses switch.

6. The functions in deparse.c name the functions depending upon the object that parser will create on parsing the output produced by those functions. E.g. deparseColumnRef() produced column names which when parsed would produce ColumnRef object. In this patch, the new functions added use similar convention. Now there are multiple functions which would produce output which when parsed would create same object. In order to have different names for such functions, the names also include the purpose of deparsing or kind of input etc. deparseJoinVar in the previous patch is now deparseColumnRefForJoinrel() since it produces column references for a join relation. There is corresponding deparseColumnRefForBaserel(). I have not changed the name of existing deparseColumnRef, which deparses the non-system column names of a foreign table. There are many changes to comments making them more readable and also some modularization. Let me know if the new names make sense.

On Mon, Jan 25, 2016 at 10:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 21, 2016 at 8:36 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> What this patch does to the naming and calling conventions in
>> deparse.c does not good. Previously, we had deparseTargetList().
>> Now, we sometimes use that and sometimes deparseAttrsUsed() for almost
>> the same thing.
>
> Previously deparseTargetList deparsed the SELECT or RETURNING clause by
> including list of name of attributes provided by attrs_used. That's now done
> by deparseAttrsUsed(). In current path deparseTargetList() deparses the
> targetlist i.e. list of TargetEntry nodes (right now only Vars). Although
> these two functions return comma separated string of column names, their
> inputs are different. deparseAttrsUsed() can never be called for more than
> one base relation. deparseTargetList() on the other hand can deparse a
> targetlist with Var nodes from multiple relations. We need those two
> functionalities separately. We might convert attrs_used into a list of
> TargetEntry nodes using build_tlist_to_deparse() and use deparseTargetList
> everywhere. A side effect of that would be separating retrieved_attrs
> collection from deparsing code. I didn't do that change in this version to
> avoid large code changes. But I am fine doing that, if that makes code
> readable.
>
> If we have to keep old deparseTargetList as is (and don't rename it as
> deparseAttrsUsed), we can rename the new deparseTargetList as something
> different may be deparseSelectList. I am fine with that too. But we need the
> later functionality, whatever its name be.

I'm not arguing that we don't need the functionality. I'm arguing
that if we've got a set of existing functions that are named one way,
we shouldn't get a whole bunch of new functions that invent an
entirely new naming convention. I'm not sure exactly how to clean
this up, but I think we need to find a way.

>> Previously, we had deparseColumnRef(); now we have
>> both that and deparseJoinVar() doing very similar things. But in each
>> case, the function names are not parallel and the calling conventions
>> are totally different. Like here:
>>
>> + if (context->foreignrel->reloptkind == RELOPT_JOINREL)
>> + deparseJoinVar(node, context);
>> + else
>> + deparseColumnRef(buf, node->varno,
>> node->varattno, context->root);
>>
>> We pass the buf separately to deparseColumnRef(), but for some reason
>> not to deparseJoinVar().I wonder if these functions need to be two
>> separate things or if the work done by deparseJoinVar() should
>> actually be part of deparseColumnRef(). But even if it needs to be
>> separate, I wonder why we can't arrange things so that they get the
>> same arguments, more or less.
>
> deparseVar() is called for any Var node that's encountered. deparseJoinVar
> is called to deparse a Var from join relation, which is supposed to output
> INNER or OUTER var's alias as used in INNER or OUTER subqueries. It does not
> output the real column names. deparseColumnRef() however is the same old
> thing; it deparses column of given base relation. They are not similar
> things.

deparseColumnRef() emits things like "foo" meaning column foo, or
"foo.bar" meaning column bar of table foo. deparseJoinVar() emits
things like "r.a7", referring to a column called "a7" in a relation
called "r". I feel that those *are* similar things.

I also wonder whether they couldn't be made more similar. It seems to
me this patch is going to realias things potentially multiple times
for its own convenience. That's not a catastrophe, but it's not
great, either, because it produces queries that are not necessarily
very human readable. It would be nicer to get
actual_table_name.actual_column_name in more places and r.a7 in fewer.

> I agree that the code is complex for a reader. One of the reasons is
> recursive nature of deparsing. I will try to make it more cleaner and easier
> to understand. Would adding a call tree for deparsing routines help here? Or
> improving function prologues of even the existing functions?

I don't think so. A README might help, but honestly I think some of
these APIs really just need to be revised.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

28 января 2016 г., 22:09:11

On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> 1. pg_fdw_core_v3.patch: changes in core - more description below

I've committed most of this patch, with some modifications.  In
particular, I moved CachedPlanSource's hasForeignJoin flag to the
CachedPlan and renamed it has_foreign_join, consistent with the naming
of other members.

The GetUserMappingById() function seemed like a separate thing, so I
left that out of this commit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

29 января 2016 г., 07:21:37

On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

This patch no longer quite applies because of conflicts with one of
your other patches that I applied today (cf. commit
fbe5a3fb73102c2cfec11aaaa4a67943f4474383).

And then I broke it some more by committing a patch to extract
deparseLockingClause from postgresGetForeignPlan and move it to
deparse.c, but that should pretty directly reduce the size of this
patch.  I wonder if there are any other bits of refactoring of that
sort that we can do in advance of landing the main patch, just to
simplify review.  But I'm not sure there are: this patch removes very
little existing code; it just adds a ton of new stuff.

I think the names deparseColumnRefForJoinrel and
deparseColumnRefForBaserel are better than the previous names, but I
would capitalize the last "r", so "Rel" instead of "rel".  But it's
weird that we have those functions and also just plain old
deparseColumnRef, which is logically part of
deparseColumnRefForBaserel but inexplicably remains a separate
function.  I still don't see why you can't just add a bunch of new
logic to the existing deparseColumnRef, change the last argument to be
of type deparse_expr_cxt instead of PlannerInfo, and have that one
function simply handle more cases than it does currently.

It seems unlikely to me that postgresGetForeignPlan really needs to
call list_free_deep(fdw_scan_tlist).  Can't we just let memory context
reset clean that up?

In postgresGetForeignPlan (and I think some other functions), you
renamed the argument from baserel to foreignrel.  But I think it would
be better to just go with "rel".  We do that elsewhere in various
places, and it seems fine here too.  And it's shorter.

copy_path_for_epq_recheck() and friends are really ugly.  Should we
consider just adding copyObject() support for those node types
instead?

The error message quality in conversion_error_callback() looks
unacceptably poor.  The column number we're proposing to output will
be utterly meaningless to the user.  It would ideally be desirable to
output the base table name and the column name from that table.

I'm sure there's more -- this is a huge patch and I don't fully
understand it yet -- but I'm out of energy for tonight.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

29 января 2016 г., 11:37:59

On 2016/01/29 1:26, Ashutosh Bapat wrote:
> Here's an updated version of the previous patches, broken up like before

> 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

> Here is the summary of changes from the last set of patches

> 2. Included fix for EvalPlanQual in postgres_fdw - an alternate local
> path is obtained from the list of paths linked to the joinrel. Since
> this is done before adding the ForeignPath, we should be a local path
> available for given join.

I looked at that code in the patch (ie, postgresRecheckForeignScan and 
the helper function that creates a local join path for a given foreign 
join path.), briefly.  Maybe I'm missing something, but I think that is 
basically the same as the fix I proposed for addressing this issue, 
posted before [1], right?  If so, my concern is, the helper function 
probably wouldn't extend to the parameterized-foreign-join-path cases, 
though that would work well for the unparameterized-foreign-join-path 
cases.  We don't support parameterized-foreign-join paths for 9.6?

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/5666B59F.6010701@lab.ntt.co.jp

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

29 января 2016 г., 11:46:55

On Fri, Jan 29, 2016 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

This patch no longer quite applies because of conflicts with one of
your other patches that I applied today (cf. commit
fbe5a3fb73102c2cfec11aaaa4a67943f4474383).

And then I broke it some more by committing a patch to extract
deparseLockingClause from postgresGetForeignPlan and move it to
deparse.c, but that should pretty directly reduce the size of this
patch. I wonder if there are any other bits of refactoring of that
sort that we can do in advance of landing the main patch, just to
simplify review. But I'm not sure there are: this patch removes very
little existing code; it just adds a ton of new stuff.

PFA patch to move code to deparse SELECT statement into a function deparseSelectStmtForRel(). This code is duplicated in estimate_path_cost_size() and postgresGetForeignPlan(), so moving it into a function avoids that duplication. As a side note, estimate_path_cost_size() doesn't add FOR SHARE/UPDATE clause to the statement being EXPLAINed, even if the actual statement during execution would have it. This difference looks unintentional to me. This patch corrects it as well. appendOrderByClause and appendWhereClause both create a context within themselves and pass it to deparseExpr. This patch creates the context once in deparseSelectStmtForRel() and then passes it to the other deparse functions avoiding repeated context creation.

copy_path_for_epq_recheck() and friends are really ugly. Should we
consider just adding copyObject() support for those node types
instead?

the note in copyfuncs.c says
* We also do not support copying Path trees, mainly
* because the circular linkages between RelOptInfo and Path nodes can't
* be handled easily in a simple depth-first traversal.

We may avoid that by just copying the pointer to RelOptInfo and not copy the whole RelOptInfo. The other problem is paths in epq_paths will be copied as many times as the number of 2-way joins pushed down. Let me give it a try and produce patch with that.

I'm sure there's more -- this is a huge patch and I don't fully
understand it yet -- but I'm out of energy for tonight.

Thanks a lot for your comments and moving this patch forward.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

pg_fdw_deparse_select.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

29 января 2016 г., 11:53:03

On Fri, Jan 29, 2016 at 2:05 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

On 2016/01/29 1:26, Ashutosh Bapat wrote:
Here's an updated version of the previous patches, broken up like before

2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

Here is the summary of changes from the last set of patches

2. Included fix for EvalPlanQual in postgres_fdw - an alternate local
path is obtained from the list of paths linked to the joinrel. Since
this is done before adding the ForeignPath, we should be a local path
available for given join.

I looked at that code in the patch (ie, postgresRecheckForeignScan and the helper function that creates a local join path for a given foreign join path.), briefly. Maybe I'm missing something, but I think that is basically the same as the fix I proposed for addressing this issue, posted before [1], right?

Yes, although I have added functions to copy the paths, not consider pathkeys and change the relevant members of the paths. Sorry if I have missed giving you due credits.

If so, my concern is, the helper function probably wouldn't extend to the parameterized-foreign-join-path cases, though that would work well for the unparameterized-foreign-join-path cases. We don't support parameterized-foreign-join paths for 9.6?

If we do not find a local path with given parameterization, it means there are other local parameterized paths which are superior to it. This possibly indicates that there will be foreign join parameterised paths which are superior to this parameterized path, so we basically do not create foreign join path with that parameterization.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

30 января 2016 г., 17:28:23

PFA patches

pg_fdw_core_v4.patch GetUserMappingById changes

pg_fdw_join_v4.patch: postgres_fdw changes for join pushdown including suggestions as described below

pg_join_pd_v4.patch: combined patch for ease of testing.

On Fri, Jan 29, 2016 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

This patch no longer quite applies because of conflicts with one of
your other patches that I applied today (cf. commit
fbe5a3fb73102c2cfec11aaaa4a67943f4474383).

And then I broke it some more by committing a patch to extract
deparseLockingClause from postgresGetForeignPlan and move it to
deparse.c, but that should pretty directly reduce the size of this
patch. I wonder if there are any other bits of refactoring of that
sort that we can do in advance of landing the main patch, just to
simplify review. But I'm not sure there are: this patch removes very
little existing code; it just adds a ton of new stuff.

The patches are rebased. A separate patch to move select statement deparsing code into a separate function has been submitted in a separate mail.

I think the names deparseColumnRefForJoinrel and
deparseColumnRefForBaserel are better than the previous names, but I
would capitalize the last "r", so "Rel" instead of "rel".

Done.

But it's
weird that we have those functions and also just plain old
deparseColumnRef, which is logically part of
deparseColumnRefForBaserel but inexplicably remains a separate
function. I still don't see why you can't just add a bunch of new
logic to the existing deparseColumnRef, change the last argument to be
of type deparse_expr_cxt instead of PlannerInfo, and have that one
function simply handle more cases than it does currently.

1. There are existing callers of deparseColumnRef() like deparseInsertSql() which will need few lines added to create deparse context. 2. These callers pass relid and attribute number as arguments as against deparseColumnRefForJoinRel, which needs a Var node as input (to be searched in inner and outer targetlists. You seemed to object to different signature of deparseColumnRefForJoinRel and deparseColumnRefForBaseRel. So, I left that change in this patch. I am fine with that change as well, if 1 and 2 are acceptable.

It seems unlikely to me that postgresGetForeignPlan really needs to
call list_free_deep(fdw_scan_tlist). Can't we just let memory context
reset clean that up?

Done.

In postgresGetForeignPlan (and I think some other functions), you
renamed the argument from baserel to foreignrel. But I think it would
be better to just go with "rel". We do that elsewhere in various
places, and it seems fine here too. And it's shorter.

We are using rel as variable name for Relation variable name and we need both of them RelOptInfo and Relation atlest in deparseSelectStmtForRel(). So, used a different name foreignrel.

copy_path_for_epq_recheck() and friends are really ugly. Should we
consider just adding copyObject() support for those node types
instead?

Done. I had pessimistically code to copy the paths deeply, but that's not required. We need to copy the path in case it gets freed by the planner while ejecting it. A flat copy suffices.

The error message quality in conversion_error_callback() looks
unacceptably poor. The column number we're proposing to output will
be utterly meaningless to the user. It would ideally be desirable to
output the base table name and the column name from that table.

Done. In conversion_error_callback(), fdw_scan_tlist and Estate are used to obtain the name of the table and column.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

30 января 2016 г., 18:33:09

On Fri, Jan 29, 2016 at 3:46 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> PFA patch to move code to deparse SELECT statement into a function
> deparseSelectStmtForRel(). This code is duplicated in
> estimate_path_cost_size() and postgresGetForeignPlan(), so moving it into a
> function avoids that duplication. As a side note, estimate_path_cost_size()
> doesn't add FOR SHARE/UPDATE clause to the statement being EXPLAINed, even
> if the actual statement during execution would have it. This difference
> looks unintentional to me. This patch corrects it as well.
> appendOrderByClause and appendWhereClause both create a context within
> themselves and pass it to deparseExpr. This patch creates the context once
> in deparseSelectStmtForRel() and then passes it to the other deparse
> functions avoiding repeated context creation.

Right, OK.  I think this is good, so, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

01 февраля 2016 г., 16:27:50

Here are patches rebased on recent commit cc592c48c58d9c1920f8e2063756dcbcce79e4dd. Renamed original deparseSelectSql as deparseSelectSqlForBaseRel and added deparseSelectSqlForJoinRel to construct SELECT and FROM clauses for base and join relations.

pg_fdw_core_v5.patch GetUserMappingById changes
pg_fdw_join_v5.patch: postgres_fdw changes for join pushdown including suggestions as described below
pg_join_pd_v5.patch: combined patch for ease of testing.

The patches also have following changes along with the changes described in my last mail.

1. Revised the way targetlists are handled. For a bare base relation the SELECT clause is deparsed from fpinfo->attrs_used but for a base relation which is part of join relation, the expected targetlist is passed down to deparseSelectSqlForBaseRel(). This change removed 75 odd lines in build_tlist_to_deparse() which were very similar to deparseTargetListFromAttrsUsed() in the previous patch.

2. Refactored postgresGetForeignJoinPaths to be more readable moving the code to assess safety of join pushdown into a separate function.

On Sat, Jan 30, 2016 at 7:58 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

PFA patches
pg_fdw_core_v4.patch GetUserMappingById changes
pg_fdw_join_v4.patch: postgres_fdw changes for join pushdown including suggestions as described below
pg_join_pd_v4.patch: combined patch for ease of testing.

On Fri, Jan 29, 2016 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Jan 28, 2016 at 11:26 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> 2. pg_fdw_join_v3.patch: changes to postgres_fdw - more description below

This patch no longer quite applies because of conflicts with one of
your other patches that I applied today (cf. commit
fbe5a3fb73102c2cfec11aaaa4a67943f4474383).

And then I broke it some more by committing a patch to extract
deparseLockingClause from postgresGetForeignPlan and move it to
deparse.c, but that should pretty directly reduce the size of this
patch. I wonder if there are any other bits of refactoring of that
sort that we can do in advance of landing the main patch, just to
simplify review. But I'm not sure there are: this patch removes very
little existing code; it just adds a ton of new stuff.

The patches are rebased. A separate patch to move select statement deparsing code into a separate function has been submitted in a separate mail.

I think the names deparseColumnRefForJoinrel and
deparseColumnRefForBaserel are better than the previous names, but I
would capitalize the last "r", so "Rel" instead of "rel".

Done.

But it's
weird that we have those functions and also just plain old
deparseColumnRef, which is logically part of
deparseColumnRefForBaserel but inexplicably remains a separate
function. I still don't see why you can't just add a bunch of new
logic to the existing deparseColumnRef, change the last argument to be
of type deparse_expr_cxt instead of PlannerInfo, and have that one
function simply handle more cases than it does currently.

1. There are existing callers of deparseColumnRef() like deparseInsertSql() which will need few lines added to create deparse context. 2. These callers pass relid and attribute number as arguments as against deparseColumnRefForJoinRel, which needs a Var node as input (to be searched in inner and outer targetlists. You seemed to object to different signature of deparseColumnRefForJoinRel and deparseColumnRefForBaseRel. So, I left that change in this patch. I am fine with that change as well, if 1 and 2 are acceptable.

It seems unlikely to me that postgresGetForeignPlan really needs to
call list_free_deep(fdw_scan_tlist). Can't we just let memory context
reset clean that up?

Done.

In postgresGetForeignPlan (and I think some other functions), you
renamed the argument from baserel to foreignrel. But I think it would
be better to just go with "rel". We do that elsewhere in various
places, and it seems fine here too. And it's shorter.

We are using rel as variable name for Relation variable name and we need both of them RelOptInfo and Relation atlest in deparseSelectStmtForRel(). So, used a different name foreignrel.

copy_path_for_epq_recheck() and friends are really ugly. Should we
consider just adding copyObject() support for those node types
instead?

Done. I had pessimistically code to copy the paths deeply, but that's not required. We need to copy the path in case it gets freed by the planner while ejecting it. A flat copy suffices.

The error message quality in conversion_error_callback() looks
unacceptably poor. The column number we're proposing to output will
be utterly meaningless to the user. It would ideally be desirable to
output the base table name and the column name from that table.

Done. In conversion_error_callback(), fdw_scan_tlist and Estate are used to obtain the name of the table and column.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

02 февраля 2016 г., 02:49:04

On Mon, Feb 1, 2016 at 8:27 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Here are patches rebased on recent commit
> cc592c48c58d9c1920f8e2063756dcbcce79e4dd. Renamed original deparseSelectSql
> as deparseSelectSqlForBaseRel and added deparseSelectSqlForJoinRel to
> construct SELECT and FROM clauses for base and join relations.
>
> pg_fdw_core_v5.patch GetUserMappingById changes
> pg_fdw_join_v5.patch: postgres_fdw changes for join pushdown including
> suggestions as described below
> pg_join_pd_v5.patch: combined patch for ease of testing.
>
> The patches also have following changes along with the changes described in
> my last mail.
> 1. Revised the way targetlists are handled. For a bare base relation the
> SELECT clause is deparsed from fpinfo->attrs_used but for a base relation
> which is part of join relation, the expected targetlist is passed down to
> deparseSelectSqlForBaseRel(). This change removed 75 odd lines in
> build_tlist_to_deparse() which were very similar to
> deparseTargetListFromAttrsUsed() in the previous patch.

Nice!

> 2. Refactored postgresGetForeignJoinPaths to be more readable moving the
> code to assess safety of join pushdown into a separate function.

That looks good.  But maybe call the function foreign_join_ok() or
something like that?  is_foreign_join() isn't terrible but it sounds a
little odd to me.

The path-copying stuff in get_path_for_epq_recheck() looks a lot
better now, but you neglected to add a comment explaining why you did
it that way (e.g. "Make a shallow copy of the join path, because the
planner might free the original structure after a future add_path().
We don't need to copy the substructure, though; that won't get freed."I would forget about setting
merge_path->materialize_inner= false;

that doesn't seem essential.  Also, I would arrange things so that if
you hit an unrecognized path type (like a custom join, or a gather)
you skip that particular path instead of erroring out.  I think this
whole function should be moved to core, and I think the argument
should be a RelOptInfo * rather than a List *.

+     * We can't know VERBOSE option is specified or not, so always add shcema

We can't know "whether" VERBOSE...
shcema -> schema

+     * the join relaiton is already considered, so that we won't waste time in

Typo.

+     * judging safety of join pushdow and adding the same paths again if found

Typo.

More when I have a bit more time to look at this...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

02 февраля 2016 г., 07:43:09

On Mon, Feb 1, 2016 at 6:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> More when I have a bit more time to look at this...

Why does deparseSelectStmtForRel change the order of the existing
arguments?  I have no issue with adding new arguments as required, but
rearranging the existing argument order doesn't serve any useful
purpose that is immediately apparent.  There's also no real need for
the rel -> foreignrel renaming.

+    /*
+     * If we have constructed the SELECT clause from the targetlist, construct
+     * the retrieved attributes list as continuously increasing list of
+     * integers.
+     */
+    if (retrieved_attrs && tlist)
+    {
+        int i;
+        *retrieved_attrs = NIL;
+        for (i = 1; i <= list_length(tlist); i++)
+            *retrieved_attrs = lappend_int(*retrieved_attrs, i);
+    }

This is really wonky.  First, you pass retrieved_attrs to
deparseSelectSqlForBaseRel, but then you have this code which blows it
away and replaces it if tlist != NIL.  So I guess this will run always
for a join relation, and for a base relation only sometimes.  But
that's certainly not at all clear.  I think you need to find some way
of rearranging this so that it's more obvious what is going on here.

I suggest not renaming the existing deparseTargetList() and instead
coming up with a different name for the new thing you need, maybe
deparseExplicitTargetList().

How about adding another sentence to the header comment for
appendConditions() saying something like "This is used for both WHERE
clauses and for JOIN .. ON"?

+ * The targetlist is list of TargetEntry's which in turn contains Var nodes.

contain.

+/*
+ * Output join name for given join type */

Formatting.  This patch, overall, is badly in need of a pgindent run.

+    /*
+     * XXX
+     * Since the query is being built in recursive manner from bottom up,
+     * the FOR UPDATE/SHARE clause referring the base relations can
not be added
+     * at the top level. They need to be added to the subqueries corresponding
+     * to the base relations. This has an undesirable effect of locking more
+     * rows than specified by user, as it locks even those rows from base
+     * relations which are not part of the final join result. To avoid this
+     * undesirable effect, we need to build the join query without the
+     * subqueries, which for now, seems hard.
+     */

This is a second good reason that we should actually do that work
instead of complaining that it's too hard.  The first one is that the
queries that come out of this right now are too long and hard to read.
I actually don't see why this is all that hard.  Deparsing the target
list is simple enough; you just need to emit tab.col using varno to
determine what tab looks like and varattno to determine what col looks
like.  The trickier part is emitting the FROM clause. But this doesn't
seem all that hard either.  Suppose that when we construct the fpinfo
(PgFdwRelationInfo *) for a relation, we include in it a FROM clause
appropriate to that rel.  So, for a baserel, it's something like "foo
r4" where 4 is foo's RTI.  For a joinrel, do this:

1. Emit the FROM clause constructed for the outer relation,
surrounding it with parentheses if the outer relation is a joinrel.
2. Emit " JOIN ", " LEFT JOIN ", " RIGHT JOIN ", or " FULL JOIN "
according to the join type.
3. Emit the FROM clause constructed for the inner relation,
surrounding it with parentheses if the inner relation is a joinrel.
4. Emit " ON ".
5. Emit the joinqual.

This will produce nice things like (foo r3 JOIN bar r4 ON r3.x = r4.x)
JOIN baz r2 ON r3.y = r2.y

Then, you'd also need to stuff the conditions into the
PgFdwRelationInfo so that those could be added to paths constructed at
higher levels.  But that's not too hard either.  Basically you'd end
up with mostly the same stuff you have now, but the PgFdwRelationInfo
would store a join clause and a set of deparsed quals to be included
in the FROM and WHERE clauses respectively.  And then you'd pull the
information from the inner and outer sides to build up what you need
at the joinrel level.

This would actually be faster than what you've got right now, because
right now you're recursing down the whole join tree all over again at
each new join level, maybe not the best plan.

+    if (foreignrel->reloptkind == RELOPT_JOINREL)
+        return;
+    /* Add any necessary FOR UPDATE/SHARE. */    deparseLockingClause(&context);

Generally, I think in these kinds of cases it is better to reverse the
test and get rid of the return statement, like this:

if (foreignrel->reloptkind != RELOPT_JOINREL)   deparseLockingClause(&context);

The way you wrote it, somebody who wants to add more code to the end
of the function will probably have to make that change anyhow.
Really, your intention here was to skip that code for joins, not to
skip the rest of the function for baserels.

@@ -1424,9 +1875,7 @@ deparseVar(Var *node, deparse_expr_cxt *context)            printRemoteParam(pindex,
node->vartype,node->vartypmod, context);        }        else
 
-        {            printRemotePlaceholder(node->vartype, node->vartypmod, context);
-        }    }}

Useless hunk.

+     * Constructing queries representating SEMI and ANTI joins is hard, hence

Typo.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

02 февраля 2016 г., 19:18:16

On Tue, Feb 2, 2016 at 5:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Feb 1, 2016 at 8:27 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Here are patches rebased on recent commit
> cc592c48c58d9c1920f8e2063756dcbcce79e4dd. Renamed original deparseSelectSql
> as deparseSelectSqlForBaseRel and added deparseSelectSqlForJoinRel to
> construct SELECT and FROM clauses for base and join relations.
>
> pg_fdw_core_v5.patch GetUserMappingById changes
> pg_fdw_join_v5.patch: postgres_fdw changes for join pushdown including
> suggestions as described below
> pg_join_pd_v5.patch: combined patch for ease of testing.
>
> The patches also have following changes along with the changes described in
> my last mail.
> 1. Revised the way targetlists are handled. For a bare base relation the
> SELECT clause is deparsed from fpinfo->attrs_used but for a base relation
> which is part of join relation, the expected targetlist is passed down to
> deparseSelectSqlForBaseRel(). This change removed 75 odd lines in
> build_tlist_to_deparse() which were very similar to
> deparseTargetListFromAttrsUsed() in the previous patch.

Nice!

> 2. Refactored postgresGetForeignJoinPaths to be more readable moving the
> code to assess safety of join pushdown into a separate function.

That looks good. But maybe call the function foreign_join_ok() or
something like that? is_foreign_join() isn't terrible but it sounds a
little odd to me.

I used name is_foreign_join(), which assesses push-down safety of a join, to have similar naming convention with is_foreign_expr(), which checks push-down safety of an expression. But foreign_join_ok() is fine too. Used that.

The path-copying stuff in get_path_for_epq_recheck() looks a lot
better now, but you neglected to add a comment explaining why you did
it that way (e.g. "Make a shallow copy of the join path, because the
planner might free the original structure after a future add_path().
We don't need to copy the substructure, though; that won't get freed."

I alluded to that in the second sentence of comment
3259 * Since we will need to replace any foreign paths for join with their alternate
3260 * paths, we need make a copy of the local path chosen. Also, that helps in case
3261 * the planner chooses to throw away the local path.

But that wasn't as clear as your wording. Rewrote the paragraph using your wording.
3259 * Since we will need to replace any foreign paths for join with their alternate
3260 * paths, we need make a copy of the local path chosen. Make a shallow copy of
3261 * the join path, because the planner might free the original structure after a
3262 * future add_path(). We don't need to copy the substructure, though; that won't
3263 * get freed.

I would forget about setting merge_path->materialize_inner = false;
that doesn't seem essential.

Done.

Also, I would arrange things so that if
you hit an unrecognized path type (like a custom join, or a gather)
you skip that particular path instead of erroring out.

Ok. Done.

I think this
whole function should be moved to core,

I have moved the function to foreign.c where most of the FDW APIs are located and declared it in fdwapi.h. Since the function deals with the paths, I thought of adding it to some path related file, but since it's a helper function that an FDW can use, I thought foreign.c would be better. I have also added documentation in fdwhandler.sgml. I have renamed the function as GetPathForEPQRecheck() in order to be consistent with other FDW APIs. In the description I have just mentioned copy of a local path. I am not sure whether we should say "shallow copy".

and I think the argument
should be a RelOptInfo * rather than a List *.

Done.

+ * We can't know VERBOSE option is specified or not, so always add shcema

We can't know "whether" VERBOSE...
shcema -> schema

Done.

+ * the join relaiton is already considered, so that we won't waste time in

Typo.

Done.

+ * judging safety of join pushdow and adding the same paths again if found

Typo.

Done.

Sorry for those typos.

Attaching patches with reply to your next mail.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

02 февраля 2016 г., 19:21:59

On Tue, Feb 2, 2016 at 10:13 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Feb 1, 2016 at 6:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> More when I have a bit more time to look at this...

Why does deparseSelectStmtForRel change the order of the existing
arguments? I have no issue with adding new arguments as required, but
rearranging the existing argument order doesn't serve any useful
purpose that is immediately apparent.

deparseSelectStmtForRel has two sets of arguments, input and output. They are separated in the declaration all inputs come first, followed by all outputs. The inputs were ordered according to their appearance in SELECT statement, so I added tlist before remote_conds. I should have added relations, which is an output argument, at the end, but I accidentally added it between existing output arguments. Anyway, I will go ahead and just add the new arguments after the existing ones.

There's also no real need for
the rel -> foreignrel renaming.

That was an unintentional change during merge. Sorry for that. Reverted it.

+ /*
+ * If we have constructed the SELECT clause from the targetlist, construct
+ * the retrieved attributes list as continuously increasing list of
+ * integers.
+ */
+ if (retrieved_attrs && tlist)
+ {
+ int i;
+ *retrieved_attrs = NIL;
+ for (i = 1; i <= list_length(tlist); i++)
+ *retrieved_attrs = lappend_int(*retrieved_attrs, i);
+ }

This is really wonky. First, you pass retrieved_attrs to
deparseSelectSqlForBaseRel, but then you have this code which blows it
away and replaces it if tlist != NIL. So I guess this will run always
for a join relation, and for a base relation only sometimes. But
that's certainly not at all clear. I think you need to find some way
of rearranging this so that it's more obvious what is going on here.

I have pushed retrieved_attrs as an argument to deparseExplicitTargetList() and deparseSelectSqlForJoinRel() to keep the things consistent.

I suggest not renaming the existing deparseTargetList() and instead
coming up with a different name for the new thing you need, maybe
deparseExplicitTargetList().

Done.

How about adding another sentence to the header comment for
appendConditions() saying something like "This is used for both WHERE
clauses and for JOIN .. ON"?

Done.

+ * The targetlist is list of TargetEntry's which in turn contains Var nodes.

contain.

Done.

+/*
+ * Output join name for given join type */

Formatting.

Done.

This patch, overall, is badly in need of a pgindent run.

Sorry, I haven't run pgindent on the attached patches. But will do that next.

+ /*
+ * XXX
+ * Since the query is being built in recursive manner from bottom up,
+ * the FOR UPDATE/SHARE clause referring the base relations can
not be added
+ * at the top level. They need to be added to the subqueries corresponding
+ * to the base relations. This has an undesirable effect of locking more
+ * rows than specified by user, as it locks even those rows from base
+ * relations which are not part of the final join result. To avoid this
+ * undesirable effect, we need to build the join query without the
+ * subqueries, which for now, seems hard.
+ */

This is a second good reason that we should actually do that work
instead of complaining that it's too hard. The first one is that the
queries that come out of this right now are too long and hard to read.
I actually don't see why this is all that hard. Deparsing the target
list is simple enough; you just need to emit tab.col using varno to
determine what tab looks like and varattno to determine what col looks
like. The trickier part is emitting the FROM clause. But this doesn't
seem all that hard either. Suppose that when we construct the fpinfo
(PgFdwRelationInfo *) for a relation, we include in it a FROM clause
appropriate to that rel. So, for a baserel, it's something like "foo
r4" where 4 is foo's RTI. For a joinrel, do this:

1. Emit the FROM clause constructed for the outer relation,
surrounding it with parentheses if the outer relation is a joinrel.
2. Emit " JOIN ", " LEFT JOIN ", " RIGHT JOIN ", or " FULL JOIN "
according to the join type.
3. Emit the FROM clause constructed for the inner relation,
surrounding it with parentheses if the inner relation is a joinrel.
4. Emit " ON ".
5. Emit the joinqual.

This will produce nice things like (foo r3 JOIN bar r4 ON r3.x = r4.x)
JOIN baz r2 ON r3.y = r2.y

Then, you'd also need to stuff the conditions into the
PgFdwRelationInfo so that those could be added to paths constructed at
higher levels. But that's not too hard either. Basically you'd end
up with mostly the same stuff you have now, but the PgFdwRelationInfo
would store a join clause and a set of deparsed quals to be included
in the FROM and WHERE clauses respectively. And then you'd pull the
information from the inner and outer sides to build up what you need
at the joinrel level.

I was thinking on the similar lines except rN aliases. I think there will be problem for queries like
postgres=# explain verbose select * from lt left join (select bar.a, foo.b from bar left join foo on (bar.a = foo.a) where bar.b + foo.b < 10) q on (lt.b = q.b);
                                   QUERY PLAN
--------------------------------------------------------------------------------
Hash Right Join (cost=318.03..872.45 rows=43 width=16)
   Output: lt.a, lt.b, bar.a, foo.b
   Hash Cond: (foo.b = lt.b)
   -> Merge Join (cost=317.01..839.07 rows=8513 width=8)
         Output: bar.a, foo.b
         Merge Cond: (bar.a = foo.a)
         Join Filter: ((bar.b + foo.b) < 10)
         -> Sort (cost=158.51..164.16 rows=2260 width=8)
               Output: bar.a, bar.b
               Sort Key: bar.a
               -> Seq Scan on public.bar (cost=0.00..32.60 rows=2260 width=8)
                     Output: bar.a, bar.b
         -> Sort (cost=158.51..164.16 rows=2260 width=8)
               Output: foo.b, foo.a
               Sort Key: foo.a
               -> Seq Scan on public.foo (cost=0.00..32.60 rows=2260 width=8)
                     Output: foo.b, foo.a
   -> Hash (cost=1.01..1.01 rows=1 width=8)
         Output: lt.a, lt.b
         -> Seq Scan on public.lt (cost=0.00..1.01 rows=1 width=8)
               Output: lt.a, lt.b
(21 rows)

The subquery q is pulled up, so there won't be trace of q in the join tree except may be a useless RTE for the subquery. There will be RelOptInfo representing join between lt, bar and foo and a RelOptInfo for join between bar and foo. The join filter bar.b + foo.b < 10 needs to be evaluated before joining (bar, foo) with lt and should go with bar left join foo. But the syntax doesn't support something like "bar left join foo on (bar.a = foo.a) where bar.b + foo.b". So we will have to construct a SELECT statement for this join and add to the FROM clause with a subquery alias and then refer the columns of foo and bar with that subquery alias.

Further during the process of qual placement, quals that can be evaluated at the level of given relation in the join tree are attached to that relation if they can be pushed down. Thus if we see a qual attached to a given relation, AFAIU, we can not say whether it needs to be evaluated there (similar to above query) or planner pushed it down for optimization, and thus for every join relation with quals we will need to build subqueries with aliases.

I am still looking at how we can make this work.

This would actually be faster than what you've got right now, because
right now you're recursing down the whole join tree all over again at
each new join level, maybe not the best plan.

+ if (foreignrel->reloptkind == RELOPT_JOINREL)
+ return;
+
/* Add any necessary FOR UPDATE/SHARE. */
deparseLockingClause(&context);

Done.

Generally, I think in these kinds of cases it is better to reverse the
test and get rid of the return statement, like this:

if (foreignrel->reloptkind != RELOPT_JOINREL)
deparseLockingClause(&context);

The way you wrote it, somebody who wants to add more code to the end
of the function will probably have to make that change anyhow.
Really, your intention here was to skip that code for joins, not to

skip the rest of the function for baserels.

Yes, you are right. Actually before deparseLockingClause change was committed, all of that code was here, so this way worked. But I forgot to change it while merging. Thanks for pointing out.

@@ -1424,9 +1875,7 @@ deparseVar(Var *node, deparse_expr_cxt *context)
printRemoteParam(pindex, node->vartype, node->vartypmod, context);
}
else
- {
printRemotePlaceholder(node->vartype, node->vartypmod, context);
- }
}
}

Useless hunk.

Reverted the change.

+ * Constructing queries representating SEMI and ANTI joins is hard, hence

Typo.

Done. Thanks for pointing out the typos.

Attached patches
pg_fdw_core_v6.patch: core changes
pg_fdw_join_v6.patch: postgres_fdw changes for join pushdown
pg_join_pd_v6.patch: combined patch for ease of testing.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

02 февраля 2016 г., 19:32:18

On Tue, Feb 2, 2016 at 11:21 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> Why does deparseSelectStmtForRel change the order of the existing
>> arguments?  I have no issue with adding new arguments as required, but
>> rearranging the existing argument order doesn't serve any useful
>> purpose that is immediately apparent.
>
> deparseSelectStmtForRel has two sets of arguments, input and output. They
> are separated in the declaration all inputs come first, followed by all
> outputs. The inputs were ordered according to their appearance in SELECT
> statement, so I added tlist before remote_conds. I should have added
> relations, which is an output argument, at the end, but I accidentally added
> it between existing output arguments. Anyway, I will go ahead and just add
> the new arguments after the existing ones.

No, that's not what I'm asking for, nor do I think it's right.  What
I'm complaining about is that originally params_list was after
retrieved_attrs, but in v5 it's before retrieved_attrs.  I'm fine with
inserting tlist after rel, or in general inserting new arguments in
the sequence.  But you reversed the relative ordering of params_list
and retrieved_attrs.

> I was thinking on the similar lines except rN aliases. I think there will be
> problem for queries like
> postgres=# explain verbose select * from lt left join (select bar.a, foo.b
> from bar left join foo on (bar.a = foo.a) where bar.b + foo.b < 10) q on
> (lt.b = q.b);
>                                    QUERY PLAN
> --------------------------------------------------------------------------------
>  Hash Right Join  (cost=318.03..872.45 rows=43 width=16)
>    Output: lt.a, lt.b, bar.a, foo.b
>    Hash Cond: (foo.b = lt.b)
>    ->  Merge Join  (cost=317.01..839.07 rows=8513 width=8)
>          Output: bar.a, foo.b
>          Merge Cond: (bar.a = foo.a)
>          Join Filter: ((bar.b + foo.b) < 10)
>          ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>                Output: bar.a, bar.b
>                Sort Key: bar.a
>                ->  Seq Scan on public.bar  (cost=0.00..32.60 rows=2260
> width=8)
>                      Output: bar.a, bar.b
>          ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>                Output: foo.b, foo.a
>                Sort Key: foo.a
>                ->  Seq Scan on public.foo  (cost=0.00..32.60 rows=2260
> width=8)
>                      Output: foo.b, foo.a
>    ->  Hash  (cost=1.01..1.01 rows=1 width=8)
>          Output: lt.a, lt.b
>          ->  Seq Scan on public.lt  (cost=0.00..1.01 rows=1 width=8)
>                Output: lt.a, lt.b
> (21 rows)
>
> The subquery q is pulled up, so there won't be trace of q in the join tree
> except may be a useless RTE for the subquery. There will be RelOptInfo
> representing join between lt, bar and foo and a RelOptInfo for join between
> bar and foo. The join filter bar.b + foo.b < 10 needs to be evaluated before
> joining (bar, foo) with lt and should go with bar left join foo. But the
> syntax doesn't support something like "bar left join foo on (bar.a = foo.a)
> where bar.b + foo.b". So we will have to construct a SELECT statement for
> this join and add to the FROM clause with a subquery alias and then refer
> the columns of foo and bar with that subquery alias.

Hmm, does it work if we put bar.b + foo.b < 10 in the ON clause for
the join between lt and foo/bar? I think so...

> Further during the process of qual placement, quals that can be evaluated at
> the level of given relation in the join tree are attached to that relation
> if they can be pushed down. Thus if we see a qual attached to a given
> relation, AFAIU, we can not say whether it needs to be evaluated there
> (similar to above query) or planner pushed it down for optimization, and
> thus for every join relation with quals we will need to build subqueries
> with aliases.

I don't think that's true.  I theorize that every qual can either go
into the top level WHERE clause or the ON clause of some join.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

03 февраля 2016 г., 20:09:09

PFA patches with naming conventions similar to previous ones.
pg_fdw_core_v7.patch: core changes
pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
pg_join_pd_v7.patch: combined patch for ease of testing.

On Tue, Feb 2, 2016 at 10:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Feb 2, 2016 at 11:21 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> Why does deparseSelectStmtForRel change the order of the existing
>> arguments? I have no issue with adding new arguments as required, but
>> rearranging the existing argument order doesn't serve any useful
>> purpose that is immediately apparent.
>
> deparseSelectStmtForRel has two sets of arguments, input and output. They
> are separated in the declaration all inputs come first, followed by all
> outputs. The inputs were ordered according to their appearance in SELECT
> statement, so I added tlist before remote_conds. I should have added
> relations, which is an output argument, at the end, but I accidentally added
> it between existing output arguments. Anyway, I will go ahead and just add
> the new arguments after the existing ones.

No, that's not what I'm asking for, nor do I think it's right. What
I'm complaining about is that originally params_list was after
retrieved_attrs, but in v5 it's before retrieved_attrs. I'm fine with
inserting tlist after rel, or in general inserting new arguments in
the sequence. But you reversed the relative ordering of params_list
and retrieved_attrs.

Ok, fixed in this patch.

> I was thinking on the similar lines except rN aliases. I think there will be
> problem for queries like
> postgres=# explain verbose select * from lt left join (select bar.a, foo.b
> from bar left join foo on (bar.a = foo.a) where bar.b + foo.b < 10) q on
> (lt.b = q.b);
> QUERY PLAN
> --------------------------------------------------------------------------------
> Hash Right Join (cost=318.03..872.45 rows=43 width=16)
> Output: lt.a, lt.b, bar.a, foo.b
> Hash Cond: (foo.b = lt.b)
> -> Merge Join (cost=317.01..839.07 rows=8513 width=8)
> Output: bar.a, foo.b
> Merge Cond: (bar.a = foo.a)
> Join Filter: ((bar.b + foo.b) < 10)
> -> Sort (cost=158.51..164.16 rows=2260 width=8)
> Output: bar.a, bar.b
> Sort Key: bar.a
> -> Seq Scan on public.bar (cost=0.00..32.60 rows=2260
> width=8)
> Output: bar.a, bar.b
> -> Sort (cost=158.51..164.16 rows=2260 width=8)
> Output: foo.b, foo.a
> Sort Key: foo.a
> -> Seq Scan on public.foo (cost=0.00..32.60 rows=2260
> width=8)
> Output: foo.b, foo.a
> -> Hash (cost=1.01..1.01 rows=1 width=8)
> Output: lt.a, lt.b
> -> Seq Scan on public.lt (cost=0.00..1.01 rows=1 width=8)
> Output: lt.a, lt.b
> (21 rows)
>
> The subquery q is pulled up, so there won't be trace of q in the join tree
> except may be a useless RTE for the subquery. There will be RelOptInfo
> representing join between lt, bar and foo and a RelOptInfo for join between
> bar and foo. The join filter bar.b + foo.b < 10 needs to be evaluated before
> joining (bar, foo) with lt and should go with bar left join foo. But the
> syntax doesn't support something like "bar left join foo on (bar.a = foo.a)
> where bar.b + foo.b". So we will have to construct a SELECT statement for
> this join and add to the FROM clause with a subquery alias and then refer
> the columns of foo and bar with that subquery alias.

Hmm, does it work if we put bar.b + foo.b < 10 in the ON clause for
the join between lt and foo/bar? I think so...

> Further during the process of qual placement, quals that can be evaluated at
> the level of given relation in the join tree are attached to that relation
> if they can be pushed down. Thus if we see a qual attached to a given
> relation, AFAIU, we can not say whether it needs to be evaluated there
> (similar to above query) or planner pushed it down for optimization, and
> thus for every join relation with quals we will need to build subqueries
> with aliases.

I don't think that's true. I theorize that every qual can either go
into the top level WHERE clause or the ON clause of some join.

The patch implements your algorithm to deparse a query as described in previous mail. The logic is largely coded in deparseFromExprForRel() and foreign_join_ok(). The later one pulls up the clauses from joining relations and first one deparses the FROM clause recursively.

While you suggested that we construct FROM clauses while constructing fpinfo, there is problem maintaining params_list. While we deparse the clauses, we will build the params lists independently for the joining relations and might have conflicting param indexes depending upon when a particular node is seen. Also, during the process as more and more outer relations get added to the join tree being pushed down, some parameters might vanish. So, if have to build it while we construct fpinfo we will have to find a way to sanitize the params_list.

I have modified the deparseColumnRef and related functions to output <relation alias>.<column name> while deparsing the join tree. Updated deparseLockingClause to output FOR SHARE/FOR UPDATE clauses at the outermost query level. Removed the useless functions and structure members.

I haven't run pgindent on the changes yet. Sorry. Mostly, I will be able to do that for the next version.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

04 февраля 2016 г., 00:12:51

On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The patch implements your algorithm to deparse a query as described in
> previous mail. The logic is largely coded in deparseFromExprForRel() and
> foreign_join_ok(). The later one pulls up the clauses from joining relations
> and first one deparses the FROM clause recursively.

Cool!

+               /* Add outer relation. */
+               appendStringInfo(buf, "(%s", join_sql_o.data);
+
+               /* Add join type */
+               appendStringInfo(buf, " %s JOIN ",
get_jointype_name(fpinfo->jointype));
+
+               /* Add inner relation */
+               appendStringInfo(buf, "%s", join_sql_i.data);
+
+               /* Append ON clause; ON (TRUE) in case empty join clause list */
+               appendStringInfoString(buf, " ON ");

Uh, couldn't that all be done as a single appendStringInfo?

It seems a little tortured the way you're passing "relations" all the
way down the callstack from deparseSelectStmtForRel, and at each level
it might be NULL.  If you made a rule that the caller MUST pass a
StringInfo, then you could get rid of some conditional logic in
deparseFromExprForRel.   By the way, deparseSelectSql()'s header
comment could use an update to mention this additional argument.
Generally, it's helpful to say in each relevant function header
comment something like "May be NULL" or "Must not be NULL" in cases
like this to clarify the API contract.

Similarly, I would be inclined to continue to require that
deparseTargetList() have retrieved_attrs != NULL.  If the caller
doesn't want the value, it can pass a dummy variable and ignore the
return value.  This is of course slightly more cycles, but I think
it's unlikely to matter, and making the code simpler would be good.

+ * Function is the entry point to deparse routines while constructing
+ * ForeignScan plan or estimating cost and size for ForeignPath. It is called
+ * recursively to build SELECT statements for joining relations of a
pushed down
+ * foreign join.

"This function is the entrypoint to the routines, either when
constructing ForeignScan plan or when estimating" etc.

+ * tuple descriptor for the corresponding foreign scan. For a base relation,
+ * which is not part of a pushed down join, fpinfo->attrs_used can be used to
+ * construct SELECT clause, thus the function doesn't need tlist. Hence when
+ * tlist passed, the function assumes that it's constructing the SELECT
+ * statement to be part of a pushed down foreign join.

I thought you got rid of that assumption.  I think it should be gotten
rid of, and then the comment can go too.  If we're keeping the comment
for some reason, should be "doesn't need the tlist" and when "when the
tlist is passed".

+ * 1, since those are the attribute numbers are in the corresponding scan.

Extra "are".  Should be: "Those are the attribute numbers in the
corresponding scan."

Would it be nuts to set fdw_scan_tlist in all cases?  Would the code
come out simpler than what we have now?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

04 февраля 2016 г., 01:56:45

On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> PFA patches with naming conventions similar to previous ones.
> pg_fdw_core_v7.patch: core changes
> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
> pg_join_pd_v7.patch: combined patch for ease of testing.

Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
How about GetExistingJoinPath()?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

04 февраля 2016 г., 02:00:08

On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> PFA patches with naming conventions similar to previous ones.
>> pg_fdw_core_v7.patch: core changes
>> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
>> pg_join_pd_v7.patch: combined patch for ease of testing.
>
> Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
> How about GetExistingJoinPath()?

Oops.  Hit Send too soon.  Also, how about writing if
(path->param_info != NULL) continue; instead of burying the core of
the function in another level of indentation?  I think you should skip
paths that aren't parallel_safe, too, and the documentation should be
clear that this will find an unparameterized, parallel-safe joinpath
if one exists.

+                               ForeignPath *foreign_path;
+                               foreign_path = (ForeignPath
*)joinpath->outerjoinpath;

Maybe insert a blank line between here, and in the other, similar case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

04 февраля 2016 г., 12:00:31

On 2016/02/04 8:00, Robert Haas wrote:
> On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
>> <ashutosh.bapat@enterprisedb.com> wrote:
>>> PFA patches with naming conventions similar to previous ones.
>>> pg_fdw_core_v7.patch: core changes
>>> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
>>> pg_join_pd_v7.patch: combined patch for ease of testing.

Thank you for working on this, Ashutosh and Robert!  I've not look at 
the patches closely yet, but ISTM the patches would be really in good shape!

>> Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
>> How about GetExistingJoinPath()?

+1

> Oops.  Hit Send too soon.  Also, how about writing if
> (path->param_info != NULL) continue; instead of burying the core of
> the function in another level of indentation?  I think you should skip
> paths that aren't parallel_safe, too, and the documentation should be
> clear that this will find an unparameterized, parallel-safe joinpath
> if one exists.
>
> +                               ForeignPath *foreign_path;
> +                               foreign_path = (ForeignPath
> *)joinpath->outerjoinpath;
>
> Maybe insert a blank line between here, and in the other, similar case.

* Is it safe to replace outerjoinpath with its fdw_outerpath the 
following way?  I think that if the join relation represented by 
outerjoinpath has local conditions that can't be executed remotely, we 
have to keep outerjoinpath in the path tree; we will otherwise fail to 
execute the local conditions.  No?

+            /*
+             * If either inner or outer path is a ForeignPath corresponding to
+             * a pushed down join, replace it with the fdw_outerpath, so that we
+             * maintain path for EPQ checks built entirely of local join
+             * strategies.
+             */
+            if (IsA(joinpath->outerjoinpath, ForeignPath))
+            {
+                ForeignPath *foreign_path;
+                foreign_path = (ForeignPath *)joinpath->outerjoinpath;
+                if (foreign_path->path.parent->reloptkind == RELOPT_JOINREL)
+                    joinpath->outerjoinpath = foreign_path->fdw_outerpath;
+            }

* IIUC, that function will be used by custom joins, so I think it would 
be better to put that function somewhere in the /optimizer directory 
(pathnode.c?).

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

04 февраля 2016 г., 12:02:39

On 2016/01/29 17:52, Ashutosh Bapat wrote:
> On Fri, Jan 29, 2016 at 2:05 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>     On 2016/01/29 1:26, Ashutosh Bapat wrote:
>         Here is the summary of changes from the last set of patches

>         2. Included fix for EvalPlanQual in postgres_fdw - an alternate
>         local
>         path is obtained from the list of paths linked to the joinrel. Since
>         this is done before adding the ForeignPath, we should be a local
>         path
>         available for given join.

>     I looked at that code in the patch (ie, postgresRecheckForeignScan
>     and the helper function that creates a local join path for a given
>     foreign join path.), briefly.  Maybe I'm missing something, but I
>     think that is basically the same as the fix I proposed for
>     addressing this issue, posted before [1], right?

> Yes, although I have added functions to copy the paths, not consider
> pathkeys and change the relevant members of the paths. Sorry  if I have
> missed giving you due credits.

>        If so, my concern is, the helper function probably wouldn't
>     extend to the parameterized-foreign-join-path cases, though that
>     would work well for the unparameterized-foreign-join-path cases.  We
>     don't support parameterized-foreign-join paths for 9.6?

> If we do not find a local path with given parameterization, it means
> there are other local parameterized paths which are superior to it. This
> possibly indicates that there will be foreign join parameterised paths
> which are superior to this parameterized path, so we basically do not
> create foreign join path with that parameterization.

The latest version of the postgres_fdw join pushdown patch will support 
only the unparameterized-path case, so we don't have to consider this, 
but why do you think the superiority of parameterizations is preserved 
between remote joining and local joining?

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

04 февраля 2016 г., 13:01:07

On 2016/02/04 17:58, Etsuro Fujita wrote:
> On 2016/02/04 8:00, Robert Haas wrote:
>> On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas <robertmhaas@gmail.com>
>> wrote:
>>> On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
>>> <ashutosh.bapat@enterprisedb.com> wrote:
>>>> PFA patches with naming conventions similar to previous ones.
>>>> pg_fdw_core_v7.patch: core changes
>>>> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
>>>> pg_join_pd_v7.patch: combined patch for ease of testing.

One more: I think the following in postgresGetForeignJoinPaths() is a 
good idea, but I think it's okay to just check whether root->rowMarks is 
non-NIL, because that since we have rowmarks for all base relations 
except the target, if we have root->parse->commandType==CMD_DELETE (or 
root->parse->commandType==CMD_UPDATE), then there would be at least one 
non-target base relation in the joinrel, which would have a rowmark.

+    if (root->parse->commandType == CMD_DELETE ||
+        root->parse->commandType == CMD_UPDATE ||
+        root->rowMarks)
+    {
+        epq_path = GetPathForEPQRecheck(joinrel);
+        if (!epq_path)
+        {
+            elog(DEBUG3, "could not push down foreign join because a local path 
suitable for EPQ checks was not found");
+            return;
+        }
+    }

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

04 февраля 2016 г., 15:42:52

* Is it safe to replace outerjoinpath with its fdw_outerpath the following way? I think that if the join relation represented by outerjoinpath has local conditions that can't be executed remotely, we have to keep outerjoinpath in the path tree; we will otherwise fail to execute the local conditions. No?

+ /*
+ * If either inner or outer path is a ForeignPath corresponding to
+ * a pushed down join, replace it with the fdw_outerpath, so that we
+ * maintain path for EPQ checks built entirely of local join
+ * strategies.
+ */
+ if (IsA(joinpath->outerjoinpath, ForeignPath))
+ {
+ ForeignPath *foreign_path;
+ foreign_path = (ForeignPath *)joinpath->outerjoinpath;
+ if (foreign_path->path.parent->reloptkind == RELOPT_JOINREL)
+ joinpath->outerjoinpath = foreign_path->fdw_outerpath;
+ }

all the conditions (local and remote) should be part of fdw_outerpath as well, since that's the alternate local path, which should produce (when converted to the plan) the same result as the foreign path. fdw_outerpath should be a local path set when paths for outerjoinpath->parent was being created. Am I missing something?

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

04 февраля 2016 г., 15:51:38

If so, my concern is, the helper function probably wouldn't
extend to the parameterized-foreign-join-path cases, though that
would work well for the unparameterized-foreign-join-path cases. We
don't support parameterized-foreign-join paths for 9.6?

If we do not find a local path with given parameterization, it means
there are other local parameterized paths which are superior to it. This
possibly indicates that there will be foreign join parameterised paths
which are superior to this parameterized path, so we basically do not
create foreign join path with that parameterization.

The latest version of the postgres_fdw join pushdown patch will support only the unparameterized-path case, so we don't have to consider this, but why do you think the superiority of parameterizations is preserved between remote joining and local joining?

AFAIU, parameterization for local paths bubbles up from base relations. For foreign relations, we calculate the cost of parameterization when use_remote_estimate is ON, which means it's accurate. So, except that we will get clause selectivity wrong (if foreign tables were analyzed regularly even that won't be the case, I guess) resulting in some small sway in the costs as compared to those of parameterized foreign join paths. So, I am guessing that the local estimates for parameterized join paths would be closer to parameterized foreign paths (if we were to produce those). Hence my statement. There is always a possibility that those two costs are way too different, hence I have used phrase "possibly" there. I could be wrong.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

04 февраля 2016 г., 15:57:11

On Thu, Feb 4, 2016 at 3:28 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

On 2016/02/04 17:58, Etsuro Fujita wrote:
On 2016/02/04 8:00, Robert Haas wrote:
On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas <robertmhaas@gmail.com>
wrote:
On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
PFA patches with naming conventions similar to previous ones.
pg_fdw_core_v7.patch: core changes
pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
pg_join_pd_v7.patch: combined patch for ease of testing.

One more: I think the following in postgresGetForeignJoinPaths() is a good idea, but I think it's okay to just check whether root->rowMarks is non-NIL, because that since we have rowmarks for all base relations except the target, if we have root->parse->commandType==CMD_DELETE (or root->parse->commandType==CMD_UPDATE), then there would be at least one non-target base relation in the joinrel, which would have a rowmark.

Sorry, I am unable to understand it fully. But what you are suggesting that if there are root->rowMarks, then we are sure that there is at least one base relation apart from the target, which needs locking rows. Even if we don't have one, still changes in a row of target relation after it was scanned, can result in firing EPQ check, which would need the local plan to be executed, thus even if root->rowMarks is NIL, EPQ check can fire and we will need alternate local plan.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

04 февраля 2016 г., 19:55:14

On Thu, Feb 4, 2016 at 2:42 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The patch implements your algorithm to deparse a query as described in
> previous mail. The logic is largely coded in deparseFromExprForRel() and
> foreign_join_ok(). The later one pulls up the clauses from joining relations
> and first one deparses the FROM clause recursively.

Cool!

+ /* Add outer relation. */
+ appendStringInfo(buf, "(%s", join_sql_o.data);
+
+ /* Add join type */
+ appendStringInfo(buf, " %s JOIN ",
get_jointype_name(fpinfo->jointype));
+
+ /* Add inner relation */
+ appendStringInfo(buf, "%s", join_sql_i.data);
+
+ /* Append ON clause; ON (TRUE) in case empty join clause list */
+ appendStringInfoString(buf, " ON ");

Uh, couldn't that all be done as a single appendStringInfo?

Done.

It seems a little tortured the way you're passing "relations" all the
way down the callstack from deparseSelectStmtForRel, and at each level
it might be NULL. If you made a rule that the caller MUST pass a
StringInfo, then you could get rid of some conditional logic in
deparseFromExprForRel. By the way, deparseSelectSql()'s header
comment could use an update to mention this additional argument.
Generally, it's helpful to say in each relevant function header
comment something like "May be NULL" or "Must not be NULL" in cases
like this to clarify the API contract.

Done.

How about building this string when we construct fpinfo? We will waste some cycles for base relations but we will have lesser arguments in deparsing routines. I have attached patch recursive_relations.patch implementing this idea. The patch can be applied on top of the attached patches.

Similarly, I would be inclined to continue to require that
deparseTargetList() have retrieved_attrs != NULL. If the caller
doesn't want the value, it can pass a dummy variable and ignore the
return value. This is of course slightly more cycles, but I think
it's unlikely to matter, and making the code simpler would be good.

Done.

+ * Function is the entry point to deparse routines while constructing
+ * ForeignScan plan or estimating cost and size for ForeignPath. It is called
+ * recursively to build SELECT statements for joining relations of a
pushed down
+ * foreign join.

"This function is the entrypoint to the routines, either when
constructing ForeignScan plan or when estimating" etc.

I have removed this comment altogether. The second sentence in the comment no more holds true as we are not calling this function recursively any more. The first statement too doesn't add much value, The thing that is says, was true even before join pushdown and at that time that sentence wasn't there. The opening comment says what that function does.

+ * tuple descriptor for the corresponding foreign scan. For a base relation,
+ * which is not part of a pushed down join, fpinfo->attrs_used can be used to
+ * construct SELECT clause, thus the function doesn't need tlist. Hence when
+ * tlist passed, the function assumes that it's constructing the SELECT
+ * statement to be part of a pushed down foreign join.

I thought you got rid of that assumption. I think it should be gotten
rid of, and then the comment can go too. If we're keeping the comment
for some reason, should be "doesn't need the tlist" and when "when the
tlist is passed".

Done. tlist will be used only for join relations. For base relations fpinfo->attrs_used will be used.

+ * 1, since those are the attribute numbers are in the corresponding scan.

Extra "are". Should be: "Those are the attribute numbers in the
corresponding scan."

I don't have that comment in the patch anymore. Probably got removed as part of the other refactoring.

Would it be nuts to set fdw_scan_tlist in all cases? Would the code
come out simpler than what we have now?

deparesTargetList has an optimization when whole-row reference appears. It doesn't include whole-row reference and instead includes all the attributes, so that whole-row reference can be constructed at the time of projection. We will have to mimic similar logic while creating fdw_scan_tlist for base relations. Otherwise, we will fetch larger row from the foreign table. I am still working on this part. Mostly will post it with the next patch.

In set_foreignscan_references(), we have
1109 if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)

If we are to set fdw_scan_tlist for base relation, it would stamp the Vars with INDEX_VAR which would be undesirable. May be we should just change that condition to if (fscan->scan.scanrelid == 0). What do you think?

Attaching patches
pg_fdw_core_v8.patch: core changes
pg_fdw_join_v8.patch: postgres_fdw changes for join pushdown
pg_join_pd_v8.patch: combined patch for ease of testing.

recursive_relations.patch: for building relation description while constructing fpinfo.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

04 февраля 2016 г., 19:55:25

On Thu, Feb 4, 2016 at 4:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Feb 3, 2016 at 5:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> PFA patches with naming conventions similar to previous ones.
>> pg_fdw_core_v7.patch: core changes
>> pg_fdw_join_v7.patch: postgres_fdw changes for join pushdown
>> pg_join_pd_v7.patch: combined patch for ease of testing.
>
> Hmm, I think that GetPathForEPQRecheck() is a pretty terrible name.
> How about GetExistingJoinPath()?

GetExistingLocalJoinPath()? Used that.

Oops. Hit Send too soon. Also, how about writing if
(path->param_info != NULL) continue; instead of burying the core of
the function in another level of indentation?

Hmm. Done.

I think you should skip
paths that aren't parallel_safe, too, and the documentation should be
clear that this will find an unparameterized, parallel-safe joinpath
if one exists.

A query with FOR UPDATE/SHARE will be considered parallel unsafe in has_parallel_hazard_walker() and root->glob->parallelModeOK will be marked false. This implies that none of the base relations and hence join relations will be marked as consider_parallel. IIUC your logic, none of the queries with FOR UPDATE/SHARE will get a local path which is marked parallel_safe and thus join will not be pushed down. Why do you think we need to skip paths that aren't parallel_safe? I have left aside this change in the latest patches.

+ ForeignPath *foreign_path;
+ foreign_path = (ForeignPath
*)joinpath->outerjoinpath;

Maybe insert a blank line between here, and in the other, similar case.

Done.

Patches attached with the previous mail.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

05 февраля 2016 г., 01:19:16

On Thu, Feb 4, 2016 at 11:55 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Patches attached with the previous mail.

The core patch seemed to me to be in good shape now, so I committed
that.  Not sure I'll be able to get to another read-through of the
main patch today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

05 февраля 2016 г., 06:33:37

On Thu, Feb 4, 2016 at 11:55 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> A query with FOR UPDATE/SHARE will be considered parallel unsafe in
> has_parallel_hazard_walker() and root->glob->parallelModeOK will be marked
> false. This implies that none of the base relations and hence join relations
> will be marked as consider_parallel. IIUC your logic, none of the queries
> with FOR UPDATE/SHARE will get a local path which is marked parallel_safe
> and thus join will not be pushed down. Why do you think we need to skip
> paths that aren't parallel_safe? I have left aside this change in the latest
> patches.

I changed this back before committing but, ah nuts, you're right.  Sigh.  Sorry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

05 февраля 2016 г., 11:21:02

On 2016/02/04 21:42, Ashutosh Bapat wrote:

>     * Is it safe to replace outerjoinpath with its fdw_outerpath the
>     following way?  I think that if the join relation represented by
>     outerjoinpath has local conditions that can't be executed remotely,
>     we have to keep outerjoinpath in the path tree; we will otherwise
>     fail to execute the local conditions.  No?
>
>     +                       /*
>     +                        * If either inner or outer path is a
>     ForeignPath corresponding to
>     +                        * a pushed down join, replace it with the
>     fdw_outerpath, so that we
>     +                        * maintain path for EPQ checks built
>     entirely of local join
>     +                        * strategies.
>     +                        */
>     +                       if (IsA(joinpath->outerjoinpath, ForeignPath))
>     +                       {
>     +                               ForeignPath *foreign_path;
>     +                               foreign_path = (ForeignPath
>     *)joinpath->outerjoinpath;
>     +                               if
>     (foreign_path->path.parent->reloptkind == RELOPT_JOINREL)
>     +                                       joinpath->outerjoinpath =
>     foreign_path->fdw_outerpath;
>     +                       }

> all the conditions (local and remote) should be part of fdw_outerpath as
> well, since that's the alternate local path, which should produce (when
> converted to the plan) the same result as the foreign path.
> fdw_outerpath should be a local path set when paths for
> outerjoinpath->parent was being created. Am I missing something?

I assumed by mistake that only the remote conditions were evaluated in a
plan created from each fdw_outerpath.  Sorry for that.  I think that is
a good idea!

Btw, IIUC, I think the patch fails to adjust the targetlist of the top
plan created that way, to output the fdw_scan_tlist, as discussed in [1]
(ie, I think the attached patch is needed, which is created on top of
your patch pg_fdw_join_v8.patch).

Best regards,
Etsuro Fujita

[1]
http://www.postgresql.org/message-id/CA+TgmobA4MSKgquicgt5CkbpQJ-TmpqEfHt_wy49ndwa91Wkpw@mail.gmail.com

Вложения

fdw-outer-plan-tlist.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

05 февраля 2016 г., 11:50:09

Btw, IIUC, I think the patch fails to adjust the targetlist of the top plan created that way, to output the fdw_scan_tlist, as discussed in [1] (ie, I think the attached patch is needed, which is created on top of your patch pg_fdw_join_v8.patch).

fdw_scan_tlist represents the output fetched from the foreign server and is not necessarily the output of ForeignScan. ForeignScan node's output is represented by tlist argument to.

1119     return make_foreignscan(tlist,
1120                             local_exprs,
1121                             scan_relid,
1122                             params_list,
1123                             fdw_private,
1124                             fdw_scan_tlist,
1125                             remote_exprs,
1126                             outer_plan);

This tlist is built using build_path_tlist() for all join plans. IIUC, all of them output the same targetlist. We don't need to make sure that targetlist match as long as we are using the targetlist passed in by create_scan_plan(). Do you have a counter example?

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

05 февраля 2016 г., 12:08:17

On 2016/02/04 21:57, Ashutosh Bapat wrote:

>     One more: I think the following in postgresGetForeignJoinPaths() is
>     a good idea, but I think it's okay to just check whether
>     root->rowMarks is non-NIL, because that since we have rowmarks for
>     all base relations except the target, if we have
>     root->parse->commandType==CMD_DELETE (or
>     root->parse->commandType==CMD_UPDATE), then there would be at least
>     one non-target base relation in the joinrel, which would have a rowmark.

> Sorry, I am unable to understand it fully. But what you are suggesting
> that if there are root->rowMarks, then we are sure that there is at
> least one base relation apart from the target, which needs locking rows.
> Even if we don't have one, still changes in a row of target relation
> after it was scanned, can result in firing EPQ check, which would need
> the local plan to be executed, thus even if root->rowMarks is NIL, EPQ
> check can fire and we will need alternate local plan.

Yeah, I think that is true, but if root->rowMarks==NIL, we won't have 
non-target foreign tables, and therefore postgresGetForeignJoinPaths() 
will never be called.  No?

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

05 февраля 2016 г., 12:23:32

On Fri, Feb 5, 2016 at 9:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 4, 2016 at 11:55 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> A query with FOR UPDATE/SHARE will be considered parallel unsafe in
> has_parallel_hazard_walker() and root->glob->parallelModeOK will be marked
> false. This implies that none of the base relations and hence join relations
> will be marked as consider_parallel. IIUC your logic, none of the queries
> with FOR UPDATE/SHARE will get a local path which is marked parallel_safe
> and thus join will not be pushed down. Why do you think we need to skip
> paths that aren't parallel_safe? I have left aside this change in the latest
> patches.

I changed this back before committing but, ah nuts, you're right. Sigh. Sorry.

I have corrected this in this set of patches. Also, I have included the change to build the join relation description while constructing fpinfo in the main patch since that avoids repeated building of the same at a small cost of constructing relation name for base relations, which goes waste if that relation is not going to be part of any pushable join tree.

Ran pgindent as well.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

05 февраля 2016 г., 17:10:03

Would it be nuts to set fdw_scan_tlist in all cases? Would the code
come out simpler than what we have now?

PFA the patch that can be applied on v9 patches.

If there is a whole-row reference for base relation, instead of adding that as an additional column deparseTargetList() creates a list of all the attributes of that foreign table . The whole-row reference gets constructed during projection. This saves some network bandwidth while transferring the data from the foreign server. If we build the target list for base relation, we should include Vars for all the columns (like deparseTargetList). Thus we borrow some code from deparseTargetList to get all the attributes of a relation. I included this logic in function build_tlist_from_attrs_used(), which is being called by build_tlist_to_deparse(). So, before calling deparseSelectStmtForRel() the callers can just call build_tlist_to_deparse() and pass the targetlist to deparseSelectStmtForRel() and use the same to be handed over to the core code as fdw_scan_tlist. build_tlist_to_deparse() also constructs retrieved_attrs list, so that doesn't need to be passed around in deparse routines.

But we now have similar code in three places deparseTargetList(), deparseAnalyzeSql() and build_tlist_from_attrs_used(). So, I re-wrote deparseTargetList() (which is now used to deparse returning list) to call build_tlist_from_attrs_used() followed by deparseExplicitTargetList(). The later and its minion deparseVar requires a deparse_expr_cxt to be passed. deparse_expr_cxt has a member to store RelOptInfo of the relation being deparsed. The callers of deparseReturningList() do not have it and thus deparse_expr_cxt required changes, which in turn required changes in other places. All in all, a larger refactoring. It touches more places than necessary for foreign join push down. So, I think, we should try that as a separate refactoring work. We may just do the work described in the first paragraph for join pushdown, but we will then be left with duplicate code, which I don't think is worth the output.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

targetlist_for_all.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

05 февраля 2016 г., 23:25:50

On Fri, Feb 5, 2016 at 9:09 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> Would it be nuts to set fdw_scan_tlist in all cases?  Would the code
>> come out simpler than what we have now?
>
> PFA the patch that can be applied on v9 patches.
>
> If there is a whole-row reference for base relation, instead of adding that
> as an additional column deparseTargetList() creates a list of all the
> attributes of that foreign table . The whole-row reference gets constructed
> during projection. This saves some network bandwidth while transferring the
> data from the foreign server. If we build the target list for base relation,
> we should include Vars for all the columns (like deparseTargetList). Thus we
> borrow some code from deparseTargetList to get all the attributes of a
> relation. I included this logic in function build_tlist_from_attrs_used(),
> which is being called by build_tlist_to_deparse(). So, before calling
> deparseSelectStmtForRel() the callers can just call build_tlist_to_deparse()
> and pass the targetlist to deparseSelectStmtForRel() and use the same to be
> handed over to the core code as fdw_scan_tlist. build_tlist_to_deparse()
> also constructs retrieved_attrs list, so that doesn't need to be passed
> around in deparse routines.
>
> But we now have similar code in three places deparseTargetList(),
> deparseAnalyzeSql() and build_tlist_from_attrs_used(). So, I re-wrote
> deparseTargetList() (which is now used to deparse returning list) to call
> build_tlist_from_attrs_used() followed by deparseExplicitTargetList(). The
> later and its minion deparseVar requires a deparse_expr_cxt to be passed.
> deparse_expr_cxt has a member to store RelOptInfo of the relation being
> deparsed. The callers of deparseReturningList() do not have it and thus
> deparse_expr_cxt required changes, which in turn required changes in other
> places. All in all, a larger refactoring. It touches more places than
> necessary for foreign join push down. So, I think, we should try that as a
> separate refactoring work. We may just do the work described in the first
> paragraph for join pushdown, but we will then be left with duplicate code,
> which I don't think is worth the output.

Yeah, I'm not convinced this is actually simpler; at first look, it
seems like it is just moving the complexity around.  I don't like the
fact that there are so many places where we have to do one thing for a
baserel and something totally different for a joinrel, which was the
point of my comment.  But this seems to add one instance of that and
remove one instance of that, so I don't see how we're coming out
better than a wash.  Maybe it's got more merit than I'm giving it
credit for, and I'm just not seeing it right at the moment...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

05 февраля 2016 г., 23:30:57

On Fri, Feb 5, 2016 at 4:23 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I have corrected this in this set of patches. Also, I have included the
> change to build the join relation description while constructing fpinfo in
> the main patch since that avoids repeated building of the same at a small
> cost of constructing relation name for base relations, which goes waste if
> that relation is not going to be part of any pushable join tree.
>
> Ran pgindent as well.

pg_fdw_join_v9.patch does not aplpy.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

06 февраля 2016 г., 08:46:40

On Sat, Feb 6, 2016 at 2:00 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 5, 2016 at 4:23 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I have corrected this in this set of patches. Also, I have included the
> change to build the join relation description while constructing fpinfo in
> the main patch since that avoids repeated building of the same at a small
> cost of constructing relation name for base relations, which goes waste if
> that relation is not going to be part of any pushable join tree.
>
> Ran pgindent as well.

pg_fdw_join_v9.patch does not aplpy.

Here it is rebased. Thanks for the pgindent run and committing core changes. I have to manage only one patch now :)

pgindent is giving trouble with following two comments

2213             /* Run time cost includes:
2214              * 1. Run time cost (total_cost - startup_cost) of relations being
2215              *    joined
2216              * 2. Run time cost of applying join clauses on the cross product of
2217              *    the joining relations.
2218              * 3. Run time cost of applying pushed down other clauses on the
2219              *    result of join
2220              * 4. Run time cost of applying nonpushable other clauses locally
2221              *    on the result fetched from the foreign server.
2222              */

which I want itemized with each item starting on separate line. pgindent just bunches everything together.

1159         /*
1160          * For a join relation FROM clause entry is deparsed as
1161          * ((outer relation) <join type> (inner relation) ON (joinclauses)
1162          */

where I want the second line as a separate line, but pgindent puts those two line together breaking the continuity of second line content.

How do I make pgindent respect those changes as they are?

Вложения

pg_join_pd_v10.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

07 февраля 2016 г., 06:39:58

On Sat, Feb 6, 2016 at 12:46 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Here it is rebased. Thanks for the pgindent run and committing core changes.
> I have to manage only one patch now :)
>
> pgindent is giving trouble with following two comments
>
> 2213             /* Run time cost includes:
> 2214              * 1. Run time cost (total_cost - startup_cost) of
> relations being
> 2215              *    joined
> 2216              * 2. Run time cost of applying join clauses on the cross
> product of
> 2217              *    the joining relations.
> 2218              * 3. Run time cost of applying pushed down other clauses
> on the
> 2219              *    result of join
> 2220              * 4. Run time cost of applying nonpushable other clauses
> locally
> 2221              *    on the result fetched from the foreign server.
> 2222              */
>
> which I want itemized with each item starting on separate line. pgindent
> just bunches everything together.

The thing to do here is leave a blank line between each one.  You can
also put a line of dashes before and after the comment (see many
examples elsewhere in the source tree) to force pgindent to leave that
section completely untouched, but I think that this sort of list looks
better with blank lines anyway, so I'd go for that solution.

> 1159         /*
> 1160          * For a join relation FROM clause entry is deparsed as
> 1161          * ((outer relation) <join type> (inner relation) ON
> (joinclauses)
> 1162          */
> where I want the second line as a separate line, but pgindent puts those two
> line together breaking the continuity of second line content.
>
> How do I make pgindent respect those changes as they are?

Same idea here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

07 февраля 2016 г., 15:11:46

On Sun, Feb 7, 2016 at 9:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Feb 6, 2016 at 12:46 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Here it is rebased. Thanks for the pgindent run and committing core changes.
> I have to manage only one patch now :)
>
> pgindent is giving trouble with following two comments
>
> 2213 /* Run time cost includes:
> 2214 * 1. Run time cost (total_cost - startup_cost) of
> relations being
> 2215 * joined
> 2216 * 2. Run time cost of applying join clauses on the cross
> product of
> 2217 * the joining relations.
> 2218 * 3. Run time cost of applying pushed down other clauses
> on the
> 2219 * result of join
> 2220 * 4. Run time cost of applying nonpushable other clauses
> locally
> 2221 * on the result fetched from the foreign server.
> 2222 */
>
> which I want itemized with each item starting on separate line. pgindent
> just bunches everything together.

The thing to do here is leave a blank line between each one. You can
also put a line of dashes before and after the comment (see many
examples elsewhere in the source tree) to force pgindent to leave that
section completely untouched, but I think that this sort of list looks
better with blank lines anyway, so I'd go for that solution.

> 1159 /*
> 1160 * For a join relation FROM clause entry is deparsed as
> 1161 * ((outer relation) <join type> (inner relation) ON
> (joinclauses)
> 1162 */
> where I want the second line as a separate line, but pgindent puts those two
> line together breaking the continuity of second line content.
>
> How do I make pgindent respect those changes as they are?

Same idea here.

Thanks a lot for the trick. Attached patch with the comments fixed.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

pg_join_pd.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

08 февраля 2016 г., 13:48:32

On 2016/02/05 17:50, Ashutosh Bapat wrote:

>     Btw, IIUC, I think the patch fails to adjust the targetlist of the
>     top plan created that way, to output the fdw_scan_tlist, as
>     discussed in [1] (ie, I think the attached patch is needed, which is
>     created on top of your patch pg_fdw_join_v8.patch).

> fdw_scan_tlist represents the output fetched from the foreign server and
> is not necessarily the output of ForeignScan. ForeignScan node's output
> is represented by tlist argument to.
>
> 1119     return make_foreignscan(tlist,
> 1120                             local_exprs,
> 1121                             scan_relid,
> 1122                             params_list,
> 1123                             fdw_private,
> 1124                             fdw_scan_tlist,
> 1125                             remote_exprs,
> 1126                             outer_plan);
>
> This tlist is built using build_path_tlist() for all join plans. IIUC,
> all of them output the same targetlist. We don't need to make sure that
> targetlist match as long as we are using the targetlist passed in by
> create_scan_plan(). Do you have a counter example?

Maybe my explanation was not correct, but I'm saying that the 
targertlist of the above outer_plan should be set to the fdw_scan_tlist, 
to avoid misbehavior.  Here is such an example (add() in the example is 
a user defined function that simply adds two arguments, defined by: 
create function add(integer, integer) returns integer as 
'/path/to/func', 'add' language c strict):

postgres=# create foreign table foo (a int) server myserver options 
(table_name 'foo');
postgres=# create foreign table bar (a int) server myserver options 
(table_name 'bar');
postgres=# create table tab (a int, b int);
postgres=# insert into foo select a from generate_series(1, 1000) a;
postgres=# insert into bar select a from generate_series(1, 1000) a;
postgres=# insert into tab values (1, 1);
postgres=# analyze foo;
postgres=# analyze bar;
postgres=# analyze tab;

[Terminal 1]
postgres=# begin;
BEGIN
postgres=# update tab set b = b + 1 where a = 1;
UPDATE 1

[Terminal 2]
postgres=# explain verbose select tab.* from tab, foo, bar where foo.a = 
bar.a and add(foo.a, bar.a) > 0 limit 1 for update;                QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------- Limit  (cost=100.00..107.70 rows=1 width=70)   Output: tab.a, tab.b, tab.ctid, foo.*,
bar.*  ->  LockRows  (cost=100.00..2663.48 rows=333 width=70)         Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
  ->  Nested Loop  (cost=100.00..2660.15 rows=333 width=70)               Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
            ->  Foreign Scan  (cost=100.00..2654.97 rows=333 width=56)                     Output: foo.*, bar.*
           Filter: (add(foo.a, bar.a) > 0)                     Relations: (public.foo) INNER JOIN (public.bar)
          Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, 
 
r3.a FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE 
((r2.a = r3.a)) F
OR UPDATE OF r2 FOR UPDATE OF r3                     ->  Hash Join  (cost=247.50..301.25 rows=333 width=56)
             Output: foo.*, bar.*                           Hash Cond: (foo.a = bar.a)                           Join
Filter:(add(foo.a, bar.a) > 0)                           ->  Foreign Scan on public.foo 
 
(cost=100.00..135.00 rows=1000 width=32)                                 Output: foo.*, foo.a
     Remote SQL: SELECT a FROM public.foo 
 
FOR UPDATE                           ->  Hash  (cost=135.00..135.00 rows=1000 
width=32)                                 Output: bar.*, bar.a                                 ->  Foreign Scan on
public.bar
 
(cost=100.00..135.00 rows=1000 width=32)                                       Output: bar.*, bar.a
                 Remote SQL: SELECT a FROM 
 
public.bar FOR UPDATE               ->  Materialize  (cost=0.00..1.01 rows=1 width=14)                     Output:
tab.a,tab.b, tab.ctid                     ->  Seq Scan on public.tab  (cost=0.00..1.01 
 
rows=1 width=14)                           Output: tab.a, tab.b, tab.ctid
(27 rows)

postgres=# select tab.* from tab, foo, bar where foo.a = bar.a and 
add(foo.a, bar.a) > 0 limit 1 for update;

[Terminal 1]
postgres=# commit;
COMMIT

[Terminal 2] (After the commit in Terminal 1, Terminal 2 will show the 
following.) a | b
---+---
(0 rows)

This is wrong.  (Note that since the SELECT FOR UPDATE doesn't impose 
any condition on a tuple from the local table tab, the EvalPlanQual 
recheck executed should succeed.)  The reason for that is that the 
targetlist of the local join plan is the same as for the ForeignScan, 
which outputs neither foo.a nor bar.a required as an argument of the 
function add().

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

08 февраля 2016 г., 15:11:28

Jeevan Chalke pointed out that there was another conflict. This was because of commit 392998bc58a985ea978c94c23594eb214d04c744. Here's patch rebased.

On Sun, Feb 7, 2016 at 5:41 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

On Sun, Feb 7, 2016 at 9:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Feb 6, 2016 at 12:46 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Here it is rebased. Thanks for the pgindent run and committing core changes.
> I have to manage only one patch now :)
>
> pgindent is giving trouble with following two comments
>
> 2213 /* Run time cost includes:
> 2214 * 1. Run time cost (total_cost - startup_cost) of
> relations being
> 2215 * joined
> 2216 * 2. Run time cost of applying join clauses on the cross
> product of
> 2217 * the joining relations.
> 2218 * 3. Run time cost of applying pushed down other clauses
> on the
> 2219 * result of join
> 2220 * 4. Run time cost of applying nonpushable other clauses
> locally
> 2221 * on the result fetched from the foreign server.
> 2222 */
>
> which I want itemized with each item starting on separate line. pgindent
> just bunches everything together.

The thing to do here is leave a blank line between each one. You can
also put a line of dashes before and after the comment (see many
examples elsewhere in the source tree) to force pgindent to leave that
section completely untouched, but I think that this sort of list looks
better with blank lines anyway, so I'd go for that solution.

> 1159 /*
> 1160 * For a join relation FROM clause entry is deparsed as
> 1161 * ((outer relation) <join type> (inner relation) ON
> (joinclauses)
> 1162 */
> where I want the second line as a separate line, but pgindent puts those two
> line together breaking the continuity of second line content.
>
> How do I make pgindent respect those changes as they are?

Same idea here.

Thanks a lot for the trick. Attached patch with the comments fixed.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

pg_join_pd_v10.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

08 февраля 2016 г., 15:43:26

On Mon, Feb 8, 2016 at 5:45 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> Maybe my explanation was not correct, but I'm saying that the targertlist of
> the above outer_plan should be set to the fdw_scan_tlist, to avoid
> misbehavior.

Yeah, I think you're right.  So in this hunk:

+       if (foreignrel->reloptkind == RELOPT_JOINREL)
+       {
+               /* For a join relation, get the conditions from
fdw_private structure */
+               remote_conds = fpinfo->remote_conds;
+               local_exprs = fpinfo->local_conds;
+
+               /* Build the list of columns to be fetched from the
foreign server. */
+               fdw_scan_tlist = build_tlist_to_deparse(foreignrel);
+       }

I think we should also be doing outer_plan->targetlist =
fdw_scan_tlist in this block, with a comment like "Ensure that the
outer plan produces the a tuple whose descriptor matches our scan
tuple slot.  This is safe because all scans and joins support
projection, so we never need to insert a Result node."   It would
probably be good to Assert(outer_plan != NULL) before doing the
assignment, too, just as a guard against future bugs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

08 февраля 2016 г., 16:42:05

On Mon, Feb 8, 2016 at 4:15 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

On 2016/02/05 17:50, Ashutosh Bapat wrote:

Btw, IIUC, I think the patch fails to adjust the targetlist of the
top plan created that way, to output the fdw_scan_tlist, as
discussed in [1] (ie, I think the attached patch is needed, which is
created on top of your patch pg_fdw_join_v8.patch).

fdw_scan_tlist represents the output fetched from the foreign server and
is not necessarily the output of ForeignScan. ForeignScan node's output
is represented by tlist argument to.

1119 return make_foreignscan(tlist,
1120 local_exprs,
1121 scan_relid,
1122 params_list,
1123 fdw_private,
1124 fdw_scan_tlist,
1125 remote_exprs,
1126 outer_plan);

This tlist is built using build_path_tlist() for all join plans. IIUC,
all of them output the same targetlist. We don't need to make sure that
targetlist match as long as we are using the targetlist passed in by
create_scan_plan(). Do you have a counter example?

Maybe my explanation was not correct, but I'm saying that the targertlist of the above outer_plan should be set to the fdw_scan_tlist, to avoid misbehavior. Here is such an example (add() in the example is a user defined function that simply adds two arguments, defined by: create function add(integer, integer) returns integer as '/path/to/func', 'add' language c strict):

postgres=# create foreign table foo (a int) server myserver options (table_name 'foo');
postgres=# create foreign table bar (a int) server myserver options (table_name 'bar');
postgres=# create table tab (a int, b int);
postgres=# insert into foo select a from generate_series(1, 1000) a;
postgres=# insert into bar select a from generate_series(1, 1000) a;
postgres=# insert into tab values (1, 1);
postgres=# analyze foo;
postgres=# analyze bar;
postgres=# analyze tab;

[Terminal 1]
postgres=# begin;
BEGIN
postgres=# update tab set b = b + 1 where a = 1;
UPDATE 1

[Terminal 2]
postgres=# explain verbose select tab.* from tab, foo, bar where foo.a = bar.a and add(foo.a, bar.a) > 0 limit 1 for update;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------
Limit (cost=100.00..107.70 rows=1 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
-> LockRows (cost=100.00..2663.48 rows=333 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
-> Nested Loop (cost=100.00..2660.15 rows=333 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
-> Foreign Scan (cost=100.00..2654.97 rows=333 width=56)
Output: foo.*, bar.*
Filter: (add(foo.a, bar.a) > 0)
Relations: (public.foo) INNER JOIN (public.bar)
Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, r3.a FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE ((r2.a = r3.a)) F
OR UPDATE OF r2 FOR UPDATE OF r3
-> Hash Join (cost=247.50..301.25 rows=333 width=56)
Output: foo.*, bar.*
Hash Cond: (foo.a = bar.a)
Join Filter: (add(foo.a, bar.a) > 0)
-> Foreign Scan on public.foo (cost=100.00..135.00 rows=1000 width=32)
Output: foo.*, foo.a
Remote SQL: SELECT a FROM public.foo FOR UPDATE
-> Hash (cost=135.00..135.00 rows=1000 width=32)
Output: bar.*, bar.a
-> Foreign Scan on public.bar (cost=100.00..135.00 rows=1000 width=32)
Output: bar.*, bar.a
Remote SQL: SELECT a FROM public.bar FOR UPDATE
-> Materialize (cost=0.00..1.01 rows=1 width=14)
Output: tab.a, tab.b, tab.ctid
-> Seq Scan on public.tab (cost=0.00..1.01 rows=1 width=14)
Output: tab.a, tab.b, tab.ctid
(27 rows)

postgres=# select tab.* from tab, foo, bar where foo.a = bar.a and add(foo.a, bar.a) > 0 limit 1 for update;

[Terminal 1]
postgres=# commit;
COMMIT

[Terminal 2] (After the commit in Terminal 1, Terminal 2 will show the following.)
a | b
---+---
(0 rows)

This is wrong. (Note that since the SELECT FOR UPDATE doesn't impose any condition on a tuple from the local table tab, the EvalPlanQual recheck executed should succeed.) The reason for that is that the targetlist of the local join plan is the same as for the ForeignScan, which outputs neither foo.a nor bar.a required as an argument of the function add().

I see what you are trying to say now. In ExecScan, ExecScanFetch will execute the outer plan for EvalPlanQual check and then at

208 if (!qual || ExecQual(qual, econtext, false))

it will try to evaluate the local conditions, where it needs the foo.a and bar.a which are not part of the projected output for ForeignScan and the outer plan.

But then aren't the local conditions being evaluated twice, once by the outer plan and then again by ExecScan? Is this OK? What happens when the local conditions have side effects? We should probably delete them from the outer_plan's quals.

The patch attached fixes the targetlist as per mail from Robert and the quals as explained above.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

pg_join_pd_v11.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Jeevan Chalke

Дата:

09 февраля 2016 г., 14:30:26

Hi,

I have reviewed the patch and it looks good to me.
make/make install/make check is fine (when done without -Wall -Werror).

Here are few comments:
1.
With -Wall -Werror, I see couple of warnings:

postgres_fdw.c: In function ‘estimate_path_cost_size’:
postgres_fdw.c:2248:13: error: ‘run_cost’ may be used uninitialized in this function [-Werror=uninitialized]
postgres_fdw.c: In function ‘conversion_error_callback’:
postgres_fdw.c:3832:6: error: ‘attname’ may be used uninitialized in this function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [postgres_fdw.o] Error 1

2. Typo:
scna_clauses => scan_clauses

3. Does this new addition requires documentation?

I did not see any issues with my testing. Code changes are good too.
Patch has very good test-cases testing everything required. Nice work.

Thanks.

On Mon, Feb 8, 2016 at 7:11 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

On Mon, Feb 8, 2016 at 4:15 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/02/05 17:50, Ashutosh Bapat wrote:

Btw, IIUC, I think the patch fails to adjust the targetlist of the
top plan created that way, to output the fdw_scan_tlist, as
discussed in [1] (ie, I think the attached patch is needed, which is
created on top of your patch pg_fdw_join_v8.patch).

fdw_scan_tlist represents the output fetched from the foreign server and
is not necessarily the output of ForeignScan. ForeignScan node's output
is represented by tlist argument to.

1119 return make_foreignscan(tlist,
1120 local_exprs,
1121 scan_relid,
1122 params_list,
1123 fdw_private,
1124 fdw_scan_tlist,
1125 remote_exprs,
1126 outer_plan);

This tlist is built using build_path_tlist() for all join plans. IIUC,
all of them output the same targetlist. We don't need to make sure that
targetlist match as long as we are using the targetlist passed in by
create_scan_plan(). Do you have a counter example?

Maybe my explanation was not correct, but I'm saying that the targertlist of the above outer_plan should be set to the fdw_scan_tlist, to avoid misbehavior. Here is such an example (add() in the example is a user defined function that simply adds two arguments, defined by: create function add(integer, integer) returns integer as '/path/to/func', 'add' language c strict):

postgres=# create foreign table foo (a int) server myserver options (table_name 'foo');
postgres=# create foreign table bar (a int) server myserver options (table_name 'bar');
postgres=# create table tab (a int, b int);
postgres=# insert into foo select a from generate_series(1, 1000) a;
postgres=# insert into bar select a from generate_series(1, 1000) a;
postgres=# insert into tab values (1, 1);
postgres=# analyze foo;
postgres=# analyze bar;
postgres=# analyze tab;

[Terminal 1]
postgres=# begin;
BEGIN
postgres=# update tab set b = b + 1 where a = 1;
UPDATE 1

[Terminal 2]
postgres=# explain verbose select tab.* from tab, foo, bar where foo.a = bar.a and add(foo.a, bar.a) > 0 limit 1 for update;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------
Limit (cost=100.00..107.70 rows=1 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
-> LockRows (cost=100.00..2663.48 rows=333 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
-> Nested Loop (cost=100.00..2660.15 rows=333 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
-> Foreign Scan (cost=100.00..2654.97 rows=333 width=56)
Output: foo.*, bar.*
Filter: (add(foo.a, bar.a) > 0)
Relations: (public.foo) INNER JOIN (public.bar)
Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, r3.a FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE ((r2.a = r3.a)) F
OR UPDATE OF r2 FOR UPDATE OF r3
-> Hash Join (cost=247.50..301.25 rows=333 width=56)
Output: foo.*, bar.*
Hash Cond: (foo.a = bar.a)
Join Filter: (add(foo.a, bar.a) > 0)
-> Foreign Scan on public.foo (cost=100.00..135.00 rows=1000 width=32)
Output: foo.*, foo.a
Remote SQL: SELECT a FROM public.foo FOR UPDATE
-> Hash (cost=135.00..135.00 rows=1000 width=32)
Output: bar.*, bar.a
-> Foreign Scan on public.bar (cost=100.00..135.00 rows=1000 width=32)
Output: bar.*, bar.a
Remote SQL: SELECT a FROM public.bar FOR UPDATE
-> Materialize (cost=0.00..1.01 rows=1 width=14)
Output: tab.a, tab.b, tab.ctid
-> Seq Scan on public.tab (cost=0.00..1.01 rows=1 width=14)
Output: tab.a, tab.b, tab.ctid
(27 rows)

postgres=# select tab.* from tab, foo, bar where foo.a = bar.a and add(foo.a, bar.a) > 0 limit 1 for update;

[Terminal 1]
postgres=# commit;
COMMIT

[Terminal 2] (After the commit in Terminal 1, Terminal 2 will show the following.)
a | b
---+---
(0 rows)

This is wrong. (Note that since the SELECT FOR UPDATE doesn't impose any condition on a tuple from the local table tab, the EvalPlanQual recheck executed should succeed.) The reason for that is that the targetlist of the local join plan is the same as for the ForeignScan, which outputs neither foo.a nor bar.a required as an argument of the function add().

I see what you are trying to say now. In ExecScan, ExecScanFetch will execute the outer plan for EvalPlanQual check and then at
208 if (!qual || ExecQual(qual, econtext, false))
it will try to evaluate the local conditions, where it needs the foo.a and bar.a which are not part of the projected output for ForeignScan and the outer plan.

But then aren't the local conditions being evaluated twice, once by the outer plan and then again by ExecScan? Is this OK? What happens when the local conditions have side effects? We should probably delete them from the outer_plan's quals.

The patch attached fixes the targetlist as per mail from Robert and the quals as explained above.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

09 февраля 2016 г., 16:39:43

Thanks Jeevan for your review and comments. PFA the patch which fixes those.

On Tue, Feb 9, 2016 at 5:00 PM, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:

Hi,

I have reviewed the patch and it looks good to me.
make/make install/make check is fine (when done without -Wall -Werror).

Here are few comments:
1.
With -Wall -Werror, I see couple of warnings:

postgres_fdw.c: In function ‘estimate_path_cost_size’:
postgres_fdw.c:2248:13: error: ‘run_cost’ may be used uninitialized in this function [-Werror=uninitialized]

Done. run_cost was declared in a block enclosing the one where it was used. So moved run_cost and initialized it. The initialized value is never used.

postgres_fdw.c: In function ‘conversion_error_callback’:
postgres_fdw.c:3832:6: error: ‘attname’ may be used uninitialized in this function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [postgres_fdw.o] Error 1

Thanks for catching it. Fixed as well.

2. Typo:
scna_clauses => scan_clauses

Done.

3. Does this new addition requires documentation?

The patch pg_fdw_doc.patch adds a paragraph about join pushdown in postgres_fdw documentation.

I did not see any issues with my testing. Code changes are good too.
Patch has very good test-cases testing everything required. Nice work.

Thanks.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

09 февраля 2016 г., 22:16:28

On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Thanks Jeevan for your review and comments. PFA the patch which fixes those.

Committed with a couple more small adjustments.

Woohoo, finally!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Jeff Janes

Дата:

10 февраля 2016 г., 09:33:28

On Tue, Feb 9, 2016 at 11:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> Thanks Jeevan for your review and comments. PFA the patch which fixes those.
>
> Committed with a couple more small adjustments.

I'm getting a compiler warning which I think is coming from this commit.

postgres_fdw.c: In function 'fetch_more_data':
postgres_fdw.c:2526:17: warning: unused variable 'fsplan' [-Wunused-variable]   ForeignScan *fsplan = (ForeignScan *)
node->ss.ps.plan;

Thanks,

Jeff

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

10 февраля 2016 г., 09:33:51

Yay, finally!

Thanks.

On Wed, Feb 10, 2016 at 12:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Thanks Jeevan for your review and comments. PFA the patch which fixes those.

Committed with a couple more small adjustments.

Woohoo, finally!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

10 февраля 2016 г., 15:12:41

Here's patch to remove this declaration. The Assert next probably prevents the warning for build with asserts. But both those lines are not needed.

On Wed, Feb 10, 2016 at 12:01 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tue, Feb 9, 2016 at 11:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> Thanks Jeevan for your review and comments. PFA the patch which fixes those.
>
> Committed with a couple more small adjustments.

I'm getting a compiler warning which I think is coming from this commit.

postgres_fdw.c: In function 'fetch_more_data':
postgres_fdw.c:2526:17: warning: unused variable 'fsplan' [-Wunused-variable]
ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;

Thanks,

Jeff

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

extra_fsplan.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

10 февраля 2016 г., 16:18:44

On Wed, Feb 10, 2016 at 7:12 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Here's patch to remove this declaration. The Assert next probably prevents
> the warning for build with asserts. But both those lines are not needed.

I like the Assert(), so I kept that and ditched the variable.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

15 февраля 2016 г., 10:44:00

On 2016/02/10 4:16, Robert Haas wrote:
> On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> Thanks Jeevan for your review and comments. PFA the patch which fixes those.

> Committed with a couple more small adjustments.

Thanks for working on this, Robert, Ashutosh, and everyone involved!

I happened to notice that this code in foreign_join_ok():

     switch (jointype)
     {
         case JOIN_INNER:
             fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                                fpinfo_i->remote_conds);
             fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                                fpinfo_o->remote_conds);
             break;

         case JOIN_LEFT:
             fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                               fpinfo_i->remote_conds);
             fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                                fpinfo_o->remote_conds);
             break;

         case JOIN_RIGHT:
             fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                               fpinfo_o->remote_conds);
             fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                                fpinfo_i->remote_conds);
             break;

         case JOIN_FULL:
             fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                               fpinfo_i->remote_conds);
             fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                               fpinfo_o->remote_conds);
             break;

         default:
             /* Should not happen, we have just check this above */
             elog(ERROR, "unsupported join type %d", jointype);
     }

would break the list fpinfo_i->remote_conds in the case of INNER JOIN or
FULL JOIN.  You can see the list breakage from e.g., the following
queries on an Assert-enabled build:

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# create server myserver foreign data wrapper postgres_fdw
options (dbname 'mydatabase');
CREATE SERVER
postgres=# create user mapping for current_user server myserver;
CREATE USER MAPPING
postgres=# create foreign table foo (a int) server myserver options
(table_name 'foo');
CREATE FOREIGN TABLE
postgres=# create foreign table bar (a int) server myserver options
(table_name 'bar');
CREATE FOREIGN TABLE
postgres=# create foreign table baz (a int) server myserver options
(table_name 'baz');
CREATE FOREIGN TABLE
postgres=# select * from foo, bar, baz where foo.a = bar.a and bar.a =
baz.a and foo.a < 10 and bar.a < 10 and baz.a < 10;

Attached is a patch to avoid the breakage.

Best regards,
Etsuro Fujita

Вложения

foreign_join_ok.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

15 февраля 2016 г., 15:33:43

Thanks Fujita-san for bug report and the fix. Sorry for bug.

Here's patch with better way to fix it. I think while concatenating the lists, we need to copy the lists being appended and in all the cases. If we don't copy, a change in those lists can cause changes in the upward linkages and thus lists of any higher level joins.

On Mon, Feb 15, 2016 at 1:10 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

On 2016/02/10 4:16, Robert Haas wrote:
On Tue, Feb 9, 2016 at 8:39 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
Thanks Jeevan for your review and comments. PFA the patch which fixes those.

Committed with a couple more small adjustments.

Thanks for working on this, Robert, Ashutosh, and everyone involved!

I happened to notice that this code in foreign_join_ok():

switch (jointype)
{
case JOIN_INNER:
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
fpinfo_i->remote_conds);
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
fpinfo_o->remote_conds);
break;

case JOIN_LEFT:
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
fpinfo_i->remote_conds);
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
fpinfo_o->remote_conds);
break;

case JOIN_RIGHT:
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
fpinfo_o->remote_conds);
fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
fpinfo_i->remote_conds);
break;

case JOIN_FULL:
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
fpinfo_i->remote_conds);
fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
fpinfo_o->remote_conds);
break;

default:
/* Should not happen, we have just check this above */
elog(ERROR, "unsupported join type %d", jointype);
}

would break the list fpinfo_i->remote_conds in the case of INNER JOIN or FULL JOIN. You can see the list breakage from e.g., the following queries on an Assert-enabled build:

postgres=# create extension postgres_fdw;
CREATE EXTENSION
postgres=# create server myserver foreign data wrapper postgres_fdw options (dbname 'mydatabase');
CREATE SERVER
postgres=# create user mapping for current_user server myserver;
CREATE USER MAPPING
postgres=# create foreign table foo (a int) server myserver options (table_name 'foo');
CREATE FOREIGN TABLE
postgres=# create foreign table bar (a int) server myserver options (table_name 'bar');
CREATE FOREIGN TABLE
postgres=# create foreign table baz (a int) server myserver options (table_name 'baz');
CREATE FOREIGN TABLE
postgres=# select * from foo, bar, baz where foo.a = bar.a and bar.a = baz.a and foo.a < 10 and bar.a < 10 and baz.a < 10;

Attached is a patch to avoid the breakage.

Best regards,
Etsuro Fujita

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

foreign_join_ok_v2.patch

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

16 февраля 2016 г., 09:22:46

<div dir="ltr">During join planning, the planner tries multiple combinations of joining relations, thus the same base
orjoin relation can be part of multiple of combination. Hence remote_conds or joinclauses will get linked multiple
timesas they are bidirectional lists, thus breaking linkages of previous join combinations tried. E.g. while planning A
joinB join C join D planner will come up with combinations like A(B(CD)) or (AB)(CD) or ((AB)C)D etc. and remote_conds
fromA will first be linked into A(B(CD)), then AB breaking the first linkages.<br /></div><div class="gmail_extra"><br
/><divclass="gmail_quote">On Tue, Feb 16, 2016 at 11:36 AM, Etsuro Fujita <span dir="ltr"><<a
href="mailto:fujita.etsuro@lab.ntt.co.jp"target="_blank">fujita.etsuro@lab.ntt.co.jp</a>></span> wrote:<br
/><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">On2016/02/15 21:33, Ashutosh Bapat wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px#ccc solid;padding-left:1ex"> Here's patch with better way to fix it. I think while concatenating
the<br/> lists, we need to copy the lists being appended and in all the cases. If<br /> we don't copy, a change in
thoselists can cause changes in the upward<br /> linkages and thus lists of any higher level joins.<br
/></blockquote><br/></span> Maybe I'm missing something, but I don't understand why such a change in those lists
happens. Could you explain about that in more detail?<br /><br /> Best regards,<br /> Etsuro Fujita<br /><br /><br
/></blockquote></div><br/><br clear="all" /><br />-- <br /><div class="gmail_signature"><div dir="ltr">Best Wishes,<br
/>AshutoshBapat<br />EnterpriseDB Corporation<br />The Postgres Database Company<br /></div></div></div>

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

16 февраля 2016 г., 09:23:19

On 2016/02/15 21:33, Ashutosh Bapat wrote:
> Here's patch with better way to fix it. I think while concatenating the
> lists, we need to copy the lists being appended and in all the cases. If
> we don't copy, a change in those lists can cause changes in the upward
> linkages and thus lists of any higher level joins.

Maybe I'm missing something, but I don't understand why such a change in 
those lists happens.  Could you explain about that in more detail?

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

16 февраля 2016 г., 09:58:56

On 2016/02/16 15:22, Ashutosh Bapat wrote:
> During join planning, the planner tries multiple combinations of joining
> relations, thus the same base or join relation can be part of multiple
> of combination. Hence remote_conds or joinclauses will get linked
> multiple times as they are bidirectional lists, thus breaking linkages
> of previous join combinations tried. E.g. while planning A join B join C
> join D planner will come up with combinations like A(B(CD)) or (AB)(CD)
> or ((AB)C)D etc. and remote_conds from A will first be linked into
> A(B(CD)), then AB breaking the first linkages.

Exactly, but I don't think that that needs to be considered because we 
have this at the beginning of postgresGetGForeignJoinPaths:
    /*     * Skip if this join combination has been considered already.     */    if (joinrel->fdw_private)
return;

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

16 февраля 2016 г., 10:02:33

On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

On 2016/02/16 15:22, Ashutosh Bapat wrote:
During join planning, the planner tries multiple combinations of joining
relations, thus the same base or join relation can be part of multiple
of combination. Hence remote_conds or joinclauses will get linked
multiple times as they are bidirectional lists, thus breaking linkages
of previous join combinations tried. E.g. while planning A join B join C
join D planner will come up with combinations like A(B(CD)) or (AB)(CD)
or ((AB)C)D etc. and remote_conds from A will first be linked into
A(B(CD)), then AB breaking the first linkages.

Exactly, but I don't think that that needs to be considered because we have this at the beginning of postgresGetGForeignJoinPaths:

/*
* Skip if this join combination has been considered already.
*/
if (joinrel->fdw_private)
return;

There will be different joinrels for A(B(CD)) and (AB) where A's remote_conds need to be pulled up. The check you have mentioned above only protects us from adding paths multiple times to (AB) when we encounter it for (AB)(CD) and ((AB)C)D.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

16 февраля 2016 г., 10:43:25

On 2016/02/16 16:02, Ashutosh Bapat wrote:
> On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>     On 2016/02/16 15:22, Ashutosh Bapat wrote:

>         During join planning, the planner tries multiple combinations of
>         joining
>         relations, thus the same base or join relation can be part of
>         multiple
>         of combination. Hence remote_conds or joinclauses will get linked
>         multiple times as they are bidirectional lists, thus breaking
>         linkages
>         of previous join combinations tried. E.g. while planning A join
>         B join C
>         join D planner will come up with combinations like A(B(CD)) or
>         (AB)(CD)
>         or ((AB)C)D etc. and remote_conds from A will first be linked into
>         A(B(CD)), then AB breaking the first linkages.

>     Exactly, but I don't think that that needs to be considered because
>     we have this at the beginning of postgresGetGForeignJoinPaths:
>
>          /*
>           * Skip if this join combination has been considered already.
>           */
>          if (joinrel->fdw_private)
>              return;

> There will be different joinrels for A(B(CD)) and (AB) where A's
> remote_conds need to be pulled up.

Agreed.

> The check you have mentioned above
> only protects us from adding paths multiple times to (AB) when we
> encounter it for (AB)(CD) and ((AB)C)D.

Sorry, I don't understand this fully.

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Etsuro Fujita

Дата:

18 февраля 2016 г., 13:22:17

On 2016/02/16 16:40, Etsuro Fujita wrote:
> On 2016/02/16 16:02, Ashutosh Bapat wrote:
>> On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita
>> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>>     On 2016/02/16 15:22, Ashutosh Bapat wrote:

>>         During join planning, the planner tries multiple combinations of
>>         joining
>>         relations, thus the same base or join relation can be part of
>>         multiple
>>         of combination. Hence remote_conds or joinclauses will get linked
>>         multiple times as they are bidirectional lists, thus breaking
>>         linkages
>>         of previous join combinations tried. E.g. while planning A join
>>         B join C
>>         join D planner will come up with combinations like A(B(CD)) or
>>         (AB)(CD)
>>         or ((AB)C)D etc. and remote_conds from A will first be linked
>> into
>>         A(B(CD)), then AB breaking the first linkages.

>>     Exactly, but I don't think that that needs to be considered because
>>     we have this at the beginning of postgresGetGForeignJoinPaths:
>>
>>          /*
>>           * Skip if this join combination has been considered already.
>>           */
>>          if (joinrel->fdw_private)
>>              return;

>> There will be different joinrels for A(B(CD)) and (AB) where A's
>> remote_conds need to be pulled up.

> Agreed.

>> The check you have mentioned above
>> only protects us from adding paths multiple times to (AB) when we
>> encounter it for (AB)(CD) and ((AB)C)D.

> Sorry, I don't understand this fully.

Another thing I don't really understand is why list_copy is needed in 
the second list_concat for the case of INNER/FULL JOIN or in both 
list_concats for the case of LEFT/RIGHT JOIN, in your patch.  Since 
list_concat is nondestructive of its second argument, I don't think 
list_copy is needed in any such list_concat.  Maybe I'm missing 
something, though.

Best regards,
Etsuro Fujita

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Ashutosh Bapat

Дата:

18 февраля 2016 г., 14:22:38

On Thu, Feb 18, 2016 at 3:48 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

On 2016/02/16 16:40, Etsuro Fujita wrote:
On 2016/02/16 16:02, Ashutosh Bapat wrote:
On Tue, Feb 16, 2016 at 12:26 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

On 2016/02/16 15:22, Ashutosh Bapat wrote:

During join planning, the planner tries multiple combinations of
joining
relations, thus the same base or join relation can be part of
multiple
of combination. Hence remote_conds or joinclauses will get linked
multiple times as they are bidirectional lists, thus breaking
linkages
of previous join combinations tried. E.g. while planning A join
B join C
join D planner will come up with combinations like A(B(CD)) or
(AB)(CD)
or ((AB)C)D etc. and remote_conds from A will first be linked
into
A(B(CD)), then AB breaking the first linkages.

Exactly, but I don't think that that needs to be considered because
we have this at the beginning of postgresGetGForeignJoinPaths:

/*
* Skip if this join combination has been considered already.
*/
if (joinrel->fdw_private)
return;

There will be different joinrels for A(B(CD)) and (AB) where A's
remote_conds need to be pulled up.

Agreed.

The check you have mentioned above
only protects us from adding paths multiple times to (AB) when we
encounter it for (AB)(CD) and ((AB)C)D.

Sorry, I don't understand this fully.

Another thing I don't really understand is why list_copy is needed in the second list_concat for the case of INNER/FULL JOIN or in both list_concats for the case of LEFT/RIGHT JOIN, in your patch. Since list_concat is nondestructive of its second argument, I don't think list_copy is needed in any such list_concat. Maybe I'm missing something, though.

If the list in the joining relation changes (may be because we appended parameterized conditions), we would be breaking links on all the upper relations in the join tree in an unpredictable manner. The problem may not show up now, but it's an avenue for unrecognizable bugs. So, it's safer to copy the lists in the state that we want them.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

От

Robert Haas

Дата:

21 февраля 2016 г., 11:52:35

On Thu, Feb 18, 2016 at 4:52 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> If the list in the joining relation changes (may be because we appended
> parameterized conditions), we would be breaking links on all the upper
> relations in the join tree in an unpredictable manner. The problem may not
> show up now, but it's an avenue for unrecognizable bugs. So, it's safer to
> copy the lists in the state that we want them.

Agreed.  The lists figure to be short, so copying them shouldn't be
very expensive, and it's better to do that in all cases than to leave
shared-substructure hazards around for future patch authors to worry
about.

Committed Ashutosh's version of the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения