Обсуждение: Join push-down support for foreign tables

Поиск

Список

Период

Сортировка

Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

03 сентября 2014 г., 09:16:39

Hi all,

In 2011 I proposed join push-down support for foreign tables, which
would improve performance of queries which contain join between
foreign tables in one server, but it has not finished before time-up.
This performance improvement would widen application range of foreign
tables, so I'd like to tackle the work again.

The descriptions below are based on previous discussions and additional studies.

Background
==========

At the moment FDWs can't handle join, so every join are processed on
local side even if the source relations are on the same server.  It's
apparently inefficient to fetch possible rows from remote and join
them on local and waste some of them since join condition doesn't
match.  If FDW (typically SQL-based FDWs like postgres_fdw) can get
control of JOIN operation, it would optimize queries for source tables
into a join query and avoid transfer of un-match rows.

With this improvement, most of joins in usual use, especially joins
between large foreign tables which don't match much, would become
remarkablly fast, for the reasons below.

a) less data transfer
Especially for inner joins, result of join is usually much smaller
than source tables.  If the original target list doesn't contain join
keys, FDW might be able to omit from the SELECT list of remote queries
because they are only necessary on remote side.

b) more optimization on remote side
Join query would provide remote data source more optimization chances,
such as using index.

Changes expected
================

In the past development trial, these changes seem necessary at least.

(1) Add server oid field to RelOptInfo
This attribute is set only when the RelOptInfo is a joinrel, and all
underlying base relations are foreign tables and they have same server
oid.  This field is set through join consideration from lower join
level to high (many tables) level, IOW from the bottom to the top.  If
all base relations joined in a query are on same server, top
RelOptInfo which represents final output has valid server oid.  In
such case, whole query could be pushed down to the server and user can
get most efficient result.

New helper function GetFdwRoutineByServerId(Oid serverid) which
returns FdwRoutine of given server oid would be handy.

(2) Add new path node for foreign join
New path node ForeignJoinPath, which inherits JoinPath like other join
path nodes, represents a join between ForeignPath or ForeignJoinPath.
ForeignJoinPath has fdw_private list to hold FDW-specific information
through the path consideration phase.  This is similar to fdw_private
of ForeignPath path node.

This node cares only type of join such as INNER JOIN and LEFT OUTER
JOIN, but doesn't care how to do it.  IOW foreign join is not special
case of existing join nodes such as nested loops, merge join and hash
join.  FDW can implement a foreign join in arbitrary way, for
instance, file_fdw can have already-joined file for particular
combination for optimization, and postgres_fdw can generate a SELECT
query which contains JOIN clause and avoid essentially unnecessary
data transfer.

At the moment I'm not sure whether we should support SEMI/ANTI join in
the context of foreign join.  It would require postgres_fdw (or other
SQL-based FDWs) to generate query with subquery connected with IN/NOT
IN clause, but it seems too far to head to in the first version.

We (and especially FDW authors) need to note that join push-down is
not the best way in some situations.  In most cases OUTER JOIN
populates data on remote side more than current FDWs transfer,
especially for FULL OUTER JOIN and
CROSS JOIN (cartesian product).

(3) Add new plan node for foreign join
New plan node ForeignJoin, which inherits Join  like other join plan
nodes.  This node is similar to other join plan nodes such as
NestLoop, MergeJoin and HashJoin, but it delegates actual processing
to FDW associated to the server.

This means that new plan state node for ForeignJoin, say
ForeignJoinState, is also needed.

(4) Add new FDW API functions
Adding Join push-down support requires some functions to be added to
FdwRoutine to give control to FDWs.

a) GetForeignJoinPaths()
This allows FDWs to provide alternative join paths for a join
RelOptInfo.  This is called from add_paths_to_joinrel() after
considering other join possibilities, and FDW should call add_path()
for each possible foreign join path.  Foreign join paths are built
similarly to existing join paths, in a bottom-up manner.

FDWs may push ordered or unordered paths here, but combination of sort
keys would bloat up easily if FDW has no information about efficient
patterns such as remote indexes.  FDW should not add too many paths to
prevent exponential overhead of join combination.

b) GetForeignJoinPlan()
This creates ForeignJoin plan node from ForeignJoinPath and other
planner infromation.

c) Executor functions for ForeignJoin plan node
A set of funcitons for executing ForeignJoin plan node is also needed.
Begin/ReScan/Iterate/End are basic operations of a plan node, so we
need to provide them for ForeignJoin node.

Issues
======

(1) Separate cost estimation phases?
For existing join paths, planner estimates their costs in two phaeses.
In the first phase initial_cost_foo(), here foo is one of
nestloop/mergejoin/hashjoin, produces lower-bound estimates for
elimination.  The second phase is done for only promising paths which
passed add_path_precheck(), by final_cost_foo() for cost and result
size.  I'm not sure that we need to follow this manner, since FDWs
would be able to estimate final cost/size with their own methods.

(2) How to reflect cost of transfer
Cost of transfer is dominant in foreign table operations, including
foreign scans.  It would be nice to have some mechanism to reflect
actual time of transfer to the cost estimation.  An idea is to have a
FDW option which represents cost factor of transfer, say
transfer_cost.

(3) SELECT-with-Join SQL generation in postgres_fdw
Probably Postgres-XC's shipping code would help us for implementing
deparse JOIN SQL, but I've not studied it fully, I'll continue the
study.

(4) criteria for push-down
It is assumed that FDWs can push joins down to remote when all foreign
tables are in same server.  IMO a SERVER objects represents a logical
data source.  For instance database for postgres_fdw and other
connection-based FDWs, and disk volumes (or directory?) for file_fdw.
Is this reasonable assumption?

Perhaps more issues would come out later, but I'd like to get comments
about the design.

(5) Terminology
I used "foreign join" as a process which joins foreign tables on
*remote* side, but is this enough intuitive?  Another idea is using
"remote join", is this more appropriate for this kind of process?  I
hesitate to use "remote join" because it implies client-server FDWs,
but foreign join is not limited to such FDWs, e.g. file_fdw can have
extra file which is already joined files accessed via foreign tables.


-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Robert Haas

Дата:

04 сентября 2014 г., 12:37:17

On Wed, Sep 3, 2014 at 5:16 AM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:
> In 2011 I proposed join push-down support for foreign tables, which
> would improve performance of queries which contain join between
> foreign tables in one server, but it has not finished before time-up.
> This performance improvement would widen application range of foreign
> tables, so I'd like to tackle the work again.
>
> The descriptions below are based on previous discussions and additional studies.

Hanada-san, it is fantastic to see you working on this again.

I think your proposal sounds promising and it is along the lines of
what I have considered in the past.

> (1) Separate cost estimation phases?
> For existing join paths, planner estimates their costs in two phaeses.
> In the first phase initial_cost_foo(), here foo is one of
> nestloop/mergejoin/hashjoin, produces lower-bound estimates for
> elimination.  The second phase is done for only promising paths which
> passed add_path_precheck(), by final_cost_foo() for cost and result
> size.  I'm not sure that we need to follow this manner, since FDWs
> would be able to estimate final cost/size with their own methods.

The main problem I see here is that accurate costing may require a
round-trip to the remote server.  If there is only one path that is
probably OK; the cost of asking the question will usually be more than
paid for by hearing that the pushed-down join clobbers the other
possible methods of executing the query.  But if there are many paths,
for example because there are multiple sets of useful pathkeys, it
might start to get a bit expensive.

Probably both the initial cost and final cost calculations should be
delegated to the FDW, but maybe within postgres_fdw, the initial cost
should do only the work that can be done without contacting the remote
server; then, let the final cost step do that if appropriate.  But I'm
not entirely sure what is best here.

> (2) How to reflect cost of transfer
> Cost of transfer is dominant in foreign table operations, including
> foreign scans.  It would be nice to have some mechanism to reflect
> actual time of transfer to the cost estimation.  An idea is to have a
> FDW option which represents cost factor of transfer, say
> transfer_cost.

That would be reasonable.  I assume users would normally wish to
specify this per-server, and the default should be something
reasonable for a LAN.

> (4) criteria for push-down
> It is assumed that FDWs can push joins down to remote when all foreign
> tables are in same server.  IMO a SERVER objects represents a logical
> data source.  For instance database for postgres_fdw and other
> connection-based FDWs, and disk volumes (or directory?) for file_fdw.
> Is this reasonable assumption?

I think it's probably good to give an FDW the option of producing a
ForeignJoinPath for any join against a ForeignPath *or
ForeignJoinPath* for the same FDW.  It's perhaps unlikely that an FDW
can perform a join efficiently between two data sources with different
server definitions, but why not give it the option?  It should be
pretty fast for the FDW to realize, oh, the server OIDs don't match -
and at that point it can exit without doing anything further if that
seems desirable.  And there might be some kinds of data sources where
cross-server joins actually can be executed quickly (e.g. when the
underlying data is just in two files in different places on the local
machine).

> (5) Terminology
> I used "foreign join" as a process which joins foreign tables on
> *remote* side, but is this enough intuitive?  Another idea is using
> "remote join", is this more appropriate for this kind of process?  I
> hesitate to use "remote join" because it implies client-server FDWs,
> but foreign join is not limited to such FDWs, e.g. file_fdw can have
> extra file which is already joined files accessed via foreign tables.

Foreign join is perfect.

As I alluded to above, it's pretty important to make sure that this
works with large join trees; that is, if I join four foreign tables, I
don't want it to push down a join between two of the tables and a join
between the other two tables and then join the results of those joins
locally.  Instead, I want to push the entire join tree to the foreign
server and execute the whole thing there.  Some care may be needed in
designing the hooks to make sure this works as desired.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Join push-down support for foreign tables

От

Bruce Momjian

Дата:

04 сентября 2014 г., 14:58:53

On Thu, Sep  4, 2014 at 08:37:08AM -0400, Robert Haas wrote:
> The main problem I see here is that accurate costing may require a
> round-trip to the remote server.  If there is only one path that is
> probably OK; the cost of asking the question will usually be more than
> paid for by hearing that the pushed-down join clobbers the other
> possible methods of executing the query.  But if there are many paths,
> for example because there are multiple sets of useful pathkeys, it
> might start to get a bit expensive.
> 
> Probably both the initial cost and final cost calculations should be
> delegated to the FDW, but maybe within postgres_fdw, the initial cost
> should do only the work that can be done without contacting the remote
> server; then, let the final cost step do that if appropriate.  But I'm
> not entirely sure what is best here.

I am thinking eventually we will need to cache the foreign server
statistics on the local server.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Re: Join push-down support for foreign tables

От

Atri Sharma

Дата:

04 сентября 2014 г., 15:15:43

On Thursday, September 4, 2014, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Sep 4, 2014 at 08:37:08AM -0400, Robert Haas wrote:
> The main problem I see here is that accurate costing may require a
> round-trip to the remote server. If there is only one path that is
> probably OK; the cost of asking the question will usually be more than
> paid for by hearing that the pushed-down join clobbers the other
> possible methods of executing the query. But if there are many paths,
> for example because there are multiple sets of useful pathkeys, it
> might start to get a bit expensive.
>
> Probably both the initial cost and final cost calculations should be
> delegated to the FDW, but maybe within postgres_fdw, the initial cost
> should do only the work that can be done without contacting the remote
> server; then, let the final cost step do that if appropriate. But I'm
> not entirely sure what is best here.

I am thinking eventually we will need to cache the foreign server
statistics on the local server.

Wouldn't that lead to issues where the statistics get outdated and we have to anyways query the foreign server before planning any joins? Or are you thinking of dropping the foreign table statistics once the foreign join is complete?

Regards,

Atri

Regards,

Atri

l'apprenant

Re: Join push-down support for foreign tables

От

Bruce Momjian

Дата:

04 сентября 2014 г., 15:56:47

On Thu, Sep  4, 2014 at 08:41:43PM +0530, Atri Sharma wrote:
> 
> 
> On Thursday, September 4, 2014, Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Thu, Sep  4, 2014 at 08:37:08AM -0400, Robert Haas wrote:
>     > The main problem I see here is that accurate costing may require a
>     > round-trip to the remote server.  If there is only one path that is
>     > probably OK; the cost of asking the question will usually be more than
>     > paid for by hearing that the pushed-down join clobbers the other
>     > possible methods of executing the query.  But if there are many paths,
>     > for example because there are multiple sets of useful pathkeys, it
>     > might start to get a bit expensive.
>     >
>     > Probably both the initial cost and final cost calculations should be
>     > delegated to the FDW, but maybe within postgres_fdw, the initial cost
>     > should do only the work that can be done without contacting the remote
>     > server; then, let the final cost step do that if appropriate.  But I'm
>     > not entirely sure what is best here.
> 
>     I am thinking eventually we will need to cache the foreign server
>     statistics on the local server.
> 
> 
> 
> 
> Wouldn't that lead to issues where the statistics get outdated and we have to
> anyways query the foreign server before planning any joins? Or are you thinking
> of dropping the foreign table statistics once the foreign join is complete?

I am thinking we would eventually have to cache the statistics, then get
some kind of invalidation message from the foreign server.  I am also
thinking that cache would have to be global across all backends, I guess
similar to our invalidation cache.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Re: Join push-down support for foreign tables

От

Atri Sharma

Дата:

04 сентября 2014 г., 16:01:31

On Thu, Sep 4, 2014 at 9:26 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Sep 4, 2014 at 08:41:43PM +0530, Atri Sharma wrote:
>
>
> On Thursday, September 4, 2014, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Thu, Sep 4, 2014 at 08:37:08AM -0400, Robert Haas wrote:
> > The main problem I see here is that accurate costing may require a
> > round-trip to the remote server. If there is only one path that is
> > probably OK; the cost of asking the question will usually be more than
> > paid for by hearing that the pushed-down join clobbers the other
> > possible methods of executing the query. But if there are many paths,
> > for example because there are multiple sets of useful pathkeys, it
> > might start to get a bit expensive.
> >
> > Probably both the initial cost and final cost calculations should be
> > delegated to the FDW, but maybe within postgres_fdw, the initial cost
> > should do only the work that can be done without contacting the remote
> > server; then, let the final cost step do that if appropriate. But I'm
> > not entirely sure what is best here.
>
> I am thinking eventually we will need to cache the foreign server
> statistics on the local server.
>
>
>
>
> Wouldn't that lead to issues where the statistics get outdated and we have to
> anyways query the foreign server before planning any joins? Or are you thinking
> of dropping the foreign table statistics once the foreign join is complete?

I am thinking we would eventually have to cache the statistics, then get
some kind of invalidation message from the foreign server. I am also
thinking that cache would have to be global across all backends, I guess
similar to our invalidation cache.

That could lead to some bloat in storing statistics since we may have a lot of tables for a lot of foreign servers. Also, will we have VACUUM look at ANALYZING the foreign tables?

Also, how will we decide that the statistics are invalid? Will we have the FDW query the foreign server and do some sort of comparison between the statistics the foreign server has and the statistics we locally have? I am trying to understand how the idea of invalidation message from foreign server will work.

Regards,

Atri

Re: Join push-down support for foreign tables

От

Bruce Momjian

Дата:

04 сентября 2014 г., 16:04:07

On Thu, Sep  4, 2014 at 09:31:20PM +0530, Atri Sharma wrote:
>     I am thinking we would eventually have to cache the statistics, then get
>     some kind of invalidation message from the foreign server.  I am also
>     thinking that cache would have to be global across all backends, I guess
>     similar to our invalidation cache.
> 
> 
> 
> 
> That could lead to some bloat in storing statistics since we may have a lot of
> tables for a lot of foreign servers. Also, will we have VACUUM look at
> ANALYZING the foreign tables?

> Also, how will we decide that the statistics are invalid? Will we have the FDW
> query the foreign server and do some sort of comparison between the statistics
> the foreign server has and the statistics we locally have? I am trying to
> understand how the idea of invalidation message from foreign server will work.

Well, ANALYZING is running on the foreign server, and somehow it would
be nice if it would send a message to us about its new statistics, or we
can do it like http does and it gives us a last-refresh statistics date
when we connect.

I am not sure how it will work --- I am just suspecting that we might
get to a point where the statistics lookup overhead on the foreign
server might become a bottleneck.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Re: Join push-down support for foreign tables

От

Atri Sharma

Дата:

04 сентября 2014 г., 16:14:21

On Thu, Sep 4, 2014 at 9:33 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Sep 4, 2014 at 09:31:20PM +0530, Atri Sharma wrote:
> I am thinking we would eventually have to cache the statistics, then get
> some kind of invalidation message from the foreign server. I am also
> thinking that cache would have to be global across all backends, I guess
> similar to our invalidation cache.
>
>
>
>
> That could lead to some bloat in storing statistics since we may have a lot of
> tables for a lot of foreign servers. Also, will we have VACUUM look at
> ANALYZING the foreign tables?

> Also, how will we decide that the statistics are invalid? Will we have the FDW
> query the foreign server and do some sort of comparison between the statistics
> the foreign server has and the statistics we locally have? I am trying to
> understand how the idea of invalidation message from foreign server will work.

Well, ANALYZING is running on the foreign server, and somehow it would
be nice if it would send a message to us about its new statistics, or we
can do it like http does and it gives us a last-refresh statistics date
when we connect.

Not sure how that would work without changing the way ANALYZE works on the foreign server. http idea could work,though.

I am not sure how it will work --- I am just suspecting that we might
get to a point where the statistics lookup overhead on the foreign
server might become a bottleneck.

Totally agree, but doing the planning only locally opens the questions I mentioned above, and also deprives the foreign server database to do any optimizations that it may want to do (assuming that the foreign database and postgres query planner do not generate identical plans). This is only my thought though, we could also be planning better than the foreign server database, so the optimization part I raised is debatable.

Regards,

Atri

Regards,

Atri

l'apprenant

Re: Join push-down support for foreign tables

От

Robert Haas

Дата:

04 сентября 2014 г., 20:50:44

On Thu, Sep 4, 2014 at 11:56 AM, Bruce Momjian <bruce@momjian.us> wrote:
>>     I am thinking eventually we will need to cache the foreign server
>>     statistics on the local server.
>>
>> Wouldn't that lead to issues where the statistics get outdated and we have to
>> anyways query the foreign server before planning any joins? Or are you thinking
>> of dropping the foreign table statistics once the foreign join is complete?
>
> I am thinking we would eventually have to cache the statistics, then get
> some kind of invalidation message from the foreign server.  I am also
> thinking that cache would have to be global across all backends, I guess
> similar to our invalidation cache.

Maybe ... but I think this isn't really related to the ostensible
topic of this thread.  We can do join pushdown just fine without the
ability to do anything like this.

I'm in full agreement that we should probably have a way to cache some
kind of statistics locally, but making that work figures to be tricky,
because (as I'm pretty sure Tom has pointed out before) there's no
guarantee that the remote side's statistics look anything like
PostgreSQL statistics, and so we might not be able to easily store
them or make sense of them.  But it would be nice to at least have the
option to store such statistics if they do happen to be something we
can store and interpret.

It's also coming to seem to me more and more that we need a way to
designate several PostgreSQL machines as a cooperating cluster.  This
would mean they'd keep connections to each other open and notify each
other about significant events, which could include "hey, I updated
the statistics on this table, you might want to get the new ones" or
"hey, i've replicated your definition for function X so it's safe to
push it down now" as well as "hey, I have just been promoted to be the
new master" or even automatic negotiation of which of a group of
machines should become the master after a server failure.  So far,
we've taken the approach that postgres_fdw is just another FDW which
enjoys no special privileges, and I think that's a good approach on
the whole, but  think if we want to create a relatively seamless
multi-node experience as some of the NoSQL databases do, we're going
to need something more than that.

But all of that is a bit pie in the sky, and the join pushdown
improvements we're talking about here don't necessitate any of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Join push-down support for foreign tables

От

Atri Sharma

Дата:

05 сентября 2014 г., 09:18:51

On Fri, Sep 5, 2014 at 2:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Sep 4, 2014 at 11:56 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> I am thinking eventually we will need to cache the foreign server
>> statistics on the local server.
>>
>> Wouldn't that lead to issues where the statistics get outdated and we have to
>> anyways query the foreign server before planning any joins? Or are you thinking
>> of dropping the foreign table statistics once the foreign join is complete?
>
> I am thinking we would eventually have to cache the statistics, then get
> some kind of invalidation message from the foreign server. I am also
> thinking that cache would have to be global across all backends, I guess
> similar to our invalidation cache.

Maybe ... but I think this isn't really related to the ostensible
topic of this thread. We can do join pushdown just fine without the
ability to do anything like this.

I'm in full agreement that we should probably have a way to cache some
kind of statistics locally, but making that work figures to be tricky,
because (as I'm pretty sure Tom has pointed out before) there's no
guarantee that the remote side's statistics look anything like
PostgreSQL statistics, and so we might not be able to easily store
them or make sense of them. But it would be nice to at least have the
option to store such statistics if they do happen to be something we
can store and interpret.

I agree that we need local statistics too (full agreement to Bruce's proposal) but playing the Devil's advocate here and trying to figure how will things like invalidation and as you mentioned, cross compatibility work.

It's also coming to seem to me more and more that we need a way to
designate several PostgreSQL machines as a cooperating cluster. This
would mean they'd keep connections to each other open and notify each
other about significant events, which could include "hey, I updated
the statistics on this table, you might want to get the new ones" or
"hey, i've replicated your definition for function X so it's safe to
push it down now" as well as "hey, I have just been promoted to be the
new master" or even automatic negotiation of which of a group of
machines should become the master after a server failure.

Thats a brilliant idea, and shouldnt be too much of a problem. One race condition that is possible is that multiple backend may try to globally propagate different statistics of the same table, but I think that any standard logical ordering algorithm should handle that. Also, the automatic master promotion seems like a brilliant idea and is also great since we have time tested standard algorithms for that.

One thing I would like to see is that assuming all the interacting nodes do not have identical schemas, if we can somehow maintain cross node statistics and use them for planning cross node joins. That would lead to similar problems as the ones already noted for having local statistics for foreign databases, but if we solve those anyways for storing local statistics, we could potentially look at having cross node relation statistics as well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Regards,

Atri

l'apprenant

Re: Join push-down support for foreign tables

От

Shigeru HANADA

Дата:

07 сентября 2014 г., 23:08:10

(2014/09/04 21:37), Robert Haas wrote:> On Wed, Sep 3, 2014 at 5:16 AM, 
Shigeru Hanada <shigeru.hanada@gmail.com> wrote:>> (1) Separate cost estimation phases?>> For existing join paths,
plannerestimates their costs in two phaeses.>> In the first phase initial_cost_foo(), here foo is one of>>
nestloop/mergejoin/hashjoin,produces lower-bound estimates for>> elimination.  The second phase is done for only
promisingpaths which>> passed add_path_precheck(), by final_cost_foo() for cost and result>> size.  I'm not sure that
weneed to follow this manner, since FDWs>> would be able to estimate final cost/size with their own methods.>> The main
problemI see here is that accurate costing may require a> round-trip to the remote server.  If there is only one path
thatis> probably OK; the cost of asking the question will usually be more than> paid for by hearing that the
pushed-downjoin clobbers the other> possible methods of executing the query.  But if there are many paths,> for example
becausethere are multiple sets of useful pathkeys, it> might start to get a bit expensive.

I agree that requiring round-trip per path is unbearable, so main source 
of plan cost should be local statistics gathered by ANALYZE command.  If 
an FDW needs extra information for planning, it should obtain that from 
FDW options or its private catalogs to avoid undesirable round-trips.
FDWs like postgres_fdw would want to optimize plan by providing paths 
with pathkeys (not only use remote index, but it also allows MergeJoin 
at upper level),

I noticed that order of join considering is an issue too.  Planner 
compares currently-cheapest path to newly generated path, and mostly 
foreign join path would be the cheapest, so considering foreign join 
would reduce planner overhead.
> Probably both the initial cost and final cost calculations should be> delegated to the FDW, but maybe within
postgres_fdw,the initial cost> should do only the work that can be done without contacting the remote> server; then,
letthe final cost step do that if appropriate.  But I'm> not entirely sure what is best here.

Agreed.  I'll design planner API along that way for now.
>> (2) How to reflect cost of transfer>> Cost of transfer is dominant in foreign table operations, including>> foreign
scans. It would be nice to have some mechanism to reflect>> actual time of transfer to the cost estimation.  An idea is
tohave a>> FDW option which represents cost factor of transfer, say>> transfer_cost.>> That would be reasonable.  I
assumeusers would normally wish to> specify this per-server, and the default should be something> reasonable for a
LAN.

This enhancement could be applied separately from foreign join patch.
>> (4) criteria for push-down>> It is assumed that FDWs can push joins down to remote when all foreign>> tables are in
sameserver.  IMO a SERVER objects represents a logical>> data source.  For instance database for postgres_fdw and
other>>connection-based FDWs, and disk volumes (or directory?) for file_fdw.>> Is this reasonable assumption?>> I think
it'sprobably good to give an FDW the option of producing a> ForeignJoinPath for any join against a ForeignPath *or>
ForeignJoinPath*for the same FDW.  It's perhaps unlikely that an FDW> can perform a join efficiently between two data
sourceswith different> server definitions, but why not give it the option?  It should be> pretty fast for the FDW to
realize,oh, the server OIDs don't match -> and at that point it can exit without doing anything further if that> seems
desirable. And there might be some kinds of data sources where> cross-server joins actually can be executed quickly
(e.g.when the> underlying data is just in two files in different places on the local> machine).

Indeed how to separate servers is left to users, or author of FDWs, 
though postgres_fdw and most of other FDWs can join foreign tables in a 
server.  I think it would be good if we can know two foreign tables are 
managed by same FDW, from FdwRoutine, maybe adding new API which returns 
FDW identifier?
>> (5) Terminology>> I used "foreign join" as a process which joins foreign tables on>> *remote* side, but is this
enoughintuitive?  Another idea is using>> "remote join", is this more appropriate for this kind of process?  I>>
hesitateto use "remote join" because it implies client-server FDWs,>> but foreign join is not limited to such FDWs,
e.g.file_fdw can have>> extra file which is already joined files accessed via foreign tables.>> Foreign join is
perfect.>>As I alluded to above, it's pretty important to make sure that this> works with large join trees; that is, if
Ijoin four foreign tables, I> don't want it to push down a join between two of the tables and a join> between the other
twotables and then join the results of those joins> locally.  Instead, I want to push the entire join tree to the
foreign>server and execute the whole thing there.  Some care may be needed in> designing the hooks to make sure this
worksas desired.>

I think so too, so ForeignJoinPath should be able to be an input of 
another ForeignJoinPath in upper join level.  But I also think joining 
on remote or not should be decided based on cost, as existing joins are 
planned with bottom-up approach.

Regards,
--
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru HANADA

Дата:

07 сентября 2014 г., 23:27:36

(2014/09/05 0:56), Bruce Momjian wrote:> On Thu, Sep  4, 2014 at 
08:41:43PM +0530, Atri Sharma wrote:>> On Thursday, September 4, 2014, Bruce Momjian <bruce@momjian.us> wrote:>>>>
OnThu, Sep  4, 2014 at 08:37:08AM -0400, Robert Haas wrote:>>      > The main problem I see here is that accurate
costingmay 
 
require a>>      > round-trip to the remote server.  If there is only one path 
that is>>      > probably OK; the cost of asking the question will usually be 
more than>>      > paid for by hearing that the pushed-down join clobbers the other>>      > possible methods of
executingthe query.  But if there are 
 
many paths,>>      > for example because there are multiple sets of useful 
pathkeys, it>>      > might start to get a bit expensive.>>      >>>      > Probably both the initial cost and final
costcalculations 
 
should be>>      > delegated to the FDW, but maybe within postgres_fdw, the 
initial cost>>      > should do only the work that can be done without contacting 
the remote>>      > server; then, let the final cost step do that if appropriate.  But I'm>>      > not entirely sure
whatis best here.>>>>      I am thinking eventually we will need to cache the foreign server>>      statistics on the
localserver.>>>>>>>>>> Wouldn't that lead to issues where the statistics get outdated and 
 
we have to>> anyways query the foreign server before planning any joins? Or are 
you thinking>> of dropping the foreign table statistics once the foreign join is 
complete?>> I am thinking we would eventually have to cache the statistics, then get> some kind of invalidation message
fromthe foreign server.  I am also> thinking that cache would have to be global across all backends, I guess> similar
toour invalidation cache.
 

If a FDW needs to know more information than pg_statistics and pg_class 
have, yes, it should cache some statistics on the local side.  But such 
statistics would have FDW-specific shape so it would be hard to have API 
to manage.  FDW can have their own functions and tables to manage their 
own statistics, and it can have even background-worker for messaging. 
But it would be another story.

Regards,
--
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Robert Haas

Дата:

09 сентября 2014 г., 14:34:14

On Sun, Sep 7, 2014 at 7:07 PM, Shigeru HANADA <shigeru.hanada@gmail.com> wrote:
>> I think it's probably good to give an FDW the option of producing a
>> ForeignJoinPath for any join against a ForeignPath *or
>> ForeignJoinPath* for the same FDW.  It's perhaps unlikely that an FDW
>> can perform a join efficiently between two data sources with different
>> server definitions, but why not give it the option?  It should be
>> pretty fast for the FDW to realize, oh, the server OIDs don't match -
>> and at that point it can exit without doing anything further if that
>> seems desirable.  And there might be some kinds of data sources where
>> cross-server joins actually can be executed quickly (e.g. when the
>> underlying data is just in two files in different places on the local
>> machine).
>
> Indeed how to separate servers is left to users, or author of FDWs, though
> postgres_fdw and most of other FDWs can join foreign tables in a server.  I
> think it would be good if we can know two foreign tables are managed by same
> FDW, from FdwRoutine, maybe adding new API which returns FDW identifier?

Do we need this?  I mean, if you get the FdwRoutine, don't you have
the OID of the FDW or the foreign table also?

> I think so too, so ForeignJoinPath should be able to be an input of another
> ForeignJoinPath in upper join level.  But I also think joining on remote or
> not should be decided based on cost, as existing joins are planned with
> bottom-up approach.

Definitely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

03 октября 2014 г., 09:45:08

2014-09-08 8:07 GMT+09:00 Shigeru HANADA <shigeru.hanada@gmail.com>:
> (2014/09/04 21:37), Robert Haas wrote:> On Wed, Sep 3, 2014 at 5:16 AM,
>> Probably both the initial cost and final cost calculations should be
>> delegated to the FDW, but maybe within postgres_fdw, the initial cost
>> should do only the work that can be done without contacting the remote
>> server; then, let the final cost step do that if appropriate.  But I'm
>> not entirely sure what is best here.
>
> Agreed.  I'll design planner API along that way for now.

I tried some patterns of implementation but I've not gotten feasible
way yet.  So I'd like to hear hackers' idea.

* Foreign join hook point
First I thought that following existing cost estimating manner is the
way to go, but I tend to think it doesn't fit foreign joins because
join method is tightly-coupled to sort-ness, but foreign join would
not.

In current planner, add_paths_to_joinrel is conscious of sort-ness,
and functions directly called from it are conscious of join method.
But this seems not fit the abstraction level of FDW.  FDW is highly
abstracted, say differ from custom plan providers, so most of work
should be delegated to FDW, including pathkeys consideration, IMO.

Besides that, order of join consideration is another issue.  First I
try to add foreign join consideration at the last (after hash join
consideration), but after some thought I noticed that
early-elimination would work better if we try foreign join first,
because in most cases foreign join is the cheapest way to accomplish a
join between two foreign relations.

So currently I'm thinking that delegating whole join consideration to
FDWs before other join consideration in add_paths_to_joinrel, by
calling new FDW API would be promising.

This means that FDWs can add multiple arbitrary paths to RelOptInfo in
a call.  Of course this allows FDWs to do round-trip per path, but it
would be optimization issue,  and they can compare their own
candidates they can get without round-trip.

* Supported join types
INNER and (LEFT|RIGHT|FULL) OUTER would be safe to push down, even
though some of OUTER JOIN might not be much faster than local join.
I'm not sure that SEMI and ANTI joins are safe to push-down.  Can we
leave the matter to FDWs, or should we forbid FDWs pushing down by not
calling foreign join API?  Anyway SEMI/ANTI would not be supported in
the first version.

* Blockers of join push-down
Pushing down join means that foreign scans for inner and outer are
skipped, so some elements blocks pushing down.  Basically the criteria
is same as scan push-down and update push-down.

After some thoughts, we should check only unsafe expression in join
qual and restrict qual.  This limitation is necessary to avoid
difference between results of pushe-down or not.  Target list seems to
contain only Var for necessary columns, but we should check that too.

* WIP patch
Attached is WIP patch for reviewing the design.  Works should be done
are 1) judging push-down or not, and 2) generating join SQL.  For 2),
I'm thinking about referring Postgres-XC's join shipping mechanism.

Any comments or questions are welcome.
--
Shigeru HANADA

Вложения

join_pushdown_wip.patch

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

15 декабря 2014 г., 08:40:28

Hi hackers,

I'm working on $SUBJECT and would like to get comments about the
design.  Attached patch is for the design below.  Note that the patch
requires Kaigai-san's custom foriegn join patch[1]

[1] http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F80108C355@BPXM15GP.gisp.nec.co.jp

Joins to be pushed down
=======================
We have two levels of decision about Join push-down, core and FDW.  I
think we should allow them to push down joins as much as we can unless
it doesn't break the semantics of join.  Anyway FDWs should decide
whether the join can be pushed down or not, on the basis of the FDW's
capability.

Here is the list of checks which should be done in core:

1. Join source relations
All of foreign tables used in a join should be managed by one foreign
data wrapper.  I once proposed that all source tables should belong to
one server, because at that time I assumed that FDWs use SERVER to
express physical place of data source.  But Robert's comment gave me
an idea that SERVER is not important for some FDWs, so now I think
check about server matching should be done by FDWs.

USER MAPPING is another important attribute of foreign scan/join, and
IMO it should be checked by FDW because some of FDWs don't require
USER MAPPING.  If an FDW want to check user mapping, all tables in the
join should belong to the same server and have same
RangeTablEntry#checkAsUser to ensure that only one user mapping is
derived.

2. Join type
Join type can be any, except JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER,
though most of FDWs would support only INNER and OUTER.
Pushing down CROSS joins might seem inefficient, because obviously
CROSS JOIN always produces more result than retrieving all rows from
each foreign table separately.  However, some FDW might be able to
accelerate such join with cache or something.  So I think we should
leave such decision to FDWs.

Here is the list of checks which shold be done in postgres_fdw:

1. Join source relations
As described above, postgres_fdw (and most of SQL-based FDWs) needs to
check that 1) all foreign tables in the join belong to a server, and
2) all foreign tables have same checkAsUser.
In addition to that, I add extra limitation that both inner/outer
should be plain foreign tables, not a result of foreign join.  This
limiation makes SQL generator simple.  Fundamentally it's possible to
join even join relations, so N-way join is listed as enhancement item
below.

2. Join type
In the first proposal, postgres_fdw allows INNER and OUTER joins to be
pushed down.  CROSS, SEMI and ANTI would have much less use cases.

3. Join conditions and WHERE clauses
Join conditions should consist of semantically safe expressions.
Where the "semantically safe" means is same as WHERE clause push-down.

Planned enhancements for 9.5
============================
These features will be proposed as enhancements, hopefully in the 9.5
development cycle, but probably in 9.6.

1. Remove unnecessary column from SELECT clause
Columns which are used for only join conditions can be removed from
the target list, as postgres_fdw does in simple foreign scans.

2. Support N-way joins
Mostly for difficulty of SQL generation, I didn't add support of N-Way joins.

3. Proper cost estimation
Currently postgres_fdw always gives 0 as the cost of a foreign join,
as a compromise.  This is because estimating costs of each join
without round-trip (EXPLAIN) is not easy.  A rough idea about that I
currently have is to use local statistics, but determining join method
used at remote  might require whole planner to run for the join
subtree.

Regards,
--
Shigeru HANADA

Вложения

join_pushdown.patch

Re: Join push-down support for foreign tables

От

Tom Lane

Дата:

15 декабря 2014 г., 15:46:03

Shigeru Hanada <shigeru.hanada@gmail.com> writes:
> I'm working on $SUBJECT and would like to get comments about the
> design.  Attached patch is for the design below.  Note that the patch
> requires Kaigai-san's custom foriegn join patch[1]

For the record, I'm not particularly on-board with custom scan, and
even less so with custom join.  I don't want FDW features like this
depending on those kluges, especially not when you're still in need
of core-code changes (which really points up the inadequacy of those
concepts).

Also, please don't redefine struct NestPath like that.  That adds a
whole bunch of notational churn (and backpatch risk) for no value
that I can see.  It might've been better to do it like that in a
green field, but you're about twenty years too late to do it now.
        regards, tom lane

Re: Join push-down support for foreign tables

От

Robert Haas

Дата:

15 декабря 2014 г., 16:22:16

On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
> I'm working on $SUBJECT and would like to get comments about the
> design.  Attached patch is for the design below.

I'm glad you are working on this.

> 1. Join source relations
> As described above, postgres_fdw (and most of SQL-based FDWs) needs to
> check that 1) all foreign tables in the join belong to a server, and
> 2) all foreign tables have same checkAsUser.
> In addition to that, I add extra limitation that both inner/outer
> should be plain foreign tables, not a result of foreign join.  This
> limiation makes SQL generator simple.  Fundamentally it's possible to
> join even join relations, so N-way join is listed as enhancement item
> below.

It seems pretty important to me that we have a way to push the entire
join nest down.  Being able to push down a 2-way join but not more
seems like quite a severe limitation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

16 декабря 2014 г., 00:01:27

Hanada-san,

Thanks for proposing this great functionality people waited for.

> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada <shigeru.hanada@gmail.com>
> wrote:
> > I'm working on $SUBJECT and would like to get comments about the
> > design.  Attached patch is for the design below.
> 
> I'm glad you are working on this.
> 
> > 1. Join source relations
> > As described above, postgres_fdw (and most of SQL-based FDWs) needs to
> > check that 1) all foreign tables in the join belong to a server, and
> > 2) all foreign tables have same checkAsUser.
> > In addition to that, I add extra limitation that both inner/outer
> > should be plain foreign tables, not a result of foreign join.  This
> > limiation makes SQL generator simple.  Fundamentally it's possible to
> > join even join relations, so N-way join is listed as enhancement item
> > below.
> 
> It seems pretty important to me that we have a way to push the entire join
> nest down.  Being able to push down a 2-way join but not more seems like
> quite a severe limitation.
> 
As long as we don't care about simpleness/gracefulness of the remote
query, what we need to do is not complicated. All the optimization jobs
are responsibility of remote system.

Let me explain my thought:
We have three cases to be considered; (1) a join between foreign tables
that is the supported case, (2) a join either of relations is foreign
join, and (3) a join both of relations are foreign joins.

In case of (1), remote query shall have the following form: SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>

In case of (2) or (3), because we already construct inner/outer join,
it is not difficult to replace the FT1 or FT2 above by sub-query, like: SELECT <tlist> FROM FT3 JOIN   (SELECT <tlist>
FROMFT1 JOIN FT2 ON <cond> WHERE <qual>) as FJ1   ON <joincond> WHERE <qual>

How about your thought?

Let me comment on some other points at this moment:

* Enhancement in src/include/foreign/fdwapi.h

It seems to me GetForeignJoinPath_function and GetForeignJoinPlan_function
are not used anywhere. Is it an oversight to remove definitions from your
previous work, isn't it?
Now ForeignJoinPath is added on set_join_pathlist_hook, but not callback of
FdwRoutine.

* Is ForeignJoinPath really needed?

I guess the reason why you added ForeignJoinPath is, to have the fields
of inner_path/outer_path. If we want to have paths of underlying relations,
a straightforward way for the concept (join relations is replaced by
foreign-/custom-scan on a result set of remote join) is enhancement of the
fields in ForeignPath.
How about an idea that adds "List *fdw_subpaths" to save the list of
underlying Path nodes. It also allows to have more than two relations
to be joined.
(Probably, it should be a feature of interface portion. I may have to
enhance my portion.)

* Why NestPath is re-defined?

-typedef JoinPath NestPath;
+typedef struct NestPath
+{
+    JoinPath    jpath;
+} NestPath;

It looks to me this change makes patch scale larger...

Best regards,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

16 декабря 2014 г., 02:59:15

Hanada-san,

One other question from my side:
How postgres_fdw tries to solve the varno/varattno mapping when it
replaces relations join by foreign-scan?

Let me explain the issue using an example. If SELECT has a target-
list that references 2nd-column of the inner relation and 2nd-column
of the outer relation, how varno/varattno of ForeignScan shall be
assigned on?
Unless FDW driver does not set fdw_ps_tlist, setrefs.c deals with
this ForeignScan as usual relation scan, then varno of Var will
have non-special varno (even though it may be shifted by rtoffset
in setrefs.c).
Then, ExecEvalScalarVar() is invoked on executor to evaluate the
value of fetched tuple. At that time, which slot and attribute will
be referenced? The varattno of Var-node is neutral on setrefs.c, so
both of the var-nodes that references 2nd-column of the inner relation
and 2nd-column of the outer relation will reference the 2nd-column
of the econtext->ecxt_scantuple, however, it is uncertain which
column of foreign-table is mapped to 2nd-column of the ecxt_scantuple.
So, it needs to inform the planner which underlying column is
mapped to the pseudo scan tlist.

Another expression of what I'm saying is:
 SELECT   ft_1.second_col X,   --> varno=1 / varattno=2   ft_2.second_col Y    --> varno=2 / varattno=2 FROM   ft_1
NATURALJOIN ft_2;

When FROM-clause is replaced by ForeignScan, the relevant varattno
also needs to be updated, according to the underlying remote query.
If postgres_fdw runs the following remote query, X should have varattno=1
and Y should have varattno=2 on the pseudo scan tlist. remote: SELECT t_1.second_col, t_2.second_col           FROM t_1
NATURALJOIN t_2;

You can inform the planner this mapping using fdw_ps_tlist field of
ForeignScan, if FDW driver put a list of TargetEntry.
In above example, fdw_ps_tlist will have two elements and both of then
has Var-node of the underlying foreign tables.

The patch to replace join by foreign-/custom-scan adds a functionality
to fix-up varno/varattno in these cases.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Tuesday, December 16, 2014 9:01 AM
> To: Robert Haas; Shigeru Hanada
> Cc: PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> Hanada-san,
> 
> Thanks for proposing this great functionality people waited for.
> 
> > On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
> > <shigeru.hanada@gmail.com>
> > wrote:
> > > I'm working on $SUBJECT and would like to get comments about the
> > > design.  Attached patch is for the design below.
> >
> > I'm glad you are working on this.
> >
> > > 1. Join source relations
> > > As described above, postgres_fdw (and most of SQL-based FDWs) needs
> > > to check that 1) all foreign tables in the join belong to a server,
> > > and
> > > 2) all foreign tables have same checkAsUser.
> > > In addition to that, I add extra limitation that both inner/outer
> > > should be plain foreign tables, not a result of foreign join.  This
> > > limiation makes SQL generator simple.  Fundamentally it's possible
> > > to join even join relations, so N-way join is listed as enhancement
> > > item below.
> >
> > It seems pretty important to me that we have a way to push the entire
> > join nest down.  Being able to push down a 2-way join but not more
> > seems like quite a severe limitation.
> >
> As long as we don't care about simpleness/gracefulness of the remote query,
> what we need to do is not complicated. All the optimization jobs are
> responsibility of remote system.
> 
> Let me explain my thought:
> We have three cases to be considered; (1) a join between foreign tables
> that is the supported case, (2) a join either of relations is foreign join,
> and (3) a join both of relations are foreign joins.
> 
> In case of (1), remote query shall have the following form:
>   SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>
> 
> In case of (2) or (3), because we already construct inner/outer join, it
> is not difficult to replace the FT1 or FT2 above by sub-query, like:
>   SELECT <tlist> FROM FT3 JOIN
>     (SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>) as FJ1
>     ON <joincond> WHERE <qual>
> 
> How about your thought?
> 
> 
> Let me comment on some other points at this moment:
> 
> * Enhancement in src/include/foreign/fdwapi.h
> 
> It seems to me GetForeignJoinPath_function and
> GetForeignJoinPlan_function are not used anywhere. Is it an oversight to
> remove definitions from your previous work, isn't it?
> Now ForeignJoinPath is added on set_join_pathlist_hook, but not callback
> of FdwRoutine.
> 
> 
> * Is ForeignJoinPath really needed?
> 
> I guess the reason why you added ForeignJoinPath is, to have the fields
> of inner_path/outer_path. If we want to have paths of underlying relations,
> a straightforward way for the concept (join relations is replaced by
> foreign-/custom-scan on a result set of remote join) is enhancement of the
> fields in ForeignPath.
> How about an idea that adds "List *fdw_subpaths" to save the list of
> underlying Path nodes. It also allows to have more than two relations to
> be joined.
> (Probably, it should be a feature of interface portion. I may have to enhance
> my portion.)
> 
> * Why NestPath is re-defined?
> 
> -typedef JoinPath NestPath;
> +typedef struct NestPath
> +{
> +    JoinPath    jpath;
> +} NestPath;
> 
> It looks to me this change makes patch scale larger...
> 
> Best regards,
> --
> NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
> <kaigai@ak.jp.nec.com>
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make
> changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

26 декабря 2014 г., 04:45:07

2014-12-16 0:45 GMT+09:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Shigeru Hanada <shigeru.hanada@gmail.com> writes:
>> I'm working on $SUBJECT and would like to get comments about the
>> design.  Attached patch is for the design below.  Note that the patch
>> requires Kaigai-san's custom foriegn join patch[1]
>
> For the record, I'm not particularly on-board with custom scan, and
> even less so with custom join.  I don't want FDW features like this
> depending on those kluges, especially not when you're still in need
> of core-code changes (which really points up the inadequacy of those
> concepts).

This design derived from discussion about custom scan/join.  In that
discussion Robert suggested common infrastructure for replacing Join
path with Scan node.  Here I agree to user such common infrastructure.
One concern is introducing hook function I/F which seems to break
FdwRoutine I/F boundary...

>
> Also, please don't redefine struct NestPath like that.  That adds a
> whole bunch of notational churn (and backpatch risk) for no value
> that I can see.  It might've been better to do it like that in a
> green field, but you're about twenty years too late to do it now.

Ok, will revert it.

-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

26 декабря 2014 г., 04:49:09

2014-12-16 1:22 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
> <shigeru.hanada@gmail.com> wrote:
>> I'm working on $SUBJECT and would like to get comments about the
>> design.  Attached patch is for the design below.
>
> I'm glad you are working on this.
>
>> 1. Join source relations
>> As described above, postgres_fdw (and most of SQL-based FDWs) needs to
>> check that 1) all foreign tables in the join belong to a server, and
>> 2) all foreign tables have same checkAsUser.
>> In addition to that, I add extra limitation that both inner/outer
>> should be plain foreign tables, not a result of foreign join.  This
>> limiation makes SQL generator simple.  Fundamentally it's possible to
>> join even join relations, so N-way join is listed as enhancement item
>> below.
>
> It seems pretty important to me that we have a way to push the entire
> join nest down.  Being able to push down a 2-way join but not more
> seems like quite a severe limitation.

Hmm, I agree to support N-way join is very useful.  Postgres-XC's SQL
generator seems to give us a hint for such case, I'll check it out
again.

-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Michael Paquier

Дата:

15 января 2015 г., 07:39:04

On Fri, Dec 26, 2014 at 1:48 PM, Shigeru Hanada
<shigeru.hanada@gmail.com> wrote:
> Hmm, I agree to support N-way join is very useful.  Postgres-XC's SQL
> generator seems to give us a hint for such case, I'll check it out
> again.
Switching to returned with feedback, as this patch is waiting for
feedback for a couple of weeks now.
-- 
Michael

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

16 февраля 2015 г., 04:03:28

Hi

I've revised the patch based on Kaigai-san's custom/foreign join patch
posted in the thread below.

http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F80108C355@BPXM15GP.gisp.nec.co.jp

Basically not changed from the version in the last CF, but as Robert
commented before, N-way (not only 2-way) joins should be supported in
the first version by construct SELECT SQL by containing source query
in FROM clause as inline views (a.k.a. from clause subquery).

2014-12-26 13:48 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> 2014-12-16 1:22 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
>> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
>> <shigeru.hanada@gmail.com> wrote:
>>> I'm working on $SUBJECT and would like to get comments about the
>>> design.  Attached patch is for the design below.
>>
>> I'm glad you are working on this.
>>
>>> 1. Join source relations
>>> As described above, postgres_fdw (and most of SQL-based FDWs) needs to
>>> check that 1) all foreign tables in the join belong to a server, and
>>> 2) all foreign tables have same checkAsUser.
>>> In addition to that, I add extra limitation that both inner/outer
>>> should be plain foreign tables, not a result of foreign join.  This
>>> limiation makes SQL generator simple.  Fundamentally it's possible to
>>> join even join relations, so N-way join is listed as enhancement item
>>> below.
>>
>> It seems pretty important to me that we have a way to push the entire
>> join nest down.  Being able to push down a 2-way join but not more
>> seems like quite a severe limitation.
>
> Hmm, I agree to support N-way join is very useful.  Postgres-XC's SQL
> generator seems to give us a hint for such case, I'll check it out
> again.
>
> --
> Shigeru HANADA



--
Shigeru HANADA

Вложения

foreign_join.patch

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

16 февраля 2015 г., 04:13:43

Hanada-san,

Your patch mixtures enhancement of custom-/foreign-scan interface and
enhancement of contrib/postgres_fdw... Probably, it is a careless mis-
operation.
Please make your patch as differences from my infrastructure portion.


Also, I noticed this "Join pushdown support for foreign tables" patch
is unintentionally rejected in the last commit fest. https://commitfest.postgresql.org/3/20/
I couldn't register myself as reviewer. How do I operate it on the
new commitfest application?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru Hanada
> Sent: Monday, February 16, 2015 1:03 PM
> To: Robert Haas
> Cc: PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> Hi
> 
> I've revised the patch based on Kaigai-san's custom/foreign join patch
> posted in the thread below.
> 
> http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F80
> 108C355@BPXM15GP.gisp.nec.co.jp
> 
> Basically not changed from the version in the last CF, but as Robert
> commented before, N-way (not only 2-way) joins should be supported in the
> first version by construct SELECT SQL by containing source query in FROM
> clause as inline views (a.k.a. from clause subquery).
> 
> 2014-12-26 13:48 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> > 2014-12-16 1:22 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> >> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
> >> <shigeru.hanada@gmail.com> wrote:
> >>> I'm working on $SUBJECT and would like to get comments about the
> >>> design.  Attached patch is for the design below.
> >>
> >> I'm glad you are working on this.
> >>
> >>> 1. Join source relations
> >>> As described above, postgres_fdw (and most of SQL-based FDWs) needs
> >>> to check that 1) all foreign tables in the join belong to a server,
> >>> and
> >>> 2) all foreign tables have same checkAsUser.
> >>> In addition to that, I add extra limitation that both inner/outer
> >>> should be plain foreign tables, not a result of foreign join.  This
> >>> limiation makes SQL generator simple.  Fundamentally it's possible
> >>> to join even join relations, so N-way join is listed as enhancement
> >>> item below.
> >>
> >> It seems pretty important to me that we have a way to push the entire
> >> join nest down.  Being able to push down a 2-way join but not more
> >> seems like quite a severe limitation.
> >
> > Hmm, I agree to support N-way join is very useful.  Postgres-XC's SQL
> > generator seems to give us a hint for such case, I'll check it out
> > again.
> >
> > --
> > Shigeru HANADA
> 
> 
> 
> --
> Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

16 февраля 2015 г., 04:53:58

Kaigai-san,

Oops.  I rebased the patch onto your v4 custom/foreign join patch.
But as you mentioned off-list, I found a flaw about inappropriate
change about NestPath still remains in the patch...  I might have made
my dev branch into unexpected state.  I'll check it soon.

2015-02-16 13:13 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
> Hanada-san,
>
> Your patch mixtures enhancement of custom-/foreign-scan interface and
> enhancement of contrib/postgres_fdw... Probably, it is a careless mis-
> operation.
> Please make your patch as differences from my infrastructure portion.
>
>
> Also, I noticed this "Join pushdown support for foreign tables" patch
> is unintentionally rejected in the last commit fest.
>   https://commitfest.postgresql.org/3/20/
> I couldn't register myself as reviewer. How do I operate it on the
> new commitfest application?
>
> Thanks,
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai@ak.jp.nec.com>
>
>
>> -----Original Message-----
>> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru Hanada
>> Sent: Monday, February 16, 2015 1:03 PM
>> To: Robert Haas
>> Cc: PostgreSQL-development
>> Subject: Re: [HACKERS] Join push-down support for foreign tables
>>
>> Hi
>>
>> I've revised the patch based on Kaigai-san's custom/foreign join patch
>> posted in the thread below.
>>
>> http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F80
>> 108C355@BPXM15GP.gisp.nec.co.jp
>>
>> Basically not changed from the version in the last CF, but as Robert
>> commented before, N-way (not only 2-way) joins should be supported in the
>> first version by construct SELECT SQL by containing source query in FROM
>> clause as inline views (a.k.a. from clause subquery).
>>
>> 2014-12-26 13:48 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
>> > 2014-12-16 1:22 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
>> >> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
>> >> <shigeru.hanada@gmail.com> wrote:
>> >>> I'm working on $SUBJECT and would like to get comments about the
>> >>> design.  Attached patch is for the design below.
>> >>
>> >> I'm glad you are working on this.
>> >>
>> >>> 1. Join source relations
>> >>> As described above, postgres_fdw (and most of SQL-based FDWs) needs
>> >>> to check that 1) all foreign tables in the join belong to a server,
>> >>> and
>> >>> 2) all foreign tables have same checkAsUser.
>> >>> In addition to that, I add extra limitation that both inner/outer
>> >>> should be plain foreign tables, not a result of foreign join.  This
>> >>> limiation makes SQL generator simple.  Fundamentally it's possible
>> >>> to join even join relations, so N-way join is listed as enhancement
>> >>> item below.
>> >>
>> >> It seems pretty important to me that we have a way to push the entire
>> >> join nest down.  Being able to push down a 2-way join but not more
>> >> seems like quite a severe limitation.
>> >
>> > Hmm, I agree to support N-way join is very useful.  Postgres-XC's SQL
>> > generator seems to give us a hint for such case, I'll check it out
>> > again.
>> >
>> > --
>> > Shigeru HANADA
>>
>>
>>
>> --
>> Shigeru HANADA



--
Shigeru HANADA

Вложения

foreign_join.patch

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

17 февраля 2015 г., 01:39:36

Let me put some comments in addition to where you're checking now.

[design issues]
* Cost estimation
Estimation and evaluation of cost for remote join query is not an
obvious issue. In principle, local side cannot determine the cost
to run remote join without remote EXPLAIN, because local side has
no information about JOIN logic applied on the remote side.
Probably, we have to put an assumption for remote join algorithm,
because local planner has no idea about remote planner's choice
unless foreign-join don't take "use_remote_estimate".
I think, it is reasonable assumption (even if it is incorrect) to
calculate remote join cost based on local hash-join algorithm.
If user wants more correct estimation, remote EXPLAIN will make
more reliable cost estimation.

It also needs a consensus whether cost for remote CPU execution is
equivalent to local CPU. If we think local CPU is rare resource
than remote one, a discount rate will make planner more preferable
to choose remote join than local one.

Once we assume a join algorithm for remote join, unit cost for
remote CPU, we can calculate a cost for foreign join based on
the local join logic plus cost for network translation (maybe
fdw_tuple_cost?).

* FDW options
Unlike table scan, FDW options we should refer is unclear.
Table level FDW options are associated with a foreign table as
literal. I think we have two options here:
1. Foreign-join refers FDW options for foreign-server, but ones  for foreign-tables are ignored.
2. Foreign-join is prohibited when both of relations don't have  identical FDW options.
My preference is 2. Even though N-way foreign join, it ensures
all the tables involved with (N-1)-way foreign join has identical
FDW options, thus it leads we can make N-way foreign join with
all identical FDW options.
One exception is "updatable" flag of postgres_fdw. It does not
make sense on remote join, so I think mixture of updatable and
non-updatable foreign tables should be admitted, however, it is
a decision by FDW driver.

Probably, above points need to take time for getting consensus.
I'd like to see your opinion prior to editing your patch.

[implementation issues]
The interface does not intend to add new Path/Plan type for each scan
that replaces foreign joins. What postgres_fdw should do is, adding
ForeignPath towards a particular joinrel, then it populates ForeignScan
with remote join query once it got chosen by the planner.

A few functions added in src/backend/foreign/foreign.c are not
called by anywhere, at this moment.

create_plan_recurse() is reverted to static. It is needed for custom-
join enhancement, if no other infrastructure can support.

I'll check the code to construct remote query later.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

> -----Original Message-----
> From: Shigeru Hanada [mailto:shigeru.hanada@gmail.com]
> Sent: Monday, February 16, 2015 1:54 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Robert Haas; PostgreSQL-development
> Subject: ##freemail## Re: [HACKERS] Join push-down support for foreign
> tables
> 
> Kaigai-san,
> 
> Oops.  I rebased the patch onto your v4 custom/foreign join patch.
> But as you mentioned off-list, I found a flaw about inappropriate change
> about NestPath still remains in the patch...  I might have made my dev branch
> into unexpected state.  I'll check it soon.
> 
> 2015-02-16 13:13 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
> > Hanada-san,
> >
> > Your patch mixtures enhancement of custom-/foreign-scan interface and
> > enhancement of contrib/postgres_fdw... Probably, it is a careless mis-
> > operation.
> > Please make your patch as differences from my infrastructure portion.
> >
> >
> > Also, I noticed this "Join pushdown support for foreign tables" patch
> > is unintentionally rejected in the last commit fest.
> >   https://commitfest.postgresql.org/3/20/
> > I couldn't register myself as reviewer. How do I operate it on the new
> > commitfest application?
> >
> > Thanks,
> > --
> > NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
> > <kaigai@ak.jp.nec.com>
> >
> >
> >> -----Original Message-----
> >> From: pgsql-hackers-owner@postgresql.org
> >> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru
> >> Hanada
> >> Sent: Monday, February 16, 2015 1:03 PM
> >> To: Robert Haas
> >> Cc: PostgreSQL-development
> >> Subject: Re: [HACKERS] Join push-down support for foreign tables
> >>
> >> Hi
> >>
> >> I've revised the patch based on Kaigai-san's custom/foreign join
> >> patch posted in the thread below.
> >>
> >>
> http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F8
> >> 0
> >> 108C355@BPXM15GP.gisp.nec.co.jp
> >>
> >> Basically not changed from the version in the last CF, but as Robert
> >> commented before, N-way (not only 2-way) joins should be supported in
> >> the first version by construct SELECT SQL by containing source query
> >> in FROM clause as inline views (a.k.a. from clause subquery).
> >>
> >> 2014-12-26 13:48 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> >> > 2014-12-16 1:22 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> >> >> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
> >> >> <shigeru.hanada@gmail.com> wrote:
> >> >>> I'm working on $SUBJECT and would like to get comments about the
> >> >>> design.  Attached patch is for the design below.
> >> >>
> >> >> I'm glad you are working on this.
> >> >>
> >> >>> 1. Join source relations
> >> >>> As described above, postgres_fdw (and most of SQL-based FDWs)
> >> >>> needs to check that 1) all foreign tables in the join belong to a
> >> >>> server, and
> >> >>> 2) all foreign tables have same checkAsUser.
> >> >>> In addition to that, I add extra limitation that both inner/outer
> >> >>> should be plain foreign tables, not a result of foreign join.
> >> >>> This limiation makes SQL generator simple.  Fundamentally it's
> >> >>> possible to join even join relations, so N-way join is listed as
> >> >>> enhancement item below.
> >> >>
> >> >> It seems pretty important to me that we have a way to push the
> >> >> entire join nest down.  Being able to push down a 2-way join but
> >> >> not more seems like quite a severe limitation.
> >> >
> >> > Hmm, I agree to support N-way join is very useful.  Postgres-XC's
> >> > SQL generator seems to give us a hint for such case, I'll check it
> >> > out again.
> >> >
> >> > --
> >> > Shigeru HANADA
> >>
> >>
> >>
> >> --
> >> Shigeru HANADA
> 
> 
> 
> --
> Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

19 февраля 2015 г., 07:19:13

2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
> Let me put some comments in addition to where you're checking now.
>
> [design issues]
> * Cost estimation
> Estimation and evaluation of cost for remote join query is not an
> obvious issue. In principle, local side cannot determine the cost
> to run remote join without remote EXPLAIN, because local side has
> no information about JOIN logic applied on the remote side.
> Probably, we have to put an assumption for remote join algorithm,
> because local planner has no idea about remote planner's choice
> unless foreign-join don't take "use_remote_estimate".
> I think, it is reasonable assumption (even if it is incorrect) to
> calculate remote join cost based on local hash-join algorithm.
> If user wants more correct estimation, remote EXPLAIN will make
> more reliable cost estimation.

Hm, I guess that you chose hash-join as "least-costed join".  In the
pgbench model, most combination between two tables generate hash join
as cheapest path.  Remote EXPLAIN is very expensive in the context of
planning, so it would easily make the plan optimization meaningless.
But giving an option to users is good, I agree.

>
> It also needs a consensus whether cost for remote CPU execution is
> equivalent to local CPU. If we think local CPU is rare resource
> than remote one, a discount rate will make planner more preferable
> to choose remote join than local one

Something like cpu_cost_ratio as a new server-level FDW option?

>
> Once we assume a join algorithm for remote join, unit cost for
> remote CPU, we can calculate a cost for foreign join based on
> the local join logic plus cost for network translation (maybe
> fdw_tuple_cost?).

Yes, sum of these costs is the total cost of a remote join.   o fdw_startup_cost   o hash-join cost, estimated as a
localjoin   o fdw_tuple_cost * rows * width
 

> * FDW options
> Unlike table scan, FDW options we should refer is unclear.
> Table level FDW options are associated with a foreign table as
> literal. I think we have two options here:
> 1. Foreign-join refers FDW options for foreign-server, but ones
>    for foreign-tables are ignored.
> 2. Foreign-join is prohibited when both of relations don't have
>    identical FDW options.
> My preference is 2. Even though N-way foreign join, it ensures
> all the tables involved with (N-1)-way foreign join has identical
> FDW options, thus it leads we can make N-way foreign join with
> all identical FDW options.
> One exception is "updatable" flag of postgres_fdw. It does not
> make sense on remote join, so I think mixture of updatable and
> non-updatable foreign tables should be admitted, however, it is
> a decision by FDW driver.
>
> Probably, above points need to take time for getting consensus.
> I'd like to see your opinion prior to editing your patch.

postgres_fdw can't push down a join which contains foreign tables on
multiple servers, so use_remote_estimate and fdw_startup_cost  are the
only FDW options to consider.  So we have options for each option.

1-a. If all foreign tables in the join has identical
use_remote_estimate, allow pushing down.
1-b. If any of foreign table in the join has true as
use_remote_estimate, use remote estimate.

2-a. If all foreign tables in the join has identical fdw_startup_cost,
allow pushing down.
2-b. Always use max value in the join. (cost would be more expensive)
2-c. Always use min value in the join.  (cost would be cheaper)

I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have
reasonable cost about startup.

I agree about "updatable" option.

>
> [implementation issues]
> The interface does not intend to add new Path/Plan type for each scan
> that replaces foreign joins. What postgres_fdw should do is, adding
> ForeignPath towards a particular joinrel, then it populates ForeignScan
> with remote join query once it got chosen by the planner.

That idea is interesting, and make many things simpler.  Please let me consider.

>
> A few functions added in src/backend/foreign/foreign.c are not
> called by anywhere, at this moment.
>
> create_plan_recurse() is reverted to static. It is needed for custom-
> join enhancement, if no other infrastructure can support.

I made it back to static because I thought that create_plan_recurse
can be called by core before giving control to FDWs.  But I'm not sure
it can be applied to custom scans.  I'll recheck that part.


-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

02 марта 2015 г., 12:48:37

Attached is the revised/rebased version of the $SUBJECT.

This patch is based on Kaigai-san's custom/foreign join patch, so
please apply it before this patch.  In this version I changed some
points from original postgres_fdw.

1) Disabled SELECT clause optimization
~9.4 postgres_fdw lists only columns actually used in SELECT clause,
but AFAIS it makes SQL generation complex.  So I disabled such
optimization and put "NULL" for unnecessary columns in SELECT clause
of remote query.

2) Extended deparse context
To allow deparsing based on multiple source relations, I added some
members to context structure.  They are unnecessary for simple query
with single foreign table, but IMO it should be integrated.

With Kaigai-san's advise, changes for supporting foreign join on
postgres_fdw is minimized into postgres_fdw itself.  But I added new
FDW API named GetForeignJoinPaths() to keep the policy that all
interface between core and FDW should be in FdwRoutine, instead of
using hook function.  Now I'm writing document about it, and will post
it in a day.

2015-02-19 16:19 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> 2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> Let me put some comments in addition to where you're checking now.
>>
>> [design issues]
>> * Cost estimation
>> Estimation and evaluation of cost for remote join query is not an
>> obvious issue. In principle, local side cannot determine the cost
>> to run remote join without remote EXPLAIN, because local side has
>> no information about JOIN logic applied on the remote side.
>> Probably, we have to put an assumption for remote join algorithm,
>> because local planner has no idea about remote planner's choice
>> unless foreign-join don't take "use_remote_estimate".
>> I think, it is reasonable assumption (even if it is incorrect) to
>> calculate remote join cost based on local hash-join algorithm.
>> If user wants more correct estimation, remote EXPLAIN will make
>> more reliable cost estimation.
>
> Hm, I guess that you chose hash-join as "least-costed join".  In the
> pgbench model, most combination between two tables generate hash join
> as cheapest path.  Remote EXPLAIN is very expensive in the context of
> planning, so it would easily make the plan optimization meaningless.
> But giving an option to users is good, I agree.
>
>>
>> It also needs a consensus whether cost for remote CPU execution is
>> equivalent to local CPU. If we think local CPU is rare resource
>> than remote one, a discount rate will make planner more preferable
>> to choose remote join than local one
>
> Something like cpu_cost_ratio as a new server-level FDW option?
>
>>
>> Once we assume a join algorithm for remote join, unit cost for
>> remote CPU, we can calculate a cost for foreign join based on
>> the local join logic plus cost for network translation (maybe
>> fdw_tuple_cost?).
>
> Yes, sum of these costs is the total cost of a remote join.
>     o fdw_startup_cost
>     o hash-join cost, estimated as a local join
>     o fdw_tuple_cost * rows * width
>
>> * FDW options
>> Unlike table scan, FDW options we should refer is unclear.
>> Table level FDW options are associated with a foreign table as
>> literal. I think we have two options here:
>> 1. Foreign-join refers FDW options for foreign-server, but ones
>>    for foreign-tables are ignored.
>> 2. Foreign-join is prohibited when both of relations don't have
>>    identical FDW options.
>> My preference is 2. Even though N-way foreign join, it ensures
>> all the tables involved with (N-1)-way foreign join has identical
>> FDW options, thus it leads we can make N-way foreign join with
>> all identical FDW options.
>> One exception is "updatable" flag of postgres_fdw. It does not
>> make sense on remote join, so I think mixture of updatable and
>> non-updatable foreign tables should be admitted, however, it is
>> a decision by FDW driver.
>>
>> Probably, above points need to take time for getting consensus.
>> I'd like to see your opinion prior to editing your patch.
>
> postgres_fdw can't push down a join which contains foreign tables on
> multiple servers, so use_remote_estimate and fdw_startup_cost  are the
> only FDW options to consider.  So we have options for each option.
>
> 1-a. If all foreign tables in the join has identical
> use_remote_estimate, allow pushing down.
> 1-b. If any of foreign table in the join has true as
> use_remote_estimate, use remote estimate.
>
> 2-a. If all foreign tables in the join has identical fdw_startup_cost,
> allow pushing down.
> 2-b. Always use max value in the join. (cost would be more expensive)
> 2-c. Always use min value in the join.  (cost would be cheaper)
>
> I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have
> reasonable cost about startup.
>
> I agree about "updatable" option.
>
>>
>> [implementation issues]
>> The interface does not intend to add new Path/Plan type for each scan
>> that replaces foreign joins. What postgres_fdw should do is, adding
>> ForeignPath towards a particular joinrel, then it populates ForeignScan
>> with remote join query once it got chosen by the planner.
>
> That idea is interesting, and make many things simpler.  Please let me consider.
>
>>
>> A few functions added in src/backend/foreign/foreign.c are not
>> called by anywhere, at this moment.
>>
>> create_plan_recurse() is reverted to static. It is needed for custom-
>> join enhancement, if no other infrastructure can support.
>
> I made it back to static because I thought that create_plan_recurse
> can be called by core before giving control to FDWs.  But I'm not sure
> it can be applied to custom scans.  I'll recheck that part.
>
>
> --
> Shigeru HANADA



--
Shigeru HANADA

Вложения

foreign_join_v2.patch

Re: Join push-down support for foreign tables

От

Thom Brown

Дата:

02 марта 2015 г., 13:51:21

On 2 March 2015 at 12:48, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:

Attached is the revised/rebased version of the $SUBJECT.

This patch is based on Kaigai-san's custom/foreign join patch, so
please apply it before this patch. In this version I changed some
points from original postgres_fdw.

1) Disabled SELECT clause optimization
~9.4 postgres_fdw lists only columns actually used in SELECT clause,
but AFAIS it makes SQL generation complex. So I disabled such
optimization and put "NULL" for unnecessary columns in SELECT clause
of remote query.

2) Extended deparse context
To allow deparsing based on multiple source relations, I added some
members to context structure. They are unnecessary for simple query
with single foreign table, but IMO it should be integrated.

With Kaigai-san's advise, changes for supporting foreign join on
postgres_fdw is minimized into postgres_fdw itself. But I added new
FDW API named GetForeignJoinPaths() to keep the policy that all
interface between core and FDW should be in FdwRoutine, instead of
using hook function. Now I'm writing document about it, and will post
it in a day.

I seem to be getting a problem with whole-row references:

# SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on e.person_id = p.id inner join countries c on p.country_id = c.id;
ERROR: table "r" has 3 columns available but 4 columns specified
CONTEXT: Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT id, country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name, country_id FROM public.people) r (a_0, a_1, a_2, a_3) ON ((r.a_3 = l.a_0))

And the error message could be somewhat confusing. This mentions table "r", but there's no such table or alias in my actual query.

Another issue:

# EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id = countries.id LIMIT 3) x;
ERROR: could not open relation with OID 0

Thom

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

02 марта 2015 г., 14:08:24

> I seem to be getting a problem with whole-row references:
> 
> # SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on
> e.person_id = p.id inner join countries c on p.country_id = c.id;
> ERROR:  table "r" has 3 columns available but 4 columns specified
> CONTEXT:  Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT id,
> country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name,
> country_id FROM public.people) r (a_0, a_1, a_2, a_3)  ON ((r.a_3 = l.a_0))
>
In this case, the 4th target-entry should be "l", not l.a_1.

> And the error message could be somewhat confusing.  This mentions table "r", but
> there's no such table or alias in my actual query.
>
However, do we have a mechanical/simple way to distinguish the cases when
we need relation alias from the case when we don't need it?
Like a self-join cases, we has to construct a remote query even if same
table is referenced multiple times in a query. Do you have a good idea?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: thombrown@gmail.com [mailto:thombrown@gmail.com] On Behalf Of Thom Brown
> Sent: Monday, March 02, 2015 10:51 PM
> To: Shigeru Hanada
> Cc: Kaigai Kouhei(海外 浩平); Robert Haas; PostgreSQL-development
> Subject: ##freemail## Re: [HACKERS] Join push-down support for foreign tables
> 
> On 2 March 2015 at 12:48, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:
> 
> 
>     Attached is the revised/rebased version of the $SUBJECT.
> 
>     This patch is based on Kaigai-san's custom/foreign join patch, so
>     please apply it before this patch.  In this version I changed some
>     points from original postgres_fdw.
> 
>     1) Disabled SELECT clause optimization
>     ~9.4 postgres_fdw lists only columns actually used in SELECT clause,
>     but AFAIS it makes SQL generation complex.  So I disabled such
>     optimization and put "NULL" for unnecessary columns in SELECT clause
>     of remote query.
> 
>     2) Extended deparse context
>     To allow deparsing based on multiple source relations, I added some
>     members to context structure.  They are unnecessary for simple query
>     with single foreign table, but IMO it should be integrated.
> 
>     With Kaigai-san's advise, changes for supporting foreign join on
>     postgres_fdw is minimized into postgres_fdw itself.  But I added new
>     FDW API named GetForeignJoinPaths() to keep the policy that all
>     interface between core and FDW should be in FdwRoutine, instead of
>     using hook function.  Now I'm writing document about it, and will post
>     it in a day.
> 
> 
> 
> I seem to be getting a problem with whole-row references:
> 
> # SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on
> e.person_id = p.id inner join countries c on p.country_id = c.id;
> ERROR:  table "r" has 3 columns available but 4 columns specified
> CONTEXT:  Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT id,
> country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name,
> country_id FROM public.people) r (a_0, a_1, a_2, a_3)  ON ((r.a_3 = l.a_0))
> 
> 
> And the error message could be somewhat confusing.  This mentions table "r", but
> there's no such table or alias in my actual query.
> 
> 
> 
> Another issue:
> 
> # EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN
> countries ON people.country_id = countries.id LIMIT 3) x;
> ERROR:  could not open relation with OID 0
> 
> 
> --
> 
> Thom

Re: Join push-down support for foreign tables

От

Thom Brown

Дата:

02 марта 2015 г., 14:36:43

On 2 March 2015 at 14:07, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

> I seem to be getting a problem with whole-row references:
>
> # SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on
> e.person_id = p.id inner join countries c on p.country_id = c.id;
> ERROR: table "r" has 3 columns available but 4 columns specified
> CONTEXT: Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT id,
> country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name,
> country_id FROM public.people) r (a_0, a_1, a_2, a_3) ON ((r.a_3 = l.a_0))
>
In this case, the 4th target-entry should be "l", not l.a_1.

This will no doubt be my naivety talking, but if we know we need the whole row, can we not request the row without additionally requesting individual columns?

> And the error message could be somewhat confusing. This mentions table "r", but
> there's no such table or alias in my actual query.
>
However, do we have a mechanical/simple way to distinguish the cases when
we need relation alias from the case when we don't need it?
Like a self-join cases, we has to construct a remote query even if same
table is referenced multiple times in a query. Do you have a good idea?

Then perhaps all that's really needed here is to clarify that the error pertains to the remote execution plan rather than the query crafted by the user. Or maybe I'm nitpicking.

Thom

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

03 марта 2015 г., 01:56:06

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Thom Brown
> Sent: Monday, March 02, 2015 11:36 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Shigeru Hanada; Robert Haas; PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> On 2 March 2015 at 14:07, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> 
> 
>     > I seem to be getting a problem with whole-row references:
>     >
>     > # SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people
> p on
>     > e.person_id = p.id inner join countries c on p.country_id = c.id;
>     > ERROR:  table "r" has 3 columns available but 4 columns specified
>     > CONTEXT:  Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM
> (SELECT id,
>     > country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name,
>     > country_id FROM public.people) r (a_0, a_1, a_2, a_3)  ON ((r.a_3 =
> l.a_0))
>     >
>     In this case, the 4th target-entry should be "l", not l.a_1.
> 
> This will no doubt be my naivety talking, but if we know we need the whole row,
> can we not request the row without additionally requesting individual columns?
>
I doubt whether local construction of whole-row reference is more efficient
than redundant copy. Tuple deform/projection is not a work we can ignore its
cost from my experience. Even if redundant column is copied over the network,
local CPU can skip tuple modification to fit remote query results for local
expectation. (NOTE: FDW driver is responsible to return a tuple according to
the tts_tupleDescriptor, so we don't need to transform it if target-list of
remote query is identical.)

>     > And the error message could be somewhat confusing.  This mentions table
> "r", but
>     > there's no such table or alias in my actual query.
>     >
>     However, do we have a mechanical/simple way to distinguish the cases when
>     we need relation alias from the case when we don't need it?
>     Like a self-join cases, we has to construct a remote query even if same
>     table is referenced multiple times in a query. Do you have a good idea?
> 
> Then perhaps all that's really needed here is to clarify that the error pertains
> to the remote execution plan rather than the query crafted by the user.  Or maybe
> I'm nitpicking.
>
I could understand your concern about this version. However, it is a cosmetic
feature that we can fix up later, and what we should focus on at this moment
is to ensure the design concept; that run foreign/custom-scan instead of built-
in join node and they performs as like a usual scan on materialized relations.
So, it is not a good idea to make Hanada-san improve this feature _at this moment_.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

03 марта 2015 г., 08:13:59

Thanks for reviewing my patch.

2015-03-02 22:50 GMT+09:00 Thom Brown <thom@linux.com>:
> I seem to be getting a problem with whole-row references:
>
> # SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on
> e.person_id = p.id inner join countries c on p.country_id = c.id;
> ERROR:  table "r" has 3 columns available but 4 columns specified
> CONTEXT:  Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT
> id, country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name,
> country_id FROM public.people) r (a_0, a_1, a_2, a_3)  ON ((r.a_3 = l.a_0))
>
> And the error message could be somewhat confusing.  This mentions table "r",
> but there's no such table or alias in my actual query.

Your concern would not be limited to the feature I'm proposing,
because fdw_options about object name also introduce such mismatch
between the query user constructed and another one which postgres_fdw
constructed for remote execution.  Currently we put CONTEXT line for
such purpose, but it might hard to understand soon only from error
messages.  So I'd like to add section about query debugging for
postgres_fdw.

> Another issue:
>
> # EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN
> countries ON people.country_id = countries.id LIMIT 3) x;
> ERROR:  could not open relation with OID 0

Good catch.  In my quick trial, removing LIMIT3 avoids this error.
I'll check it right now.


-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

03 марта 2015 г., 08:16:08

2015-03-02 23:07 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> I seem to be getting a problem with whole-row references:
>>
>> # SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people p on
>> e.person_id = p.id inner join countries c on p.country_id = c.id;
>> ERROR:  table "r" has 3 columns available but 4 columns specified
>> CONTEXT:  Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM (SELECT id,
>> country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name,
>> country_id FROM public.people) r (a_0, a_1, a_2, a_3)  ON ((r.a_3 = l.a_0))
>>
> In this case, the 4th target-entry should be "l", not l.a_1.

Actually.  I fixed that part.

>> And the error message could be somewhat confusing.  This mentions table "r", but
>> there's no such table or alias in my actual query.
>>
> However, do we have a mechanical/simple way to distinguish the cases when
> we need relation alias from the case when we don't need it?
> Like a self-join cases, we has to construct a remote query even if same
> table is referenced multiple times in a query. Do you have a good idea?

I'd like to vote for keeping current aliasing style, use "l" and "r"
for join source relations, and use a_0, a_1, ... for each column of
them.

-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

03 марта 2015 г., 09:04:29

Hanada-san,

I checked the patch, below is the random comments from my side.

* Context variables
-------------------
Sorry, I might give you a wrong suggestion.
The foreign_glob_cxt and deparse_expr_cxt were re-defined as follows:

typedef struct foreign_glob_cxt{       PlannerInfo *root;                      /* global planner state */
-       RelOptInfo *foreignrel;         /* the foreign relation we are planning
+       RelOptInfo *outerrel;           /* the foreign relation, or outer child
+       RelOptInfo *innerrel;           /* inner child, only set for join */} foreign_glob_cxt;
/*
@@ -86,9 +89,12 @@ typedef struct foreign_loc_cxttypedef struct deparse_expr_cxt{       PlannerInfo *root;
       /* global planner state */
 
-       RelOptInfo *foreignrel;         /* the foreign relation we are planning
+       RelOptInfo *outerrel;           /* the foreign relation, or outer child
+       RelOptInfo *innerrel;           /* inner child, only set for join */       StringInfo      buf;
  /* output buffer to append to */       List      **params_list;        /* exprs that will become remote Params
 
+       ForeignScan *outerplan;         /* outer child's ForeignScan node */
+       ForeignScan *innerplan;         /* inner child's ForeignScan node */} deparse_expr_cxt;

However, the outerrel does not need to have double-meaning.

RelOptInfo->reloptkind gives us information whether the target
relation is base-relation or join-relation.
So, foreign_expr_walker() can be implemented as follows:   if (bms_is_member(var->varno, glob_cxt->foreignrel->relids)
&&      var->varlevelsup == 0)   {           :
 
also, deparseVar() can checks relation type using:   if (context->foreignrel->reloptkind == RELOPT_JOINREL)   {
deparseJoinVar(...);

In addition, what we need in deparse_expr_cxt are target-list of
outer and inner relation in deparse_expr_cxt.
How about to add inner_tlist/outer_tlist instead of innerplan and
outerplan in deparse_expr_cxt?
The deparseJoinVar() references these fields, but only targetlist.



* GetForeignJoinPath method of FDW
----------------------------------
It should be portion of the interface patch, so I added these
enhancement of FDW APIs with documentation updates.
Please see the newer version of foreign/custom-join interface patch.



* enable_foreignjoin parameter
------------------------------
I'm uncertain whether we should have this GUC parameter that affects
to all FDW implementation. Rather than a built-in one, my preference
is an individual GUC variable defined with DefineCustomBoolVariable(),
by postgres_fdw.
Pros: It offers user more flexible configuration.
Cons: Each FDW has to implement this GUC by itself?



* syntax violated query if empty targetlist
-------------------------------------------
At least NULL shall be injected if no columns are referenced.
Also, add a dummy entry to fdw_ps_tlist to fit slot tuple descriptor.

postgres=# explain verbose select NULL from ft1,ft2 where aid=bid;                        QUERY PLAN
---------------------------------------------------------------------------Foreign Scan  (cost=100.00..129.25 rows=2925
width=0) Output: NULL::unknown  Remote SQL: SELECT  FROM (SELECT bid, NULL FROM public.t2) l (a_0, a_1) INNERJOIN
(SELECTaid, NULL FROM public.t1) r (a_0, a_1)  ON ((r.a_0 = l.a_0))
 



* Bug reported by Thom Brown
-----------------------------
# EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id =
countries.idLIMIT 3) x;
 
ERROR:  could not open relation with OID 0

Sorry, it was a problem caused by my portion. The patched setrefs.c
checks fdw_/custom_ps_tlist to determine whether Foreign/CustomScan
node is associated with a certain base relation. If *_ps_tlist is
valid, it also expects scanrelid == 0.
However, things we should check is incorrect. We may have a case
with empty *_ps_tlist if remote join expects no columns.
So, I adjusted the condition to check scanrelid instead.


* make_tuple_from_result_row() call
------------------------------------
The 4th argument (newly added) is referenced without NULL checks,
however, store_returning_result() and analyze_row_processor() put
NULL on this argument, then it leads segmentation fault.
RelationGetDescr(fmstate->rel or astate->rel) should be given on
the caller.

* regression test crash
------------------------
The query below gets crashed: UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT FROM ft1 WHERE
ft1.c1= ft2.c2 AND ft1.c1 % 10 = 9;
 

According to the crash dump, tidin() got a cstring input with
unexpected format. I guess column number is incorrectly assigned,
but have no clear scenario at this moment.

#0  0x00007f9b45a11513 in conversion_error_callback (arg=0x7fffc257ecc0)   at postgres_fdw.c:3293
#1  0x00000000008e51d6 in errfinish (dummy=0) at elog.c:436
#2  0x00000000008935cd in tidin (fcinfo=0x7fffc257e8a0) at tid.c:69
/home/kaigai/repo/sepgsql/src/backend/utils/adt/tid.c:69:1734:beg:0x8935cd
(gdb) p str
$6 = 0x1d17cf7 "foo"

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: Shigeru Hanada [mailto:shigeru.hanada@gmail.com]
> Sent: Monday, March 02, 2015 9:48 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Robert Haas; PostgreSQL-development
> Subject: ##freemail## Re: [HACKERS] Join push-down support for foreign tables
> 
> Attached is the revised/rebased version of the $SUBJECT.
> 
> This patch is based on Kaigai-san's custom/foreign join patch, so
> please apply it before this patch.  In this version I changed some
> points from original postgres_fdw.
> 
> 1) Disabled SELECT clause optimization
> ~9.4 postgres_fdw lists only columns actually used in SELECT clause,
> but AFAIS it makes SQL generation complex.  So I disabled such
> optimization and put "NULL" for unnecessary columns in SELECT clause
> of remote query.
> 
> 2) Extended deparse context
> To allow deparsing based on multiple source relations, I added some
> members to context structure.  They are unnecessary for simple query
> with single foreign table, but IMO it should be integrated.
> 
> With Kaigai-san's advise, changes for supporting foreign join on
> postgres_fdw is minimized into postgres_fdw itself.  But I added new
> FDW API named GetForeignJoinPaths() to keep the policy that all
> interface between core and FDW should be in FdwRoutine, instead of
> using hook function.  Now I'm writing document about it, and will post
> it in a day.
> 
> 2015-02-19 16:19 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> > 2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
> >> Let me put some comments in addition to where you're checking now.
> >>
> >> [design issues]
> >> * Cost estimation
> >> Estimation and evaluation of cost for remote join query is not an
> >> obvious issue. In principle, local side cannot determine the cost
> >> to run remote join without remote EXPLAIN, because local side has
> >> no information about JOIN logic applied on the remote side.
> >> Probably, we have to put an assumption for remote join algorithm,
> >> because local planner has no idea about remote planner's choice
> >> unless foreign-join don't take "use_remote_estimate".
> >> I think, it is reasonable assumption (even if it is incorrect) to
> >> calculate remote join cost based on local hash-join algorithm.
> >> If user wants more correct estimation, remote EXPLAIN will make
> >> more reliable cost estimation.
> >
> > Hm, I guess that you chose hash-join as "least-costed join".  In the
> > pgbench model, most combination between two tables generate hash join
> > as cheapest path.  Remote EXPLAIN is very expensive in the context of
> > planning, so it would easily make the plan optimization meaningless.
> > But giving an option to users is good, I agree.
> >
> >>
> >> It also needs a consensus whether cost for remote CPU execution is
> >> equivalent to local CPU. If we think local CPU is rare resource
> >> than remote one, a discount rate will make planner more preferable
> >> to choose remote join than local one
> >
> > Something like cpu_cost_ratio as a new server-level FDW option?
> >
> >>
> >> Once we assume a join algorithm for remote join, unit cost for
> >> remote CPU, we can calculate a cost for foreign join based on
> >> the local join logic plus cost for network translation (maybe
> >> fdw_tuple_cost?).
> >
> > Yes, sum of these costs is the total cost of a remote join.
> >     o fdw_startup_cost
> >     o hash-join cost, estimated as a local join
> >     o fdw_tuple_cost * rows * width
> >
> >> * FDW options
> >> Unlike table scan, FDW options we should refer is unclear.
> >> Table level FDW options are associated with a foreign table as
> >> literal. I think we have two options here:
> >> 1. Foreign-join refers FDW options for foreign-server, but ones
> >>    for foreign-tables are ignored.
> >> 2. Foreign-join is prohibited when both of relations don't have
> >>    identical FDW options.
> >> My preference is 2. Even though N-way foreign join, it ensures
> >> all the tables involved with (N-1)-way foreign join has identical
> >> FDW options, thus it leads we can make N-way foreign join with
> >> all identical FDW options.
> >> One exception is "updatable" flag of postgres_fdw. It does not
> >> make sense on remote join, so I think mixture of updatable and
> >> non-updatable foreign tables should be admitted, however, it is
> >> a decision by FDW driver.
> >>
> >> Probably, above points need to take time for getting consensus.
> >> I'd like to see your opinion prior to editing your patch.
> >
> > postgres_fdw can't push down a join which contains foreign tables on
> > multiple servers, so use_remote_estimate and fdw_startup_cost  are the
> > only FDW options to consider.  So we have options for each option.
> >
> > 1-a. If all foreign tables in the join has identical
> > use_remote_estimate, allow pushing down.
> > 1-b. If any of foreign table in the join has true as
> > use_remote_estimate, use remote estimate.
> >
> > 2-a. If all foreign tables in the join has identical fdw_startup_cost,
> > allow pushing down.
> > 2-b. Always use max value in the join. (cost would be more expensive)
> > 2-c. Always use min value in the join.  (cost would be cheaper)
> >
> > I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have
> > reasonable cost about startup.
> >
> > I agree about "updatable" option.
> >
> >>
> >> [implementation issues]
> >> The interface does not intend to add new Path/Plan type for each scan
> >> that replaces foreign joins. What postgres_fdw should do is, adding
> >> ForeignPath towards a particular joinrel, then it populates ForeignScan
> >> with remote join query once it got chosen by the planner.
> >
> > That idea is interesting, and make many things simpler.  Please let me consider.
> >
> >>
> >> A few functions added in src/backend/foreign/foreign.c are not
> >> called by anywhere, at this moment.
> >>
> >> create_plan_recurse() is reverted to static. It is needed for custom-
> >> join enhancement, if no other infrastructure can support.
> >
> > I made it back to static because I thought that create_plan_recurse
> > can be called by core before giving control to FDWs.  But I'm not sure
> > it can be applied to custom scans.  I'll recheck that part.
> >
> >
> > --
> > Shigeru HANADA
> 
> 
> 
> --
> Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

03 марта 2015 г., 10:07:51

Thanks for the detailed comments.

2015-03-03 18:01 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
> Hanada-san,
>
> I checked the patch, below is the random comments from my side.
>
> * Context variables
> -------------------
> Sorry, I might give you a wrong suggestion.
> The foreign_glob_cxt and deparse_expr_cxt were re-defined as follows:
>
> typedef struct foreign_glob_cxt
>  {
>         PlannerInfo *root;                      /* global planner state */
> -       RelOptInfo *foreignrel;         /* the foreign relation we are planning
> +       RelOptInfo *outerrel;           /* the foreign relation, or outer child
> +       RelOptInfo *innerrel;           /* inner child, only set for join */
>  } foreign_glob_cxt;
>
>  /*
> @@ -86,9 +89,12 @@ typedef struct foreign_loc_cxt
>  typedef struct deparse_expr_cxt
>  {
>         PlannerInfo *root;                      /* global planner state */
> -       RelOptInfo *foreignrel;         /* the foreign relation we are planning
> +       RelOptInfo *outerrel;           /* the foreign relation, or outer child
> +       RelOptInfo *innerrel;           /* inner child, only set for join */
>         StringInfo      buf;                    /* output buffer to append to */
>         List      **params_list;        /* exprs that will become remote Params
> +       ForeignScan *outerplan;         /* outer child's ForeignScan node */
> +       ForeignScan *innerplan;         /* inner child's ForeignScan node */
>  } deparse_expr_cxt;
>
> However, the outerrel does not need to have double-meaning.
>
> RelOptInfo->reloptkind gives us information whether the target
> relation is base-relation or join-relation.
> So, foreign_expr_walker() can be implemented as follows:
>     if (bms_is_member(var->varno, glob_cxt->foreignrel->relids) &&
>         var->varlevelsup == 0)
>     {
>             :
> also, deparseVar() can checks relation type using:
>     if (context->foreignrel->reloptkind == RELOPT_JOINREL)
>     {
>         deparseJoinVar(...);
>
> In addition, what we need in deparse_expr_cxt are target-list of
> outer and inner relation in deparse_expr_cxt.
> How about to add inner_tlist/outer_tlist instead of innerplan and
> outerplan in deparse_expr_cxt?
> The deparseJoinVar() references these fields, but only target list.

Ah, I've totally misunderstood your suggestion.  Now I reverted my
changes and use target lists to know whether the var came from either
of the relations.


> * GetForeignJoinPath method of FDW
> ----------------------------------
> It should be portion of the interface patch, so I added these
> enhancement of FDW APIs with documentation updates.
> Please see the newer version of foreign/custom-join interface patch.

Agreed.

> * enable_foreignjoin parameter
> ------------------------------
> I'm uncertain whether we should have this GUC parameter that affects
> to all FDW implementation. Rather than a built-in one, my preference
> is an individual GUC variable defined with DefineCustomBoolVariable(),
> by postgres_fdw.
> Pros: It offers user more flexible configuration.
> Cons: Each FDW has to implement this GUC by itself?

Hum...
In a sense, I added this GUC parameter for debugging purpose.  As you
pointed out, users might want to control join push-down feature
per-FDW.  I'd like to hear others' opinion.


> * syntax violated query if empty targetlist
> -------------------------------------------
> At least NULL shall be injected if no columns are referenced.
> Also, add a dummy entry to fdw_ps_tlist to fit slot tuple descriptor.
>
> postgres=# explain verbose select NULL from ft1,ft2 where aid=bid;
>                          QUERY PLAN
> ---------------------------------------------------------------------------
>  Foreign Scan  (cost=100.00..129.25 rows=2925 width=0)
>    Output: NULL::unknown
>    Remote SQL: SELECT  FROM (SELECT bid, NULL FROM public.t2) l (a_0, a_1) INNER
>  JOIN (SELECT aid, NULL FROM public.t1) r (a_0, a_1)  ON ((r.a_0 = l.a_0))

Fixed.

> * Bug reported by Thom Brown
> -----------------------------
> # EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN countries ON people.country_id =
countries.idLIMIT 3) x;
 
> ERROR:  could not open relation with OID 0
>
> Sorry, it was a problem caused by my portion. The patched setrefs.c
> checks fdw_/custom_ps_tlist to determine whether Foreign/CustomScan
> node is associated with a certain base relation. If *_ps_tlist is
> valid, it also expects scanrelid == 0.
> However, things we should check is incorrect. We may have a case
> with empty *_ps_tlist if remote join expects no columns.
> So, I adjusted the condition to check scanrelid instead.

Is this issue fixed by v5 custom/foreign join patch?

> * make_tuple_from_result_row() call
> ------------------------------------
> The 4th argument (newly added) is referenced without NULL checks,
> however, store_returning_result() and analyze_row_processor() put
> NULL on this argument, then it leads segmentation fault.
> RelationGetDescr(fmstate->rel or astate->rel) should be given on
> the caller.

Fixed.

>
> * regression test crash
> ------------------------
> The query below gets crashed:
>   UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
>   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
>
> According to the crash dump, tidin() got a cstring input with
> unexpected format. I guess column number is incorrectly assigned,
> but have no clear scenario at this moment.
>
> #0  0x00007f9b45a11513 in conversion_error_callback (arg=0x7fffc257ecc0)
>     at postgres_fdw.c:3293
> #1  0x00000000008e51d6 in errfinish (dummy=0) at elog.c:436
> #2  0x00000000008935cd in tidin (fcinfo=0x7fffc257e8a0) at tid.c:69
> /home/kaigai/repo/sepgsql/src/backend/utils/adt/tid.c:69:1734:beg:0x8935cd
> (gdb) p str
> $6 = 0x1d17cf7 "foo"

Join push-down underlying UPDATE or DELETE requires ctid as its
output, but it seems not fully supported.  I'm fixing this issue now.

Regards,
-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

03 марта 2015 г., 10:48:35

> > * Bug reported by Thom Brown
> > -----------------------------
> > # EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN
> countries ON people.country_id = countries.id LIMIT 3) x;
> > ERROR:  could not open relation with OID 0
> >
> > Sorry, it was a problem caused by my portion. The patched setrefs.c
> > checks fdw_/custom_ps_tlist to determine whether Foreign/CustomScan
> > node is associated with a certain base relation. If *_ps_tlist is
> > valid, it also expects scanrelid == 0.
> > However, things we should check is incorrect. We may have a case
> > with empty *_ps_tlist if remote join expects no columns.
> > So, I adjusted the condition to check scanrelid instead.
> 
> Is this issue fixed by v5 custom/foreign join patch?
>
Yes, please rebase it.

--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

03 марта 2015 г., 12:34:53

I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
v6 patch.  I posted some comments to v6 patch in this post:

http://www.postgresql.org/message-id/CAEZqfEcNvjqq-P=jxnW1Pb4T9wvpcPoRCN7G6cc46JGuB7dY8w@mail.gmail.com

Before applying my v3 patch, please apply Kaigai-san's v6 patch and my
mod_cjv6.patch.
Sorry for complex patch combination.  Those patches will be arranged
soon by Kaigai-san and me.

I fixed the issues pointed out by Thom and Kohei, but still the patch
has an issue about joins underlying UPDATE or DELETE.  Now I'm working
on fixing this issue.  Besides this issue, existing regression test
passed.

2015-03-03 19:48 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> > * Bug reported by Thom Brown
>> > -----------------------------
>> > # EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN
>> countries ON people.country_id = countries.id LIMIT 3) x;
>> > ERROR:  could not open relation with OID 0
>> >
>> > Sorry, it was a problem caused by my portion. The patched setrefs.c
>> > checks fdw_/custom_ps_tlist to determine whether Foreign/CustomScan
>> > node is associated with a certain base relation. If *_ps_tlist is
>> > valid, it also expects scanrelid == 0.
>> > However, things we should check is incorrect. We may have a case
>> > with empty *_ps_tlist if remote join expects no columns.
>> > So, I adjusted the condition to check scanrelid instead.
>>
>> Is this issue fixed by v5 custom/foreign join patch?
>>
> Yes, please rebase it.
>
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai@ak.jp.nec.com>
>



--
Shigeru HANADA

Вложения

foreign_join_v3.patch

Re: Join push-down support for foreign tables

От

Thom Brown

Дата:

03 марта 2015 г., 13:11:41

On 3 March 2015 at 12:34, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:
> I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
> v6 patch.  I posted some comments to v6 patch in this post:
>
> http://www.postgresql.org/message-id/CAEZqfEcNvjqq-P=jxnW1Pb4T9wvpcPoRCN7G6cc46JGuB7dY8w@mail.gmail.com
>
> Before applying my v3 patch, please apply Kaigai-san's v6 patch and my
> mod_cjv6.patch.
> Sorry for complex patch combination.  Those patches will be arranged
> soon by Kaigai-san and me.
>
> I fixed the issues pointed out by Thom and Kohei, but still the patch
> has an issue about joins underlying UPDATE or DELETE.  Now I'm working
> on fixing this issue.  Besides this issue, existing regression test
> passed.

Re-tested the broken query and it works for me now.

-- 
Thom

Re: Join push-down support for foreign tables

От

Etsuro Fujita

Дата:

04 марта 2015 г., 08:00:55

On 2015/03/03 21:34, Shigeru Hanada wrote:
> I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
> v6 patch.

Thanks for the work, Hanada-san and KaiGai-san!

Maybe I'm missing something, but did we agree to take this approach, ie, 
"join push-down" on top of custom join?  There is a comment ahout that 
[1].  I just thought it'd be better to achieve a consensus before 
implementing the feature further.

> but still the patch
> has an issue about joins underlying UPDATE or DELETE.  Now I'm working
> on fixing this issue.

Is that something like "UPDATE foo ... FROM bar ..." where both foo and 
bar are remote?  If so, I think it'd be better to push such an update 
down to the remote, as discussed in [2], and I'd like to work on that 
together!

Sorry for having been late for the party.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/23343.1418658355@sss.pgh.pa.us
[2] http://www.postgresql.org/message-id/31942.1410534785@sss.pgh.pa.us

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

04 марта 2015 г., 08:32:18

> On 2015/03/03 21:34, Shigeru Hanada wrote:
> > I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
> > v6 patch.
> 
> Thanks for the work, Hanada-san and KaiGai-san!
> 
> Maybe I'm missing something, but did we agree to take this approach, ie,
> "join push-down" on top of custom join?  There is a comment ahout that
> [1].  I just thought it'd be better to achieve a consensus before
> implementing the feature further.
> 
It is not correct. The join push-down feature is not implemented
on top of the custom-join feature, however, both of them are 99%
similar on both of the concept and implementation.
So, we're working to enhance foreign/custom-join interface together,
according to Robert's suggestion [3], using postgres_fdw extension
as a minimum worthwhile example for both of foreign/custom-scan.

[3] http://bit.ly/1w1PoDU

> > but still the patch
> > has an issue about joins underlying UPDATE or DELETE.  Now I'm working
> > on fixing this issue.
> 
> Is that something like "UPDATE foo ... FROM bar ..." where both foo and
> bar are remote?  If so, I think it'd be better to push such an update
> down to the remote, as discussed in [2], and I'd like to work on that
> together!
>
Hanada-san, could you give us test query to reproduce the problem
above? I and Fujita-san can help to investigate the problem from
different standpoints for each.

> Sorry for having been late for the party.
> 
We are still in the party.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Join push-down support for foreign tables

От

Etsuro Fujita

Дата:

04 марта 2015 г., 08:46:26

On 2015/03/04 17:31, Kouhei Kaigai wrote:
>> On 2015/03/03 21:34, Shigeru Hanada wrote:
>>> I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
>>> v6 patch.

>> Maybe I'm missing something, but did we agree to take this approach, ie,
>> "join push-down" on top of custom join?  There is a comment ahout that
>> [1].  I just thought it'd be better to achieve a consensus before
>> implementing the feature further.

> It is not correct. The join push-down feature is not implemented
> on top of the custom-join feature, however, both of them are 99%
> similar on both of the concept and implementation.
> So, we're working to enhance foreign/custom-join interface together,
> according to Robert's suggestion [3], using postgres_fdw extension
> as a minimum worthwhile example for both of foreign/custom-scan.

OK, thanks for the explanation!

>>> but still the patch
>>> has an issue about joins underlying UPDATE or DELETE.  Now I'm working
>>> on fixing this issue.

>> Is that something like "UPDATE foo ... FROM bar ..." where both foo and
>> bar are remote?  If so, I think it'd be better to push such an update
>> down to the remote, as discussed in [2], and I'd like to work on that
>> together!

> Hanada-san, could you give us test query to reproduce the problem
> above? I and Fujita-san can help to investigate the problem from
> different standpoints for each.

Yeah, will do.

Best regards,
Etsuro Fujita

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

04 марта 2015 г., 08:57:27

2015-03-04 17:00 GMT+09:00 Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>:
> On 2015/03/03 21:34, Shigeru Hanada wrote:
>>
>> I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
>> v6 patch.
>
>
> Thanks for the work, Hanada-san and KaiGai-san!
>
> Maybe I'm missing something, but did we agree to take this approach, ie,
> "join push-down" on top of custom join?  There is a comment ahout that [1].
> I just thought it'd be better to achieve a consensus before implementing the
> feature further.

As Kaigai-san says, foreign join push-down is beside custom scan, and
they are on the custom/foreign join api patch.

>
>> but still the patch
>> has an issue about joins underlying UPDATE or DELETE.  Now I'm working
>> on fixing this issue.
>
>
> Is that something like "UPDATE foo ... FROM bar ..." where both foo and bar
> are remote?  If so, I think it'd be better to push such an update down to
> the remote, as discussed in [2], and I'd like to work on that together!

A part of it, perhaps.  But at the moment I see many issues to solve
around pushing down complex UPDATE/DELETE.  So I once tightened the
restriction, that joins between foreign tables are pushed down only if
they are part of SELECT statement.  Please see next v4 patch I'll post
soon.

>
> Sorry for having been late for the party.
>
> Best regards,
> Etsuro Fujita
>
> [1] http://www.postgresql.org/message-id/23343.1418658355@sss.pgh.pa.us
> [2] http://www.postgresql.org/message-id/31942.1410534785@sss.pgh.pa.us



-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Etsuro Fujita

Дата:

04 марта 2015 г., 09:26:21

On 2015/03/04 17:57, Shigeru Hanada wrote:
> 2015-03-04 17:00 GMT+09:00 Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>:
>> On 2015/03/03 21:34, Shigeru Hanada wrote:
>>> I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
>>> v6 patch.

>>> but still the patch
>>> has an issue about joins underlying UPDATE or DELETE.  Now I'm working
>>> on fixing this issue.

>> Is that something like "UPDATE foo ... FROM bar ..." where both foo and bar
>> are remote?  If so, I think it'd be better to push such an update down to
>> the remote, as discussed in [2], and I'd like to work on that together!

> A part of it, perhaps.  But at the moment I see many issues to solve
> around pushing down complex UPDATE/DELETE.  So I once tightened the
> restriction, that joins between foreign tables are pushed down only if
> they are part of SELECT statement.  Please see next v4 patch I'll post
> soon.

OK, thanks for the reply!

Best regards,
Etsuro Fujita

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

04 марта 2015 г., 09:26:37

Here is v4 patch of Join push-down support for foreign tables.  This
patch requires Custom/Foreign join patch v7 posted by Kaigai-san.

In this version I added check about query type which gives up pushing
down joins when the join is a part of an underlying query of
UPDATE/DELETE.



As of now postgres_fdw builds a proper remote query but it can't bring
ctid value up to postgresExecForeignUpdate()...

I'm still working on supporting such query, but I'm not sure that
supporting UPDATE/DELETE is required in the first version.  I attached
a patch foreign_join_update.patch to sure WIP for supporting
update/delete as top of foreign joins.

How to reproduce the error, please execute query below after running
attached init_fdw.sql for building test environment.  Note that the
script drops "user1", and creates database "fdw" and "pgbench".

fdw=# explain (verbose) update pgbench_branches b set filler = 'foo'
from pgbench_tellers t where t.bid = b.bid and t.tid < 10;

                                                                 QUERY
PLAN


----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
 Update on public.pgbench_branches b  (cost=100.00..100.67 rows=67 width=390)
   Remote SQL: UPDATE public.pgbench_branches SET filler = $2 WHERE ctid = $1
   ->  Foreign Scan  (cost=100.00..100.67 rows=67 width=390)
         Output: b.bid, b.bbalance, 'foo

'::character(88), b.ctid, *
         Remote SQL: SELECT r.a_0, r.a_1, r.a_2, l FROM (SELECT tid,
bid, tbalance, filler FROM public.pgbench_tellers WHERE ((tid < 10)))
l (a_0, a_1) INNER JOIN (SELECT b
id, bbalance, NULL, ctid FROM public.pgbench_branches FOR UPDATE) r
(a_0, a_1, a_2, a_3)  ON ((r.a_0 = l.a_1))
(5 rows)
fdw=# explain (analyze, verbose) update pgbench_branches b set filler
= 'foo' from pgbench_tellers t where t.bid = b.bid and t.tid < 10;
ERROR:  ctid is NULL


2015-03-03 21:34 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> I rebased "join push-down" patch onto Kaigai-san's Custom/Foreign Join
> v6 patch.  I posted some comments to v6 patch in this post:
>
> http://www.postgresql.org/message-id/CAEZqfEcNvjqq-P=jxnW1Pb4T9wvpcPoRCN7G6cc46JGuB7dY8w@mail.gmail.com
>
> Before applying my v3 patch, please apply Kaigai-san's v6 patch and my
> mod_cjv6.patch.
> Sorry for complex patch combination.  Those patches will be arranged
> soon by Kaigai-san and me.
>
> I fixed the issues pointed out by Thom and Kohei, but still the patch
> has an issue about joins underlying UPDATE or DELETE.  Now I'm working
> on fixing this issue.  Besides this issue, existing regression test
> passed.
>
> 2015-03-03 19:48 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>>> > * Bug reported by Thom Brown
>>> > -----------------------------
>>> > # EXPLAIN VERBOSE SELECT NULL FROM (SELECT people.id FROM people INNER JOIN
>>> countries ON people.country_id = countries.id LIMIT 3) x;
>>> > ERROR:  could not open relation with OID 0
>>> >
>>> > Sorry, it was a problem caused by my portion. The patched setrefs.c
>>> > checks fdw_/custom_ps_tlist to determine whether Foreign/CustomScan
>>> > node is associated with a certain base relation. If *_ps_tlist is
>>> > valid, it also expects scanrelid == 0.
>>> > However, things we should check is incorrect. We may have a case
>>> > with empty *_ps_tlist if remote join expects no columns.
>>> > So, I adjusted the condition to check scanrelid instead.
>>>
>>> Is this issue fixed by v5 custom/foreign join patch?
>>>
>> Yes, please rebase it.
>>
>> --
>> NEC OSS Promotion Center / PG-Strom Project
>> KaiGai Kohei <kaigai@ak.jp.nec.com>
>>
>
>
>
> --
> Shigeru HANADA



--
Shigeru HANADA

Вложения

Re: Join push-down support for foreign tables

От

Ashutosh Bapat

Дата:

04 марта 2015 г., 10:17:14

Hi Hanada-san,

I am looking at the patch. Here are my comments

In create_foreignscan_path() we have lines like -

1587 pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
1588 required_outer);

Now, that the same function is being used for creating foreign scan paths for joins, we should be calling get_joinrel_parampathinfo() on a join rel and get_baserel_parampathinfo() on base rel.

The patch seems to handle all the restriction clauses in the same way. There are two kinds of restriction clauses - a. join quals (specified using ON clause; optimizer might move them to the other class if that doesn't affect correctness) and b. quals on join relation (specified in the WHERE clause, optimizer might move them to the other class if that doesn't affect correctness). The quals in "a" are applied while the join is being computed whereas those in "b" are applied after the join is computed. For example,
postgres=# select * from lt;
val | val2
-----+------
   1 |    2
   1 |    3
(2 rows)

postgres=# select * from lt2;
val | val2
-----+------
   1 |    2
(1 row)

postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
val | val2 | val | val2
-----+------+-----+------
   1 |    2 |   1 |    2
   1 |    3 |     |
(2 rows)

postgres=# select * from lt left join lt2 on (true) where (lt.val2 = lt2.val2);
val | val2 | val | val2
-----+------+-----+------
   1 |    2 |   1 |    2
(1 row)

The difference between these two kinds is evident in case of outer joins, for inner join optimizer puts all of them in class "b". The remote query sent to the foreign server has all those in ON clause. Consider foreign tables ft1 and ft2 pointing to local tables on the same server.
postgres=# \d ft1
         Foreign table "public.ft1"
Column | Type   | Modifiers | FDW Options
--------+---------+-----------+-------------
val    | integer |           |
val2   | integer |           |
Server: loopback
FDW Options: (table_name 'lt')

postgres=# \d ft2
         Foreign table "public.ft2"
Column | Type   | Modifiers | FDW Options
--------+---------+-----------+-------------
val    | integer |           |
val2   | integer |           |
Server: loopback
FDW Options: (table_name 'lt2')

postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
                                                                                                                QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..125.60 rows=2560 width=16)
   Output: val, val2, val, val2
   Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a
_0, a_1) ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 = l.a_1))
(3 rows)

The result is then wrong
postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
val | val2 | val | val2
-----+------+-----+------
   1 |    2 |     |
   1 |    3 |     |
(2 rows)

which should match the result obtained by substituting local tables for foreign ones
postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where lt.val + lt2.val > lt.val2 or lt2.val is null;
val | val2 | val | val2
-----+------+-----+------
   1 |    3 |     |
(1 row)

Once we start distinguishing the two kinds of quals, there is some optimization possible. For pushing down a join it's essential that all the quals in "a" are safe to be pushed down. But a join can be pushed down, even if quals in "a" are not safe to be pushed down. But more clauses one pushed down to foreign server, lesser are the rows fetched from the foreign server. In postgresGetForeignJoinPath, instead of checking all the restriction clauses to be safe to be pushed down, we need to check only those which are join quals (class "a").

Following EXPLAIN output seems to be confusing

ft1 and ft2 both are pointing to same lt on a foreign server.
postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where ft1.val + ft1.val2 = ft2.val;
                                                                                   QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------
Foreign Scan (cost=100.00..132.00 rows=2560 width=8)
   Output: (val + val2)
   Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1) ON ((
(r.a_0 + r.a_1) = l.a_0))

Output just specified val + val2, it doesn't tell, where those val and val2 come from, neither it's evident from the rest of the context.

On Mon, Mar 2, 2015 at 6:18 PM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:

Attached is the revised/rebased version of the $SUBJECT.

This patch is based on Kaigai-san's custom/foreign join patch, so
please apply it before this patch. In this version I changed some
points from original postgres_fdw.

1) Disabled SELECT clause optimization
~9.4 postgres_fdw lists only columns actually used in SELECT clause,
but AFAIS it makes SQL generation complex. So I disabled such
optimization and put "NULL" for unnecessary columns in SELECT clause
of remote query.

2) Extended deparse context
To allow deparsing based on multiple source relations, I added some
members to context structure. They are unnecessary for simple query
with single foreign table, but IMO it should be integrated.

With Kaigai-san's advise, changes for supporting foreign join on
postgres_fdw is minimized into postgres_fdw itself. But I added new
FDW API named GetForeignJoinPaths() to keep the policy that all
interface between core and FDW should be in FdwRoutine, instead of
using hook function. Now I'm writing document about it, and will post
it in a day.

2015-02-19 16:19 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> 2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> Let me put some comments in addition to where you're checking now.
>>
>> [design issues]
>> * Cost estimation
>> Estimation and evaluation of cost for remote join query is not an
>> obvious issue. In principle, local side cannot determine the cost
>> to run remote join without remote EXPLAIN, because local side has
>> no information about JOIN logic applied on the remote side.
>> Probably, we have to put an assumption for remote join algorithm,
>> because local planner has no idea about remote planner's choice
>> unless foreign-join don't take "use_remote_estimate".
>> I think, it is reasonable assumption (even if it is incorrect) to
>> calculate remote join cost based on local hash-join algorithm.
>> If user wants more correct estimation, remote EXPLAIN will make
>> more reliable cost estimation.
>
> Hm, I guess that you chose hash-join as "least-costed join". In the
> pgbench model, most combination between two tables generate hash join
> as cheapest path. Remote EXPLAIN is very expensive in the context of
> planning, so it would easily make the plan optimization meaningless.
> But giving an option to users is good, I agree.
>
>>
>> It also needs a consensus whether cost for remote CPU execution is
>> equivalent to local CPU. If we think local CPU is rare resource
>> than remote one, a discount rate will make planner more preferable
>> to choose remote join than local one
>
> Something like cpu_cost_ratio as a new server-level FDW option?
>
>>
>> Once we assume a join algorithm for remote join, unit cost for
>> remote CPU, we can calculate a cost for foreign join based on
>> the local join logic plus cost for network translation (maybe
>> fdw_tuple_cost?).
>
> Yes, sum of these costs is the total cost of a remote join.
> o fdw_startup_cost
> o hash-join cost, estimated as a local join
> o fdw_tuple_cost * rows * width
>
>> * FDW options
>> Unlike table scan, FDW options we should refer is unclear.
>> Table level FDW options are associated with a foreign table as
>> literal. I think we have two options here:
>> 1. Foreign-join refers FDW options for foreign-server, but ones
>> for foreign-tables are ignored.
>> 2. Foreign-join is prohibited when both of relations don't have
>> identical FDW options.
>> My preference is 2. Even though N-way foreign join, it ensures
>> all the tables involved with (N-1)-way foreign join has identical
>> FDW options, thus it leads we can make N-way foreign join with
>> all identical FDW options.
>> One exception is "updatable" flag of postgres_fdw. It does not
>> make sense on remote join, so I think mixture of updatable and
>> non-updatable foreign tables should be admitted, however, it is
>> a decision by FDW driver.
>>
>> Probably, above points need to take time for getting consensus.
>> I'd like to see your opinion prior to editing your patch.
>
> postgres_fdw can't push down a join which contains foreign tables on
> multiple servers, so use_remote_estimate and fdw_startup_cost are the
> only FDW options to consider. So we have options for each option.
>
> 1-a. If all foreign tables in the join has identical
> use_remote_estimate, allow pushing down.
> 1-b. If any of foreign table in the join has true as
> use_remote_estimate, use remote estimate.
>
> 2-a. If all foreign tables in the join has identical fdw_startup_cost,
> allow pushing down.
> 2-b. Always use max value in the join. (cost would be more expensive)
> 2-c. Always use min value in the join. (cost would be cheaper)
>
> I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have
> reasonable cost about startup.
>
> I agree about "updatable" option.
>
>>
>> [implementation issues]
>> The interface does not intend to add new Path/Plan type for each scan
>> that replaces foreign joins. What postgres_fdw should do is, adding
>> ForeignPath towards a particular joinrel, then it populates ForeignScan
>> with remote join query once it got chosen by the planner.
>
> That idea is interesting, and make many things simpler. Please let me consider.
>
>>
>> A few functions added in src/backend/foreign/foreign.c are not
>> called by anywhere, at this moment.
>>
>> create_plan_recurse() is reverted to static. It is needed for custom-
>> join enhancement, if no other infrastructure can support.
>
> I made it back to static because I thought that create_plan_recurse
> can be called by core before giving control to FDWs. But I'm not sure
> it can be applied to custom scans. I'll recheck that part.
>
>
> --
> Shigeru HANADA

--
Shigeru HANADA

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

05 марта 2015 г., 01:27:53

> Here is v4 patch of Join push-down support for foreign tables.  This
> patch requires Custom/Foreign join patch v7 posted by Kaigai-san.
> 
Thanks for your efforts,

> In this version I added check about query type which gives up pushing
> down joins when the join is a part of an underlying query of
> UPDATE/DELETE.
>
> As of now postgres_fdw builds a proper remote query but it can't bring
> ctid value up to postgresExecForeignUpdate()...
>
The "ctid" reference shall exist as an usual column reference in
the target-list of join-rel. It is not origin of the problem.

See my investigation below. I guess special treatment on whole-row-
reference is problematic, rather than ctid.

> How to reproduce the error, please execute query below after running
> attached init_fdw.sql for building test environment.  Note that the
> script drops "user1", and creates database "fdw" and "pgbench".
> 
> fdw=# explain (verbose) update pgbench_branches b set filler = 'foo'
> from pgbench_tellers t where t.bid = b.bid and t.tid < 10;
> 
>                                                                  QUERY
> PLAN
> 
> ----------------------------------------------------------------------------
> ----------------------------------------------------------------------------
> --------------------
> ----------------------------------------------------------------------------
> -----------------------------------
>  Update on public.pgbench_branches b  (cost=100.00..100.67 rows=67 width=390)
>    Remote SQL: UPDATE public.pgbench_branches SET filler = $2 WHERE ctid = $1
>    ->  Foreign Scan  (cost=100.00..100.67 rows=67 width=390)
>          Output: b.bid, b.bbalance, 'foo
> 
> '::character(88), b.ctid, *
>          Remote SQL: SELECT r.a_0, r.a_1, r.a_2, l FROM (SELECT tid,
> bid, tbalance, filler FROM public.pgbench_tellers WHERE ((tid < 10)))
> l (a_0, a_1) INNER JOIN (SELECT b
> id, bbalance, NULL, ctid FROM public.pgbench_branches FOR UPDATE) r
> (a_0, a_1, a_2, a_3)  ON ((r.a_0 = l.a_1))
> (5 rows)
> fdw=# explain (analyze, verbose) update pgbench_branches b set filler
> = 'foo' from pgbench_tellers t where t.bid = b.bid and t.tid < 10;
> ERROR:  ctid is NULL
> 
It seems to me the left relation has smaller number of alias definitions
than required. The left SELECT statement has 4 target-entries, however,
only a_0 and a_1 are defined.
The logic to assign aliases of relation/column might be problematic.
Because deparseColumnAliases() add aliases for each target-entries of
underlying SELECT statement, but skips whole-row0-reference.
On the other hands, postgres_fdw takes special treatment on whole-
row-reference. Once a whole-row-reference is used, postgres_fdw
put all the non-system columns on the target-list of remote SELECT
statement.
Thus, it makes mismatch between baserel->targetlist and generated
aliases.

I think we have two options:
(1) Stop special treatment for whole-row-reference, even if it is   simple foreign-scan (which does not involve join).
(2) Add a new routine to reconstruct whole-row-reference of   a particular relation from the target-entries of joined
relations.

My preference is (2). It can keep the code simple, and I doubt
whether network traffic optimization actually has advantage towards
disadvantage of local tuple reconstruction (which consumes additional
CPU cycles).

Does it make sense for you?
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

05 марта 2015 г., 13:00:12

Hi Ashutosh, thanks for the review.

2015-03-04 19:17 GMT+09:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
> In create_foreignscan_path() we have lines like -
> 1587     pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
> 1588
> required_outer);
> Now, that the same function is being used for creating foreign scan paths
> for joins, we should be calling get_joinrel_parampathinfo() on a join rel
> and get_baserel_parampathinfo() on base rel.

Got it.  Please let me check the difference.

>
> The patch seems to handle all the restriction clauses in the same way. There
> are two kinds of restriction clauses - a. join quals (specified using ON
> clause; optimizer might move them to the other class if that doesn't affect
> correctness) and b. quals on join relation (specified in the WHERE clause,
> optimizer might move them to the other class if that doesn't affect
> correctness). The quals in "a" are applied while the join is being computed
> whereas those in "b" are applied after the join is computed. For example,
> postgres=# select * from lt;
>  val | val2
> -----+------
>    1 |    2
>    1 |    3
> (2 rows)
>
> postgres=# select * from lt2;
>  val | val2
> -----+------
>    1 |    2
> (1 row)
>
> postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
>  val | val2 | val | val2
> -----+------+-----+------
>    1 |    2 |   1 |    2
>    1 |    3 |     |
> (2 rows)
>
> postgres=# select * from lt left join lt2 on (true) where (lt.val2 =
> lt2.val2);
>  val | val2 | val | val2
> -----+------+-----+------
>    1 |    2 |   1 |    2
> (1 row)
>
> The difference between these two kinds is evident in case of outer joins,
> for inner join optimizer puts all of them in class "b". The remote query
> sent to the foreign server has all those in ON clause. Consider foreign
> tables ft1 and ft2 pointing to local tables on the same server.
> postgres=# \d ft1
>          Foreign table "public.ft1"
>  Column |  Type   | Modifiers | FDW Options
> --------+---------+-----------+-------------
>  val    | integer |           |
>  val2   | integer |           |
> Server: loopback
> FDW Options: (table_name 'lt')
>
> postgres=# \d ft2
>          Foreign table "public.ft2"
>  Column |  Type   | Modifiers | FDW Options
> --------+---------+-----------+-------------
>  val    | integer |           |
>  val2   | integer |           |
> Server: loopback
> FDW Options: (table_name 'lt2')
>
> postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 =
> ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
>
> QUERY PLAN
>
>
-------------------------------------------------------------------------------------------------------------------------------------------------------
> ------------------------------------------------------------------------------------
>  Foreign Scan  (cost=100.00..125.60 rows=2560 width=16)
>    Output: val, val2, val, val2
>    Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM
> public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a
> _0, a_1)  ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 =
> l.a_1))
> (3 rows)
>
> The result is then wrong
> postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where
> ft1.val + ft2.val > ft1.val2 or ft2.val is null;
>  val | val2 | val | val2
> -----+------+-----+------
>    1 |    2 |     |
>    1 |    3 |     |
> (2 rows)
>
> which should match the result obtained by substituting local tables for
> foreign ones
> postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where
> lt.val + lt2.val > lt.val2 or lt2.val is null;
>  val | val2 | val | val2
> -----+------+-----+------
>    1 |    3 |     |
> (1 row)
>
> Once we start distinguishing the two kinds of quals, there is some
> optimization possible. For pushing down a join it's essential that all the
> quals in "a" are safe to be pushed down. But a join can be pushed down, even
> if quals in "a" are not safe to be pushed down. But more clauses one pushed
> down to foreign server, lesser are the rows fetched from the foreign server.
> In postgresGetForeignJoinPath, instead of checking all the restriction
> clauses to be safe to be pushed down, we need to check only those which are
> join quals (class "a").

The argument restrictlist of GetForeignJoinPaths contains both
conditions mixed, so I added extract_actual_join_clauses() to separate
it into two lists, join_quals and other clauses.  This is similar to
what create_nestloop_plan and siblings do.


>
> Following EXPLAIN output seems to be confusing
> ft1 and ft2 both are pointing to same lt on a foreign server.
> postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where
> ft1.val + ft1.val2 = ft2.val;
>
> QUERY PLAN
>
>
-------------------------------------------------------------------------------------------------------------------------------------------------------
> --------------------------
>  Foreign Scan  (cost=100.00..132.00 rows=2560 width=8)
>    Output: (val + val2)
>    Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l
> (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)  ON ((
> (r.a_0 + r.a_1) = l.a_0))
>
> Output just specified val + val2, it doesn't tell, where those val and val2
> come from, neither it's evident from the rest of the context.
>

Actually val and val2 come from public.lt in "r" side, but as you say
it's too difficult to know that from EXPLAIN output.  Do you have any
idea to make the "Output" item more readable?

-- 
Shigeru HANADA

Re: Join push-down support for foreign tables

От

Shigeru Hanada

Дата:

05 марта 2015 г., 15:30:26

Here is the v5 patch of Join push-down support for foreign tables.

Changes since v4:

- Separete remote conditions into ON and WHERE, per Ashutosh.
- Add regression test cases for foreign join.
- Don't skip reversed relation combination in OUTER join cases.

I'm now working on two issues from Kaigai-san and Ashutosu, whole-row
reference handling and use of get_joinrel_parampathinfo().

2015-03-05 22:00 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> Hi Ashutosh, thanks for the review.
>
> 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
>> In create_foreignscan_path() we have lines like -
>> 1587     pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
>> 1588
>> required_outer);
>> Now, that the same function is being used for creating foreign scan paths
>> for joins, we should be calling get_joinrel_parampathinfo() on a join rel
>> and get_baserel_parampathinfo() on base rel.
>
> Got it.  Please let me check the difference.
>
>>
>> The patch seems to handle all the restriction clauses in the same way. There
>> are two kinds of restriction clauses - a. join quals (specified using ON
>> clause; optimizer might move them to the other class if that doesn't affect
>> correctness) and b. quals on join relation (specified in the WHERE clause,
>> optimizer might move them to the other class if that doesn't affect
>> correctness). The quals in "a" are applied while the join is being computed
>> whereas those in "b" are applied after the join is computed. For example,
>> postgres=# select * from lt;
>>  val | val2
>> -----+------
>>    1 |    2
>>    1 |    3
>> (2 rows)
>>
>> postgres=# select * from lt2;
>>  val | val2
>> -----+------
>>    1 |    2
>> (1 row)
>>
>> postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
>>  val | val2 | val | val2
>> -----+------+-----+------
>>    1 |    2 |   1 |    2
>>    1 |    3 |     |
>> (2 rows)
>>
>> postgres=# select * from lt left join lt2 on (true) where (lt.val2 =
>> lt2.val2);
>>  val | val2 | val | val2
>> -----+------+-----+------
>>    1 |    2 |   1 |    2
>> (1 row)
>>
>> The difference between these two kinds is evident in case of outer joins,
>> for inner join optimizer puts all of them in class "b". The remote query
>> sent to the foreign server has all those in ON clause. Consider foreign
>> tables ft1 and ft2 pointing to local tables on the same server.
>> postgres=# \d ft1
>>          Foreign table "public.ft1"
>>  Column |  Type   | Modifiers | FDW Options
>> --------+---------+-----------+-------------
>>  val    | integer |           |
>>  val2   | integer |           |
>> Server: loopback
>> FDW Options: (table_name 'lt')
>>
>> postgres=# \d ft2
>>          Foreign table "public.ft2"
>>  Column |  Type   | Modifiers | FDW Options
>> --------+---------+-----------+-------------
>>  val    | integer |           |
>>  val2   | integer |           |
>> Server: loopback
>> FDW Options: (table_name 'lt2')
>>
>> postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 =
>> ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
>>
>> QUERY PLAN
>>
>>
-------------------------------------------------------------------------------------------------------------------------------------------------------
>> ------------------------------------------------------------------------------------
>>  Foreign Scan  (cost=100.00..125.60 rows=2560 width=16)
>>    Output: val, val2, val, val2
>>    Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM
>> public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a
>> _0, a_1)  ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 =
>> l.a_1))
>> (3 rows)
>>
>> The result is then wrong
>> postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where
>> ft1.val + ft2.val > ft1.val2 or ft2.val is null;
>>  val | val2 | val | val2
>> -----+------+-----+------
>>    1 |    2 |     |
>>    1 |    3 |     |
>> (2 rows)
>>
>> which should match the result obtained by substituting local tables for
>> foreign ones
>> postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where
>> lt.val + lt2.val > lt.val2 or lt2.val is null;
>>  val | val2 | val | val2
>> -----+------+-----+------
>>    1 |    3 |     |
>> (1 row)
>>
>> Once we start distinguishing the two kinds of quals, there is some
>> optimization possible. For pushing down a join it's essential that all the
>> quals in "a" are safe to be pushed down. But a join can be pushed down, even
>> if quals in "a" are not safe to be pushed down. But more clauses one pushed
>> down to foreign server, lesser are the rows fetched from the foreign server.
>> In postgresGetForeignJoinPath, instead of checking all the restriction
>> clauses to be safe to be pushed down, we need to check only those which are
>> join quals (class "a").
>
> The argument restrictlist of GetForeignJoinPaths contains both
> conditions mixed, so I added extract_actual_join_clauses() to separate
> it into two lists, join_quals and other clauses.  This is similar to
> what create_nestloop_plan and siblings do.
>
>
>>
>> Following EXPLAIN output seems to be confusing
>> ft1 and ft2 both are pointing to same lt on a foreign server.
>> postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where
>> ft1.val + ft1.val2 = ft2.val;
>>
>> QUERY PLAN
>>
>>
-------------------------------------------------------------------------------------------------------------------------------------------------------
>> --------------------------
>>  Foreign Scan  (cost=100.00..132.00 rows=2560 width=8)
>>    Output: (val + val2)
>>    Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l
>> (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)  ON ((
>> (r.a_0 + r.a_1) = l.a_0))
>>
>> Output just specified val + val2, it doesn't tell, where those val and val2
>> come from, neither it's evident from the rest of the context.
>>
>
> Actually val and val2 come from public.lt in "r" side, but as you say
> it's too difficult to know that from EXPLAIN output.  Do you have any
> idea to make the "Output" item more readable?
>
> --
> Shigeru HANADA



--
Shigeru HANADA

Вложения

foreign_join_v5.patch

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

06 марта 2015 г., 01:11:39

> Actually val and val2 come from public.lt in "r" side, but as you say
> it's too difficult to know that from EXPLAIN output.  Do you have any
> idea to make the "Output" item more readable?
>
A fundamental reason why we need to have symbolic aliases here is that
postgres_fdw has remote query in cstring form. It makes implementation
complicated to deconstruct/construct a query that is once constructed
on the underlying foreign-path level.
If ForeignScan keeps items to construct remote query in expression node
form (and construction of remote query is delayed to beginning of the
executor, probably), we will be able to construct more human readable
remote query.

However, I don't recommend to work on this great refactoring stuff
within the scope of join push-down support project.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru Hanada
> Sent: Thursday, March 05, 2015 10:00 PM
> To: Ashutosh Bapat
> Cc: Kaigai Kouhei(海外 浩平); Robert Haas; PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> Hi Ashutosh, thanks for the review.
> 
> 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
> > In create_foreignscan_path() we have lines like -
> > 1587     pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
> > 1588
> > required_outer);
> > Now, that the same function is being used for creating foreign scan paths
> > for joins, we should be calling get_joinrel_parampathinfo() on a join rel
> > and get_baserel_parampathinfo() on base rel.
> 
> Got it.  Please let me check the difference.
> 
> >
> > The patch seems to handle all the restriction clauses in the same way. There
> > are two kinds of restriction clauses - a. join quals (specified using ON
> > clause; optimizer might move them to the other class if that doesn't affect
> > correctness) and b. quals on join relation (specified in the WHERE clause,
> > optimizer might move them to the other class if that doesn't affect
> > correctness). The quals in "a" are applied while the join is being computed
> > whereas those in "b" are applied after the join is computed. For example,
> > postgres=# select * from lt;
> >  val | val2
> > -----+------
> >    1 |    2
> >    1 |    3
> > (2 rows)
> >
> > postgres=# select * from lt2;
> >  val | val2
> > -----+------
> >    1 |    2
> > (1 row)
> >
> > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    2 |   1 |    2
> >    1 |    3 |     |
> > (2 rows)
> >
> > postgres=# select * from lt left join lt2 on (true) where (lt.val2 =
> > lt2.val2);
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    2 |   1 |    2
> > (1 row)
> >
> > The difference between these two kinds is evident in case of outer joins,
> > for inner join optimizer puts all of them in class "b". The remote query
> > sent to the foreign server has all those in ON clause. Consider foreign
> > tables ft1 and ft2 pointing to local tables on the same server.
> > postgres=# \d ft1
> >          Foreign table "public.ft1"
> >  Column |  Type   | Modifiers | FDW Options
> > --------+---------+-----------+-------------
> >  val    | integer |           |
> >  val2   | integer |           |
> > Server: loopback
> > FDW Options: (table_name 'lt')
> >
> > postgres=# \d ft2
> >          Foreign table "public.ft2"
> >  Column |  Type   | Modifiers | FDW Options
> > --------+---------+-----------+-------------
> >  val    | integer |           |
> >  val2   | integer |           |
> > Server: loopback
> > FDW Options: (table_name 'lt2')
> >
> > postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 =
> > ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> >
> ----------------------------------------------------------------------------
> --------
> >  Foreign Scan  (cost=100.00..125.60 rows=2560 width=16)
> >    Output: val, val2, val, val2
> >    Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM
> > public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a
> > _0, a_1)  ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 =
> > l.a_1))
> > (3 rows)
> >
> > The result is then wrong
> > postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where
> > ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    2 |     |
> >    1 |    3 |     |
> > (2 rows)
> >
> > which should match the result obtained by substituting local tables for
> > foreign ones
> > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where
> > lt.val + lt2.val > lt.val2 or lt2.val is null;
> >  val | val2 | val | val2
> > -----+------+-----+------
> >    1 |    3 |     |
> > (1 row)
> >
> > Once we start distinguishing the two kinds of quals, there is some
> > optimization possible. For pushing down a join it's essential that all the
> > quals in "a" are safe to be pushed down. But a join can be pushed down, even
> > if quals in "a" are not safe to be pushed down. But more clauses one pushed
> > down to foreign server, lesser are the rows fetched from the foreign server.
> > In postgresGetForeignJoinPath, instead of checking all the restriction
> > clauses to be safe to be pushed down, we need to check only those which are
> > join quals (class "a").
> 
> The argument restrictlist of GetForeignJoinPaths contains both
> conditions mixed, so I added extract_actual_join_clauses() to separate
> it into two lists, join_quals and other clauses.  This is similar to
> what create_nestloop_plan and siblings do.
> 
> 
> >
> > Following EXPLAIN output seems to be confusing
> > ft1 and ft2 both are pointing to same lt on a foreign server.
> > postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where
> > ft1.val + ft1.val2 = ft2.val;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> > --------------------------
> >  Foreign Scan  (cost=100.00..132.00 rows=2560 width=8)
> >    Output: (val + val2)
> >    Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l
> > (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)  ON ((
> > (r.a_0 + r.a_1) = l.a_0))
> >
> > Output just specified val + val2, it doesn't tell, where those val and val2
> > come from, neither it's evident from the rest of the context.
> >
> 
> Actually val and val2 come from public.lt in "r" side, but as you say
> it's too difficult to know that from EXPLAIN output.  Do you have any
> idea to make the "Output" item more readable?
> 
> --
> Shigeru HANADA
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Re: Join push-down support for foreign tables

От

Ashutosh Bapat

Дата:

06 марта 2015 г., 10:25:54

Hi Kaigai-san, Hanada-san,

Attached please find a patch to print the column names prefixed by the relation names. I haven't tested the patch fully. The same changes will be needed for CustomPlan node specific code.

Now I am able to make sense out of the Output information

postgres=# explain verbose select * from ft1 join ft2 using (val);
                                                                                  QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------
Foreign Scan (cost=100.00..125.60 rows=2560 width=12)
   Output: ft1.val, ft1.val2, ft2.val2
   Remote SQL: SELECT r.a_0, r.a_1, l.a_1 FROM (SELECT val, val2 FROM public.lt) l (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)
ON ((r.a_0 = l.a_0))
(3 rows)

On Fri, Mar 6, 2015 at 6:41 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

> Actually val and val2 come from public.lt in "r" side, but as you say
> it's too difficult to know that from EXPLAIN output. Do you have any
> idea to make the "Output" item more readable?
>
A fundamental reason why we need to have symbolic aliases here is that
postgres_fdw has remote query in cstring form. It makes implementation
complicated to deconstruct/construct a query that is once constructed
on the underlying foreign-path level.
If ForeignScan keeps items to construct remote query in expression node
form (and construction of remote query is delayed to beginning of the
executor, probably), we will be able to construct more human readable
remote query.

However, I don't recommend to work on this great refactoring stuff
within the scope of join push-down support project.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru Hanada
> Sent: Thursday, March 05, 2015 10:00 PM
> To: Ashutosh Bapat
> Cc: Kaigai Kouhei(海外浩平); Robert Haas; PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
>
> Hi Ashutosh, thanks for the review.
>
> 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
> > In create_foreignscan_path() we have lines like -
> > 1587 pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
> > 1588
> > required_outer);
> > Now, that the same function is being used for creating foreign scan paths
> > for joins, we should be calling get_joinrel_parampathinfo() on a join rel
> > and get_baserel_parampathinfo() on base rel.
>
> Got it. Please let me check the difference.
>
> >
> > The patch seems to handle all the restriction clauses in the same way. There
> > are two kinds of restriction clauses - a. join quals (specified using ON
> > clause; optimizer might move them to the other class if that doesn't affect
> > correctness) and b. quals on join relation (specified in the WHERE clause,
> > optimizer might move them to the other class if that doesn't affect
> > correctness). The quals in "a" are applied while the join is being computed
> > whereas those in "b" are applied after the join is computed. For example,
> > postgres=# select * from lt;
> > val | val2
> > -----+------
> > 1 | 2
> > 1 | 3
> > (2 rows)
> >
> > postgres=# select * from lt2;
> > val | val2
> > -----+------
> > 1 | 2
> > (1 row)
> >
> > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
> > val | val2 | val | val2
> > -----+------+-----+------
> > 1 | 2 | 1 | 2
> > 1 | 3 | |
> > (2 rows)
> >
> > postgres=# select * from lt left join lt2 on (true) where (lt.val2 =
> > lt2.val2);
> > val | val2 | val | val2
> > -----+------+-----+------
> > 1 | 2 | 1 | 2
> > (1 row)
> >
> > The difference between these two kinds is evident in case of outer joins,
> > for inner join optimizer puts all of them in class "b". The remote query
> > sent to the foreign server has all those in ON clause. Consider foreign
> > tables ft1 and ft2 pointing to local tables on the same server.
> > postgres=# \d ft1
> > Foreign table "public.ft1"
> > Column | Type | Modifiers | FDW Options
> > --------+---------+-----------+-------------
> > val | integer | |
> > val2 | integer | |
> > Server: loopback
> > FDW Options: (table_name 'lt')
> >
> > postgres=# \d ft2
> > Foreign table "public.ft2"
> > Column | Type | Modifiers | FDW Options
> > --------+---------+-----------+-------------
> > val | integer | |
> > val2 | integer | |
> > Server: loopback
> > FDW Options: (table_name 'lt2')
> >
> > postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 =
> > ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> >
> ----------------------------------------------------------------------------
> --------
> > Foreign Scan (cost=100.00..125.60 rows=2560 width=16)
> > Output: val, val2, val, val2
> > Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2 FROM
> > public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt) r (a
> > _0, a_1) ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 =
> > l.a_1))
> > (3 rows)
> >
> > The result is then wrong
> > postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where
> > ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> > val | val2 | val | val2
> > -----+------+-----+------
> > 1 | 2 | |
> > 1 | 3 | |
> > (2 rows)
> >
> > which should match the result obtained by substituting local tables for
> > foreign ones
> > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where
> > lt.val + lt2.val > lt.val2 or lt2.val is null;
> > val | val2 | val | val2
> > -----+------+-----+------
> > 1 | 3 | |
> > (1 row)
> >
> > Once we start distinguishing the two kinds of quals, there is some
> > optimization possible. For pushing down a join it's essential that all the
> > quals in "a" are safe to be pushed down. But a join can be pushed down, even
> > if quals in "a" are not safe to be pushed down. But more clauses one pushed
> > down to foreign server, lesser are the rows fetched from the foreign server.
> > In postgresGetForeignJoinPath, instead of checking all the restriction
> > clauses to be safe to be pushed down, we need to check only those which are
> > join quals (class "a").
>
> The argument restrictlist of GetForeignJoinPaths contains both
> conditions mixed, so I added extract_actual_join_clauses() to separate
> it into two lists, join_quals and other clauses. This is similar to
> what create_nestloop_plan and siblings do.
>
>
> >
> > Following EXPLAIN output seems to be confusing
> > ft1 and ft2 both are pointing to same lt on a foreign server.
> > postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where
> > ft1.val + ft1.val2 = ft2.val;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> > --------------------------
> > Foreign Scan (cost=100.00..132.00 rows=2560 width=8)
> > Output: (val + val2)
> > Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l
> > (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1) ON ((
> > (r.a_0 + r.a_1) = l.a_0))
> >
> > Output just specified val + val2, it doesn't tell, where those val and val2
> > come from, neither it's evident from the rest of the context.
> >
>
> Actually val and val2 come from public.lt in "r" side, but as you say
> it's too difficult to know that from EXPLAIN output. Do you have any
> idea to make the "Output" item more readable?
>
> --
> Shigeru HANADA
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

explain_relnames.patch

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

09 марта 2015 г., 00:17:40

Hi Ashutosh,

Thanks for finding out what we oversight.
Here is still a problem because the new 'relids' field is not updated
on setrefs.c (scanrelid is incremented by rtoffset here).
It is easy to shift the bitmapset by rtoffset, however, I also would
like to see another approach.

My idea adds 'List *fdw_sub_paths' field in ForeignPath to inform
planner underlying foreign-scan paths (with scanrelid > 0).
The create_foreignscan_plan() will call create_plan_recurse() to
construct plan nodes based on the path nodes being attached.
Even though these foreign-scan nodes are not actually executed,
setrefs.c can update scanrelid in usual way and ExplainPreScanNode
does not need to take exceptional handling on Foreign/CustomScan
nodes.
In addition, it allows to keep information about underlying foreign
table scan, even if planner will need some other information in the
future version (not only relids).

How about your thought?
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Ashutosh Bapat
> Sent: Friday, March 06, 2015 7:26 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Shigeru Hanada; Robert Haas; PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> Hi Kaigai-san, Hanada-san,
> 
> Attached please find a patch to print the column names prefixed by the relation
> names. I haven't tested the patch fully. The same changes will be needed for
> CustomPlan node specific code.
> 
> Now I am able to make sense out of the Output information
> 
> postgres=# explain verbose select * from ft1 join ft2 using (val);
> 
> QUERY PLAN
> 
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> -----------------------
>  Foreign Scan  (cost=100.00..125.60 rows=2560 width=12)
>    Output: ft1.val, ft1.val2, ft2.val2
>    Remote SQL: SELECT r.a_0, r.a_1, l.a_1 FROM (SELECT val, val2 FROM public.lt)
> l (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)
>   ON ((r.a_0 = l.a_0))
> (3 rows)
> 
> 
> 
> On Fri, Mar 6, 2015 at 6:41 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> 
> 
>     > Actually val and val2 come from public.lt in "r" side, but as you say
>     > it's too difficult to know that from EXPLAIN output.  Do you have any
>     > idea to make the "Output" item more readable?
>     >
>     A fundamental reason why we need to have symbolic aliases here is that
>     postgres_fdw has remote query in cstring form. It makes implementation
>     complicated to deconstruct/construct a query that is once constructed
>     on the underlying foreign-path level.
>     If ForeignScan keeps items to construct remote query in expression node
>     form (and construction of remote query is delayed to beginning of the
>     executor, probably), we will be able to construct more human readable
>     remote query.
> 
>     However, I don't recommend to work on this great refactoring stuff
>     within the scope of join push-down support project.
> 
>     Thanks,
>     --
>     NEC OSS Promotion Center / PG-Strom Project
>     KaiGai Kohei <kaigai@ak.jp.nec.com>
> 
> 
>     > -----Original Message-----
>     > From: pgsql-hackers-owner@postgresql.org
>     > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru
> Hanada
>     > Sent: Thursday, March 05, 2015 10:00 PM
>     > To: Ashutosh Bapat
>     > Cc: Kaigai Kouhei(海外 浩平); Robert Haas; PostgreSQL-development
>     > Subject: Re: [HACKERS] Join push-down support for foreign tables
>     >
> 
>     > Hi Ashutosh, thanks for the review.
>     >
>     > 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com>:
>     > > In create_foreignscan_path() we have lines like -
>     > > 1587     pathnode->path.param_info =
> get_baserel_parampathinfo(root, rel,
>     > > 1588
>     > > required_outer);
>     > > Now, that the same function is being used for creating foreign scan
> paths
>     > > for joins, we should be calling get_joinrel_parampathinfo() on a join
> rel
>     > > and get_baserel_parampathinfo() on base rel.
>     >
>     > Got it.  Please let me check the difference.
>     >
>     > >
>     > > The patch seems to handle all the restriction clauses in the same
> way. There
>     > > are two kinds of restriction clauses - a. join quals (specified using
> ON
>     > > clause; optimizer might move them to the other class if that doesn't
> affect
>     > > correctness) and b. quals on join relation (specified in the WHERE
> clause,
>     > > optimizer might move them to the other class if that doesn't affect
>     > > correctness). The quals in "a" are applied while the join is being
> computed
>     > > whereas those in "b" are applied after the join is computed. For
> example,
>     > > postgres=# select * from lt;
>     > >  val | val2
>     > > -----+------
>     > >    1 |    2
>     > >    1 |    3
>     > > (2 rows)
>     > >
>     > > postgres=# select * from lt2;
>     > >  val | val2
>     > > -----+------
>     > >    1 |    2
>     > > (1 row)
>     > >
>     > > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
>     > >  val | val2 | val | val2
>     > > -----+------+-----+------
>     > >    1 |    2 |   1 |    2
>     > >    1 |    3 |     |
>     > > (2 rows)
>     > >
>     > > postgres=# select * from lt left join lt2 on (true) where (lt.val2
> =
>     > > lt2.val2);
>     > >  val | val2 | val | val2
>     > > -----+------+-----+------
>     > >    1 |    2 |   1 |    2
>     > > (1 row)
>     > >
>     > > The difference between these two kinds is evident in case of outer
> joins,
>     > > for inner join optimizer puts all of them in class "b". The remote
> query
>     > > sent to the foreign server has all those in ON clause. Consider foreign
>     > > tables ft1 and ft2 pointing to local tables on the same server.
>     > > postgres=# \d ft1
>     > >          Foreign table "public.ft1"
>     > >  Column |  Type   | Modifiers | FDW Options
>     > > --------+---------+-----------+-------------
>     > >  val    | integer |           |
>     > >  val2   | integer |           |
>     > > Server: loopback
>     > > FDW Options: (table_name 'lt')
>     > >
>     > > postgres=# \d ft2
>     > >          Foreign table "public.ft2"
>     > >  Column |  Type   | Modifiers | FDW Options
>     > > --------+---------+-----------+-------------
>     > >  val    | integer |           |
>     > >  val2   | integer |           |
>     > > Server: loopback
>     > > FDW Options: (table_name 'lt2')
>     > >
>     > > postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2
> =
>     > > ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
>     > >
>     > > QUERY PLAN
>     > >
>     > >
>     >
> ----------------------------------------------------------------------------
>     >
> ---------------------------------------------------------------------------
>     > >
>     >
> ----------------------------------------------------------------------------
>     > --------
>     > >  Foreign Scan  (cost=100.00..125.60 rows=2560 width=16)
>     > >    Output: val, val2, val, val2
>     > >    Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val,
> val2 FROM
>     > > public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt)
> r (a
>     > > _0, a_1)  ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND
> ((r.a_1 =
>     > > l.a_1))
>     > > (3 rows)
>     > >
>     > > The result is then wrong
>     > > postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2)
> where
>     > > ft1.val + ft2.val > ft1.val2 or ft2.val is null;
>     > >  val | val2 | val | val2
>     > > -----+------+-----+------
>     > >    1 |    2 |     |
>     > >    1 |    3 |     |
>     > > (2 rows)
>     > >
>     > > which should match the result obtained by substituting local tables
> for
>     > > foreign ones
>     > > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2)
> where
>     > > lt.val + lt2.val > lt.val2 or lt2.val is null;
>     > >  val | val2 | val | val2
>     > > -----+------+-----+------
>     > >    1 |    3 |     |
>     > > (1 row)
>     > >
>     > > Once we start distinguishing the two kinds of quals, there is some
>     > > optimization possible. For pushing down a join it's essential that
> all the
>     > > quals in "a" are safe to be pushed down. But a join can be pushed
> down, even
>     > > if quals in "a" are not safe to be pushed down. But more clauses one
> pushed
>     > > down to foreign server, lesser are the rows fetched from the foreign
> server.
>     > > In postgresGetForeignJoinPath, instead of checking all the
> restriction
>     > > clauses to be safe to be pushed down, we need to check only those
> which are
>     > > join quals (class "a").
>     >
>     > The argument restrictlist of GetForeignJoinPaths contains both
>     > conditions mixed, so I added extract_actual_join_clauses() to separate
>     > it into two lists, join_quals and other clauses.  This is similar to
>     > what create_nestloop_plan and siblings do.
>     >
>     >
>     > >
>     > > Following EXPLAIN output seems to be confusing
>     > > ft1 and ft2 both are pointing to same lt on a foreign server.
>     > > postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2
> where
>     > > ft1.val + ft1.val2 = ft2.val;
>     > >
>     > > QUERY PLAN
>     > >
>     > >
>     >
> ----------------------------------------------------------------------------
>     >
> ---------------------------------------------------------------------------
>     > > --------------------------
>     > >  Foreign Scan  (cost=100.00..132.00 rows=2560 width=8)
>     > >    Output: (val + val2)
>     > >    Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM
> public.lt) l
>     > > (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)
> ON ((
>     > > (r.a_0 + r.a_1) = l.a_0))
>     > >
>     > > Output just specified val + val2, it doesn't tell, where those val
> and val2
>     > > come from, neither it's evident from the rest of the context.
>     > >
>     >
>     > Actually val and val2 come from public.lt in "r" side, but as you say
>     > it's too difficult to know that from EXPLAIN output.  Do you have any
>     > idea to make the "Output" item more readable?
>     >
>     > --
>     > Shigeru HANADA
>     >
>     >
> 
>     > --
>     > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>     > To make changes to your subscription:
>     > http://www.postgresql.org/mailpref/pgsql-hackers
> 
> 
> 
> 
> 
> --
> 
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company

Re: Join push-down support for foreign tables

От

Ashutosh Bapat

Дата:

09 марта 2015 г., 03:49:16

On Mon, Mar 9, 2015 at 5:46 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Hi Ashutosh,

Thanks for finding out what we oversight.
Here is still a problem because the new 'relids' field is not updated
on setrefs.c (scanrelid is incremented by rtoffset here).
It is easy to shift the bitmapset by rtoffset, however, I also would
like to see another approach.

I just made it work for explain, but other parts still need work. Sorry about that. If we follow INDEX_VAR, we should be able to get there.

My idea adds 'List *fdw_sub_paths' field in ForeignPath to inform
planner underlying foreign-scan paths (with scanrelid > 0).
The create_foreignscan_plan() will call create_plan_recurse() to
construct plan nodes based on the path nodes being attached.
Even though these foreign-scan nodes are not actually executed,
setrefs.c can update scanrelid in usual way and ExplainPreScanNode
does not need to take exceptional handling on Foreign/CustomScan
nodes.
In addition, it allows to keep information about underlying foreign
table scan, even if planner will need some other information in the
future version (not only relids).

How about your thought?

I am not sure about keeping planner nodes, which are not turned into execution nodes. There's no precedence for that in current code. It could be risky.

--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Ashutosh Bapat
> Sent: Friday, March 06, 2015 7:26 PM
> To: Kaigai Kouhei(海外浩平)
> Cc: Shigeru Hanada; Robert Haas; PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
>
> Hi Kaigai-san, Hanada-san,
>
> Attached please find a patch to print the column names prefixed by the relation
> names. I haven't tested the patch fully. The same changes will be needed for
> CustomPlan node specific code.
>
> Now I am able to make sense out of the Output information
>
> postgres=# explain verbose select * from ft1 join ft2 using (val);
>
> QUERY PLAN
>
> ----------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> -----------------------
> Foreign Scan (cost=100.00..125.60 rows=2560 width=12)
> Output: ft1.val, ft1.val2, ft2.val2
> Remote SQL: SELECT r.a_0, r.a_1, l.a_1 FROM (SELECT val, val2 FROM public.lt)
> l (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)
> ON ((r.a_0 = l.a_0))
> (3 rows)
>
>
>
> On Fri, Mar 6, 2015 at 6:41 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>
>
> > Actually val and val2 come from public.lt in "r" side, but as you say
> > it's too difficult to know that from EXPLAIN output. Do you have any
> > idea to make the "Output" item more readable?
> >
> A fundamental reason why we need to have symbolic aliases here is that
> postgres_fdw has remote query in cstring form. It makes implementation
> complicated to deconstruct/construct a query that is once constructed
> on the underlying foreign-path level.
> If ForeignScan keeps items to construct remote query in expression node
> form (and construction of remote query is delayed to beginning of the
> executor, probably), we will be able to construct more human readable
> remote query.
>
> However, I don't recommend to work on this great refactoring stuff
> within the scope of join push-down support project.
>
> Thanks,
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai@ak.jp.nec.com>
>
>
> > -----Original Message-----
> > From: pgsql-hackers-owner@postgresql.org
> > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Shigeru
> Hanada
> > Sent: Thursday, March 05, 2015 10:00 PM
> > To: Ashutosh Bapat
> > Cc: Kaigai Kouhei(海外浩平); Robert Haas; PostgreSQL-development
> > Subject: Re: [HACKERS] Join push-down support for foreign tables
> >
>
> > Hi Ashutosh, thanks for the review.
> >
> > 2015-03-04 19:17 GMT+09:00 Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com>:
> > > In create_foreignscan_path() we have lines like -
> > > 1587 pathnode->path.param_info =
> get_baserel_parampathinfo(root, rel,
> > > 1588
> > > required_outer);
> > > Now, that the same function is being used for creating foreign scan
> paths
> > > for joins, we should be calling get_joinrel_parampathinfo() on a join
> rel
> > > and get_baserel_parampathinfo() on base rel.
> >
> > Got it. Please let me check the difference.
> >
> > >
> > > The patch seems to handle all the restriction clauses in the same
> way. There
> > > are two kinds of restriction clauses - a. join quals (specified using
> ON
> > > clause; optimizer might move them to the other class if that doesn't
> affect
> > > correctness) and b. quals on join relation (specified in the WHERE
> clause,
> > > optimizer might move them to the other class if that doesn't affect
> > > correctness). The quals in "a" are applied while the join is being
> computed
> > > whereas those in "b" are applied after the join is computed. For
> example,
> > > postgres=# select * from lt;
> > > val | val2
> > > -----+------
> > > 1 | 2
> > > 1 | 3
> > > (2 rows)
> > >
> > > postgres=# select * from lt2;
> > > val | val2
> > > -----+------
> > > 1 | 2
> > > (1 row)
> > >
> > > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
> > > val | val2 | val | val2
> > > -----+------+-----+------
> > > 1 | 2 | 1 | 2
> > > 1 | 3 | |
> > > (2 rows)
> > >
> > > postgres=# select * from lt left join lt2 on (true) where (lt.val2
> =
> > > lt2.val2);
> > > val | val2 | val | val2
> > > -----+------+-----+------
> > > 1 | 2 | 1 | 2
> > > (1 row)
> > >
> > > The difference between these two kinds is evident in case of outer
> joins,
> > > for inner join optimizer puts all of them in class "b". The remote
> query
> > > sent to the foreign server has all those in ON clause. Consider foreign
> > > tables ft1 and ft2 pointing to local tables on the same server.
> > > postgres=# \d ft1
> > > Foreign table "public.ft1"
> > > Column | Type | Modifiers | FDW Options
> > > --------+---------+-----------+-------------
> > > val | integer | |
> > > val2 | integer | |
> > > Server: loopback
> > > FDW Options: (table_name 'lt')
> > >
> > > postgres=# \d ft2
> > > Foreign table "public.ft2"
> > > Column | Type | Modifiers | FDW Options
> > > --------+---------+-----------+-------------
> > > val | integer | |
> > > val2 | integer | |
> > > Server: loopback
> > > FDW Options: (table_name 'lt2')
> > >
> > > postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2
> =
> > > ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> > >
> > > QUERY PLAN
> > >
> > >
> >
> ----------------------------------------------------------------------------
> >
> ---------------------------------------------------------------------------
> > >
> >
> ----------------------------------------------------------------------------
> > --------
> > > Foreign Scan (cost=100.00..125.60 rows=2560 width=16)
> > > Output: val, val2, val, val2
> > > Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val,
> val2 FROM
> > > public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt)
> r (a
> > > _0, a_1) ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND
> ((r.a_1 =
> > > l.a_1))
> > > (3 rows)
> > >
> > > The result is then wrong
> > > postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2)
> where
> > > ft1.val + ft2.val > ft1.val2 or ft2.val is null;
> > > val | val2 | val | val2
> > > -----+------+-----+------
> > > 1 | 2 | |
> > > 1 | 3 | |
> > > (2 rows)
> > >
> > > which should match the result obtained by substituting local tables
> for
> > > foreign ones
> > > postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2)
> where
> > > lt.val + lt2.val > lt.val2 or lt2.val is null;
> > > val | val2 | val | val2
> > > -----+------+-----+------
> > > 1 | 3 | |
> > > (1 row)
> > >
> > > Once we start distinguishing the two kinds of quals, there is some
> > > optimization possible. For pushing down a join it's essential that
> all the
> > > quals in "a" are safe to be pushed down. But a join can be pushed
> down, even
> > > if quals in "a" are not safe to be pushed down. But more clauses one
> pushed
> > > down to foreign server, lesser are the rows fetched from the foreign
> server.
> > > In postgresGetForeignJoinPath, instead of checking all the
> restriction
> > > clauses to be safe to be pushed down, we need to check only those
> which are
> > > join quals (class "a").
> >
> > The argument restrictlist of GetForeignJoinPaths contains both
> > conditions mixed, so I added extract_actual_join_clauses() to separate
> > it into two lists, join_quals and other clauses. This is similar to
> > what create_nestloop_plan and siblings do.
> >
> >
> > >
> > > Following EXPLAIN output seems to be confusing
> > > ft1 and ft2 both are pointing to same lt on a foreign server.
> > > postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2
> where
> > > ft1.val + ft1.val2 = ft2.val;
> > >
> > > QUERY PLAN
> > >
> > >
> >
> ----------------------------------------------------------------------------
> >
> ---------------------------------------------------------------------------
> > > --------------------------
> > > Foreign Scan (cost=100.00..132.00 rows=2560 width=8)
> > > Output: (val + val2)
> > > Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM
> public.lt) l
> > > (a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1)
> ON ((
> > > (r.a_0 + r.a_1) = l.a_0))
> > >
> > > Output just specified val + val2, it doesn't tell, where those val
> and val2
> > > come from, neither it's evident from the rest of the context.
> > >
> >
> > Actually val and val2 come from public.lt in "r" side, but as you say
> > it's too difficult to know that from EXPLAIN output. Do you have any
> > idea to make the "Output" item more readable?
> >
> > --
> > Shigeru HANADA
> >
> >
>
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers
>
>
>
>
>
> --
>
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Join push-down support for foreign tables

От

Kouhei Kaigai

Дата:

10 марта 2015 г., 03:19:39

>     Thanks for finding out what we oversight.
>     Here is still a problem because the new 'relids' field is not updated
>     on setrefs.c (scanrelid is incremented by rtoffset here).
>     It is easy to shift the bitmapset by rtoffset, however, I also would
>     like to see another approach.
> 
> 
> 
> I just made it work for explain, but other parts still need work. Sorry about
> that. If we follow INDEX_VAR, we should be able to get there.
> 
I tried to modify your patch a bit as below:
* add adjustment of bitmap fields on setrefs.c
* add support on outfuncs.c and copyfuncs.c.
* add bms_shift_members() in bitmapset.c

I think it is a reasonable enhancement, however, it is not tested with
real-life code, like postgres_fdw.

Hanada-san, could you add a feature to print name of foreign-tables
which are involved in remote queries, on postgresExplainForeignScan()?
ForeignScan->fdw_relids bitmap and ExplainState->rtable_names will
tell you the joined foreign tables replaced by the (pseudo) foreign-scan.

Soon, I'll update the interface patch also.

>     My idea adds 'List *fdw_sub_paths' field in ForeignPath to inform
>     planner underlying foreign-scan paths (with scanrelid > 0).
>     The create_foreignscan_plan() will call create_plan_recurse() to
>     construct plan nodes based on the path nodes being attached.
>     Even though these foreign-scan nodes are not actually executed,
>     setrefs.c can update scanrelid in usual way and ExplainPreScanNode
>     does not need to take exceptional handling on Foreign/CustomScan
>     nodes.
>     In addition, it allows to keep information about underlying foreign
>     table scan, even if planner will need some other information in the
>     future version (not only relids).
> 
>     How about your thought?
> 
> 
> 
> I am not sure about keeping planner nodes, which are not turned into execution
> nodes. There's no precedence for that in current code. It could be risky.
>
Indeed, it is a fair enough opinion. At this moment, no other code makes plan
node but shall not be executed actually.
Please forget above idea.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Вложения

add_fdw_custom_relids.patch

Re: Join push-down support for foreign tables

От

Ashutosh Bapat

Дата:

13 марта 2015 г., 09:32:25

Hi Hanada-san,

I noticed that the patch doesn't have any tests for testing FDW join in postgres_fdw. While you are updating the patch, can you please add few tests for the same. I will suggest adding tests for a combination of these dimensions

1. Types of joins

2. Joins between multiple foreign and local tables together, to test whether we are pushing maximum of join tree with mixed tables.

3. Join/Where conditions with un/safe-to-push expressions

4. Queries with sorting/aggregation on top of join to test working of setref.

5. Joins between foreign tables on different foreign servers (to check that those do not get accidently pushed down).

I have attached a file with some example queries on those lines.

On Tue, Mar 10, 2015 at 8:37 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

> Thanks for finding out what we oversight.
> Here is still a problem because the new 'relids' field is not updated
> on setrefs.c (scanrelid is incremented by rtoffset here).
> It is easy to shift the bitmapset by rtoffset, however, I also would
> like to see another approach.
>
>
>
> I just made it work for explain, but other parts still need work. Sorry about
> that. If we follow INDEX_VAR, we should be able to get there.
>
I tried to modify your patch a bit as below:
* add adjustment of bitmap fields on setrefs.c
* add support on outfuncs.c and copyfuncs.c.
* add bms_shift_members() in bitmapset.c

I think it is a reasonable enhancement, however, it is not tested with
real-life code, like postgres_fdw.

Hanada-san, could you add a feature to print name of foreign-tables
which are involved in remote queries, on postgresExplainForeignScan()?
ForeignScan->fdw_relids bitmap and ExplainState->rtable_names will
tell you the joined foreign tables replaced by the (pseudo) foreign-scan.

Soon, I'll update the interface patch also.

> My idea adds 'List *fdw_sub_paths' field in ForeignPath to inform
> planner underlying foreign-scan paths (with scanrelid > 0).
> The create_foreignscan_plan() will call create_plan_recurse() to
> construct plan nodes based on the path nodes being attached.
> Even though these foreign-scan nodes are not actually executed,
> setrefs.c can update scanrelid in usual way and ExplainPreScanNode
> does not need to take exceptional handling on Foreign/CustomScan
> nodes.
> In addition, it allows to keep information about underlying foreign
> table scan, even if planner will need some other information in the
> future version (not only relids).
>
> How about your thought?
>
>
>
> I am not sure about keeping planner nodes, which are not turned into execution
> nodes. There's no precedence for that in current code. It could be risky.
>
Indeed, it is a fair enough opinion. At this moment, no other code makes plan
node but shall not be executed actually.
Please forget above idea.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

pgfdw_join.sql

Re: Join push-down support for foreign tables

От

Robert Haas

Дата:

16 марта 2015 г., 12:51:23

On Wed, Mar 4, 2015 at 4:26 AM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:
> Here is v4 patch of Join push-down support for foreign tables.  This
> patch requires Custom/Foreign join patch v7 posted by Kaigai-san.

Hi,

I just want to point out to the folks on this thread that the action
in this area is happening on the other thread, about the
custom/foreign join patch, and that Tom and I are suspecting that we
do not have the right design here.  Your input is needed.

From my end, I am quite skeptical about the way
postgresGetForeignJoinPath in this patch works.  It looks only at the
cheapest total path of the relations to be joined, which seems like it
could easily be wrong.  What about some other path that is more
expensive but provides a convenient sort order?  What about something
like A LEFT JOIN (B JOIN C ON B.x = C.x) ON A.y = B.y AND A.z = C.z,
which can't make a legal join until level 3?  Tom's proposed hook
placement would instead invoke the FDW once per joinrel, passing root
and the joinrel.  Then, you could cost a path based on the idea of
pushing that join entirely to the remote side, or exit without doing
anything if pushdown is not feasible.

Please read the other thread and then respond either there or here
with thoughts on that design.  If you don't provide some input on
this, both of these patches are going to get rejected as lacking
consensus, and we'll move on to other things.  I'd really rather not
ship yet another release without this important feature, but that's
where we're heading if we can't talk this through.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Join push-down support for foreign tables

От

Michael Paquier

Дата:

26 августа 2015 г., 07:46:47

On Mon, Mar 16, 2015 at 9:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Mar 4, 2015 at 4:26 AM, Shigeru Hanada <shigeru.hanada@gmail.com> wrote:
>> Here is v4 patch of Join push-down support for foreign tables.  This
>> patch requires Custom/Foreign join patch v7 posted by Kaigai-san.
>
> Hi,
>
> I just want to point out to the folks on this thread that the action
> in this area is happening on the other thread, about the
> custom/foreign join patch, and that Tom and I are suspecting that we
> do not have the right design here.  Your input is needed.
>
> [...]

Moved for now to the next CF as it was in state "Need review".
-- 
Michael

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Join push-down support for foreign tables

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения