Обсуждение: CREATE ROUTINE MAPPING

Поиск
Список
Период
Сортировка

CREATE ROUTINE MAPPING

От
Corey Huinker
Дата:
A few months ago, I was researching ways for formalizing calling functions on one postgres instance from another. RPC, basically. In doing so, I stumbled across an obscure part of the the SQL Standard called ROUTINE MAPPING, which is exactly what I'm looking for.

The syntax specified is, roughly:

CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
SERVER my_server [ OPTIONS( ... ) ]

Which isn't too different from CREATE USER MAPPING.

The idea here is that if I had a local query:

SELECT t.x, remote_func1(),  remote_func2(t.y)
FROM remote_table t
WHERE t.active = true;

that would become this query on the remote side:

SELECT t.x, local_func1(), local_func2(t.y)
FROM local_table t
WHERE t.active = true;


That was probably the main intention of this feature, but I see a different possibility there. Consider the cases:

SELECT remote_func(1,'a');

and

SELECT * FROM remote_srf(10, true);

Now we could have written remote_func() and remote_srf() in plpythonu, and it could access whatever remote data that we wanted to see, but that exposes our local server to the untrusted pl/python module as well as python process overhead.

We could create a specialized foreign data wrapper that requires a WHERE clause to include all the require parameters as predicates, essentially making every function a table, but that's awkward and unclear to an end user.

Having the ability to import functions from other servers allows us to write foreign servers that expose functions to the local database, and those foreign servers handle the bloat and risks associated with accessing that remote data.

Moreover, it would allow hosted environments (AWS, etc) that restrict the extensions that can be added to the database to still connect to those foreign data sources.

I'm hoping to submit a patch for this someday, but it touches on several areas of the codebase where I have no familiarity, so I've put forth to spark interest in the feature, to see if any similar work is underway, or if anyone can offer guidance.

Thanks in advance.

Re: CREATE ROUTINE MAPPING

От
David Fetter
Дата:
On Thu, Jan 11, 2018 at 09:37:43PM -0500, Corey Huinker wrote:
> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
> 
> The syntax specified is, roughly:
> 
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]

[neat use cases elided]

For what it's worth, the now-defunct DBI-Link I wrote had
remote_execute(), which did many of the things you describe here, only
with no help from the rest of PostgreSQL, as it was implemented
strictly in userland.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

От
Ashutosh Bapat
Дата:
On Fri, Jan 12, 2018 at 8:07 AM, Corey Huinker <corey.huinker@gmail.com> wrote:
> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]
>
>
> Which isn't too different from CREATE USER MAPPING.
>
> The idea here is that if I had a local query:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>

I think this is a desired feature. Being able to call a function on
remote server through local server is often useful.

PostgreSQL allows function overloading, which means that there can be
multiple functions with same name differing in argument types. So, the
syntax has to include the input parameters or their types at least.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: CREATE ROUTINE MAPPING

От
Pavel Stehule
Дата:


2018-01-12 10:02 GMT+01:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
On Fri, Jan 12, 2018 at 8:07 AM, Corey Huinker <corey.huinker@gmail.com> wrote:
> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]
>
>
> Which isn't too different from CREATE USER MAPPING.
>
> The idea here is that if I had a local query:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>

I think this is a desired feature. Being able to call a function on
remote server through local server is often useful.

PostgreSQL allows function overloading, which means that there can be
multiple functions with same name differing in argument types. So, the
syntax has to include the input parameters or their types at least.

+1

Pavel


--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: CREATE ROUTINE MAPPING

От
Corey Huinker
Дата:
PostgreSQL allows function overloading, which means that there can be
multiple functions with same name differing in argument types. So, the
syntax has to include the input parameters or their types at least.

"local_routine_name" and "remote_routine_spec" were my own paraphrasings of what the spec implies. I'm nearly certain that the local routine name, which the spec says is just an identifier, cannot have a parameter spec on it, which leaves only one other place to define it, remote_routine_spec, which wasn't defined at all. I _suppose_ parameter definitions could be pushed into options, but that'd be ugly.

Re: CREATE ROUTINE MAPPING

От
David Fetter
Дата:
On Fri, Jan 12, 2018 at 11:11:26AM -0500, Corey Huinker wrote:
> >
> > PostgreSQL allows function overloading, which means that there can
> > be multiple functions with same name differing in argument types.
> > So, the syntax has to include the input parameters or their types
> > at least.
> 
> "local_routine_name" and "remote_routine_spec" were my own
> paraphrasings of what the spec implies. I'm nearly certain that the
> local routine name, which the spec says is just an identifier,
> cannot have a parameter spec on it, which leaves only one other
> place to define it, remote_routine_spec, which wasn't defined at
> all. I _suppose_ parameter definitions could be pushed into options,
> but that'd be ugly.

In my draft of SQL:2011, which I don't think has substantive changes
to what's either in the official SQL:2011 or SQL:2016, it says:

<routine mapping definition> ::=
CREATE ROUTINE MAPPING <routine mapping name> FOR <specific routine designator>
SERVER <foreign server name> [ <generic options> ]
Syntax Rules
1) Let FSN be the <foreign server name>. Let RMN be the <routine mapping name>.
2) The catalog identified by the explicit or implicit catalog name of FSN shall include a foreign server
descriptor whose foreign server name is equivalent to FSN.
3) The SQL-environment shall not include a routine mapping descriptor whose routine mapping name is
RMN.
4) Let R be the SQL-invoked routine identified by the <specific
routine designator>. R shall identify an SQL-invoked regular function.

It goes on from there, but I think there's a reasonable interpretation
of this which allows us to use the same syntax as CREATE
(FUNCTION|PROCEDURE), apart from the body, e.g.:

CREATE ROUTINE MAPPING local_routine_name
FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [,
...]] )
 
   [ RETURNS rettype
     | RETURNS TABLE ( column_name column_type [, ...] ) ]
SERVER foreign_server_name 
   [ (option [, ...]) ]

Does that seem like too broad an interpretation?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

От
Corey Huinker
Дата:


It goes on from there, but I think there's a reasonable interpretation
of this which allows us to use the same syntax as CREATE
(FUNCTION|PROCEDURE), apart from the body, e.g.:

CREATE ROUTINE MAPPING local_routine_name
FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
   [ RETURNS rettype
     | RETURNS TABLE ( column_name column_type [, ...] ) ]
SERVER foreign_server_name
   [ (option [, ...]) ]

Does that seem like too broad an interpretation?

That's really interesting. I didn't think to look in the definition of CREATE FUNCTION to see if a SERVER option popped in there, but seems like a more accessible way to introduce the notion of remote functions, because I talked to a few developers about this before posting to the list, and only one had ever heard of ROUTINE MAPPING and had no clear recollection of it. An option on CREATE FUNCTION is going to get noticed (and used!) a lot sooner.

Having said that, I think syntactically we have to implement CREATE ROUTINE MAPPING, even if it is just translated to a CREATE FUNCTION call.

In either case, I suspected that pg_proc would need a nullable srvid column pointing to pg_foreign_server, and possibly a new row in pg_language for 'external'. I had entertained having a pg_routine_mappings table like pg_user_mappings, and we still could, if the proc's language of 'external' clued the planner to look for the mapping. I can see arguments for either approach.

Before anyone asks, I looked for, and did not find, any suggestion of IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there wouldn't be any way to grab all the functions that .a foreign server is offering up.

Re: CREATE ROUTINE MAPPING

От
David Fetter
Дата:
On Fri, Jan 12, 2018 at 02:29:53PM -0500, Corey Huinker wrote:
> >
> >
> >
> > It goes on from there, but I think there's a reasonable interpretation
> > of this which allows us to use the same syntax as CREATE
> > (FUNCTION|PROCEDURE), apart from the body, e.g.:
> >
> > CREATE ROUTINE MAPPING local_routine_name
> > FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ]
> > argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
> >    [ RETURNS rettype
> >      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> > SERVER foreign_server_name
> >    [ (option [, ...]) ]
> >
> > Does that seem like too broad an interpretation?
> >
> 
> That's really interesting. I didn't think to look in the definition of
> CREATE FUNCTION to see if a SERVER option popped in there, but seems like a
> more accessible way to introduce the notion of remote functions,

It does indeed.  Adding the functionality to CREATE
(FUNCTION|PROCEDURE) seems like a *much* better idea than trying to
wedge it into the CREATE ROUTINE MAPPING syntax.

> because I talked to a few developers about this before posting to
> the list, and only one had ever heard of ROUTINE MAPPING and had no
> clear recollection of it.  An option on CREATE FUNCTION is going to
> get noticed (and used!) a lot sooner.

+1

> Having said that, I think syntactically we have to implement CREATE ROUTINE
> MAPPING, even if it is just translated to a CREATE FUNCTION call.
> 
> In either case, I suspected that pg_proc would need a nullable srvid column
> pointing to pg_foreign_server, and possibly a new row in pg_language for
> 'external'.

Makes a lot of sense.

> I had entertained having a pg_routine_mappings table like
> pg_user_mappings, and we still could, if the proc's language of
> 'external' clued the planner to look for the mapping. I can see
> arguments for either approach.

It would be good to have them in the catalog somehow if we make CREATE
ROUTINE MAPPING a DDL.  If I've read the standard correctly, there are
parts of information_schema which come into play for those routine
mappings.

> Before anyone asks, I looked for, and did not find, any suggestion of
> IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there
> wouldn't be any way to grab all the functions that .a foreign server is
> offering up.

How about making an option to IMPORT FOREIGN SCHEMA do it?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

От
Corey Huinker
Дата:


> > CREATE ROUTINE MAPPING local_routine_name
> > FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [ argname ]
> > argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
> >    [ RETURNS rettype
> >      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> > SERVER foreign_server_name
> >    [ (option [, ...]) ]
> >
> > Does that seem like too broad an interpretation?
> >
>
> I had entertained having a pg_routine_mappings table like
> pg_user_mappings, and we still could, if the proc's language of
> 'external' clued the planner to look for the mapping. I can see
> arguments for either approach.

It would be good to have them in the catalog somehow if we make CREATE
ROUTINE MAPPING a DDL.  If I've read the standard correctly, there are
parts of information_schema which come into play for those routine
mappings.

> Before anyone asks, I looked for, and did not find, any suggestion of
> IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there
> wouldn't be any way to grab all the functions that .a foreign server is
> offering up.

How about making an option to IMPORT FOREIGN SCHEMA do it?



Ok, so the steps seem to be:
1. settle on syntax.
2. determine data dictionary structures
3. parse and create those structures
4. "handle" external functions locally
5. provide structures passed to FDW handlers so that they can handle external functions
6. implement those handlers in postgres_fdw

#1 is largely prescribed for us, though I'm curious as to how the CRM statements I've made up in examples above would look like as CREATE FUNCTION ... SERVER ...

#2 deserves a lot of debate, but probably mostly hinges on the new "language" and how to associate a pg_proc entry with a pg_foreign_server

#3 i'm guessing this is a lot of borrowing code from CREATE ROUTINE MAPPING but is otherwise pretty straightforward.

#4 an external function obviously cannot be executed locally, doing so means that the planner failed to push it down, so this is probably stub-error functions

#5 These functions would essentially be passed in the same as foreign columns with the "name" as "f(a,b,4)", and the burden of forming the remote query is on the FDW

Which gets tricky. What should happen in simple situations is obvious:

SELECT t.x, remote_func1(),  remote_func2(t.y)
FROM remote_table t
WHERE t.active = true;

that would become this query on the remote side:

SELECT t.x, local_func1(), local_func2(t.y)
FROM local_table t
WHERE t.active = true;

And it's still simple when local functions consume remote input


SELECT local_func1(remote_func1(r.x)) FROM remote_table r WHERE r.active = true;


But other situations seem un-handle-able to me:

SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;


In those cases, at least initially, I think the FDW handler is right to raise an error, because the function inputs are unknowable at query time, and the inputs cannot also be pushed down to the remote server. That might not be common, but I can see situations like this:

SELECT r.*
FROM remote_srf( ( SELECT remote_code_value FROM local_table_of_remote_codes WHERE local_code_value = 'xyz' ) ) r;

and we would want things like that to work. Currently is similar table-situations the FDW has no choice but to fetch the entire table and filter locally. That's good for tables, whose contents are knowable, but the set of possible function inputs is unreasonably large. The current workaround in table-land is to run the inner query locally, and present the result at a constant to a follow-up query, so maybe that's what we have to do here, at least initially.

#6 is where the FDW either does the translation or rejects the notion that functions can be pushed down, either outright or based on the usage of the function in the query.


I'm doing this thinking on the mailing list in the hopes that it evokes suggestions, warnings, suggested code samples, and of course, help.
 

Re: CREATE ROUTINE MAPPING

От
David Fetter
Дата:
On Wed, Jan 17, 2018 at 11:09:19AM -0500, Corey Huinker wrote:
> > > CREATE ROUTINE MAPPING local_routine_name
> > > > FOR (FUNCTION | PROCEDURE) remote_routine_name ( [ [ argmode ] [
> > argname ]
> > > > argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
> > > >    [ RETURNS rettype
> > > >      | RETURNS TABLE ( column_name column_type [, ...] ) ]
> > > > SERVER foreign_server_name
> > > >    [ (option [, ...]) ]
> > > >
> > > > Does that seem like too broad an interpretation?
> > > >
> > >
> > > I had entertained having a pg_routine_mappings table like
> > > pg_user_mappings, and we still could, if the proc's language of
> > > 'external' clued the planner to look for the mapping. I can see
> > > arguments for either approach.
> >
> > It would be good to have them in the catalog somehow if we make CREATE
> > ROUTINE MAPPING a DDL.  If I've read the standard correctly, there are
> > parts of information_schema which come into play for those routine
> > mappings.
> >
> > > Before anyone asks, I looked for, and did not find, any suggestion of
> > > IMPORT FOREIGN ROUTINE a la IMPORT FOREIGN SCHEMA, so as of yet there
> > > wouldn't be any way to grab all the functions that .a foreign server is
> > > offering up.
> >
> > How about making an option to IMPORT FOREIGN SCHEMA do it?
> >
> >
> 
> Ok, so the steps seem to be:
> 1. settle on syntax.
> 2. determine data dictionary structures
> 3. parse and create those structures
> 4. "handle" external functions locally
> 5. provide structures passed to FDW handlers so that they can handle
> external functions
> 6. implement those handlers in postgres_fdw
> 
> #1 is largely prescribed for us, though I'm curious as to how the CRM
> statements I've made up in examples above would look like as CREATE
> FUNCTION ... SERVER ...
> 
> #2 deserves a lot of debate, but probably mostly hinges on the new
> "language" and how to associate a pg_proc entry with a pg_foreign_server
> 
> #3 i'm guessing this is a lot of borrowing code from CREATE ROUTINE MAPPING
> but is otherwise pretty straightforward.
> 
> #4 an external function obviously cannot be executed locally, doing so
> means that the planner failed to push it down, so this is probably
> stub-error functions
> 
> #5 These functions would essentially be passed in the same as foreign
> columns with the "name" as "f(a,b,4)", and the burden of forming the remote
> query is on the FDW
> 
> Which gets tricky. What should happen in simple situations is obvious:
> 
> SELECT t.x, remote_func1(),  remote_func2(t.y)
> 
> FROM remote_table t
> 
> WHERE t.active = true;
> 
> 
> that would become this query on the remote side:
> 
> SELECT t.x, local_func1(), local_func2(t.y)
> 
> FROM local_table t
> 
> WHERE t.active = true;
> 
> And it's still simple when local functions consume remote input
> 
> 
> SELECT local_func1(remote_func1(r.x)) FROM remote_table r WHERE r.active =
> true;
> 
> 
> But other situations seem un-handle-able to me:
> 
> SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;

Do we have any way, or any plan to make a way, to push the set (SELECT
x FROM local_table WHERE active = true) to the remote side for
execution there?  Obviously, there are foreign DBs that couldn't
support this, but I'm guessing they wouldn't have much by way of UDFs
either.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

От
Corey Huinker
Дата:

>
> But other situations seem un-handle-able to me:
>
> SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;

Do we have any way, or any plan to make a way, to push the set (SELECT
x FROM local_table WHERE active = true) to the remote side for
execution there?  Obviously, there are foreign DBs that couldn't
support this, but I'm guessing they wouldn't have much by way of UDFs
either.

No. The remote query has to be generated at planning time, so it can't make predicates out of anything that can't be resolved into constants by the planner itself. The complexities of doing so would be excessive, far better to let the application developer split the queries up because they know better which parts have to resolve first.


Re: CREATE ROUTINE MAPPING

От
David Fetter
Дата:
On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
> >
> >
> > >
> > > But other situations seem un-handle-able to me:
> > >
> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
> >
> > Do we have any way, or any plan to make a way, to push the set (SELECT
> > x FROM local_table WHERE active = true) to the remote side for
> > execution there?  Obviously, there are foreign DBs that couldn't
> > support this, but I'm guessing they wouldn't have much by way of UDFs
> > either.
> >
> 
> No. The remote query has to be generated at planning time, so it can't make
> predicates out of anything that can't be resolved into constants by the
> planner itself. The complexities of doing so would be excessive, far better
> to let the application developer split the queries up because they know
> better which parts have to resolve first.

So Corey and I, with lots of inputs from Andrew Gierth and Matheus
Oliveira, have come up with a sketch of how to do this, to wit:

- Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
  LANGUAGE as parameters, but not both. This seems simpler, at least
  in a proof of concept, than creating SQL standard compliant grammar
  out of whole cloth.  The SQL standard grammar could be layered in
  later via the rewriter if this turns out to work.

- In pg_proc, store foreign functions as having a new language,
  sql_med, which doesn't actually exist.  This "language" would
  function as a hint to the planner.

- Add a new system catalog for foreign functions that references
  pg_proc and pg_foreign_server. Writing to it would also do the usual
  stuff with pg_depend.

- During planning, at least to start, we'd ensure that foreign
  functions can only take arguments on the same server.

- Once it's established that the combinations could actually work,
  execution gets pushed to the foreign server(s)

What say?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

От
Ashutosh Bapat
Дата:
On Thu, Jan 25, 2018 at 10:43 AM, David Fetter <david@fetter.org> wrote:
> On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
>> >
>> >
>> > >
>> > > But other situations seem un-handle-able to me:
>> > >
>> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
>> >
>> > Do we have any way, or any plan to make a way, to push the set (SELECT
>> > x FROM local_table WHERE active = true) to the remote side for
>> > execution there?  Obviously, there are foreign DBs that couldn't
>> > support this, but I'm guessing they wouldn't have much by way of UDFs
>> > either.
>> >
>>
>> No. The remote query has to be generated at planning time, so it can't make
>> predicates out of anything that can't be resolved into constants by the
>> planner itself. The complexities of doing so would be excessive, far better
>> to let the application developer split the queries up because they know
>> better which parts have to resolve first.
>
> So Corey and I, with lots of inputs from Andrew Gierth and Matheus
> Oliveira, have come up with a sketch of how to do this, to wit:
>
> - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
>   LANGUAGE as parameters, but not both. This seems simpler, at least
>   in a proof of concept, than creating SQL standard compliant grammar
>   out of whole cloth.  The SQL standard grammar could be layered in
>   later via the rewriter if this turns out to work.
>
> - In pg_proc, store foreign functions as having a new language,
>   sql_med, which doesn't actually exist.  This "language" would
>   function as a hint to the planner.
>
> - Add a new system catalog for foreign functions that references
>   pg_proc and pg_foreign_server. Writing to it would also do the usual
>   stuff with pg_depend.
>
> - During planning, at least to start, we'd ensure that foreign
>   functions can only take arguments on the same server.

May be I am going in details, not expected at this stage. Right now
FDWs have a notion of shippability - i.e. certain expressions can be
evaluated on the remote server. Shippable expressions are pushed down
to the foreign server, but that's optional. Unshippable expressions
however can not be pushed down to the foreign server. With this
change, we will have a new notion of shippability where a
function/expression must be shipped to the foreign server. As long as
these strict-shippable expressions are part of shippable expressions,
things work as they are today, but as an earlier mail by Corey shows,
if those are expressions are not part of shippable expressions, they
need to be evaluated on foreign server apart from the query that gets
pushed down. You seem to be suggesting that we do not implement it
right now, which is fine. But whatever design we chose should be
extensible to do that.

A possible way to implement this may be to implement sql-med language
handler which takes the responsibility to interact with FDW and
evaluate the function. That way we can use existing function
evaluation infrastructure.

>
> - Once it's established that the combinations could actually work,
>   execution gets pushed to the foreign server(s)
>

Overall this structure looks ok to me.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: CREATE ROUTINE MAPPING

От
Masahiko Sawada
Дата:
On Thu, Jan 25, 2018 at 2:13 PM, David Fetter <david@fetter.org> wrote:
> On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
>> >
>> >
>> > >
>> > > But other situations seem un-handle-able to me:
>> > >
>> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
>> >
>> > Do we have any way, or any plan to make a way, to push the set (SELECT
>> > x FROM local_table WHERE active = true) to the remote side for
>> > execution there?  Obviously, there are foreign DBs that couldn't
>> > support this, but I'm guessing they wouldn't have much by way of UDFs
>> > either.
>> >
>>
>> No. The remote query has to be generated at planning time, so it can't make
>> predicates out of anything that can't be resolved into constants by the
>> planner itself. The complexities of doing so would be excessive, far better
>> to let the application developer split the queries up because they know
>> better which parts have to resolve first.
>
> So Corey and I, with lots of inputs from Andrew Gierth and Matheus
> Oliveira, have come up with a sketch of how to do this, to wit:
>
> - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
>   LANGUAGE as parameters, but not both. This seems simpler, at least
>   in a proof of concept, than creating SQL standard compliant grammar
>   out of whole cloth.  The SQL standard grammar could be layered in
>   later via the rewriter if this turns out to work.

I'm also interested in this feature. While studying this feature, I
understood that this feature just pair a local function with a remote
function, not means that creates a kind of virtual function that can
be invoked on only foreign servers. For example, if we execute the
following SQL the local_func() is invoked in local because the col1
column of local_table is referenced by it.

SELECT * FROM local_table l WHERE local_func(l.col1) = 1;

On the other hand, suppose we have the following routine mapping,

CREATE ROUTINE MAPPING rmap FOR local_func(integer) OPTIONS
(remote_func_schema = 'myschema', remote_func_name = 'remote_func');

and execute the similar SQL for a foreign table. We will get the
following remote SQL.

- Local SQL
SELECT * FROM foreign_table f WHERE local_func(f.col1) = 1;

- Remote SQL
SELECT * FROM foreign_table f WHERE my_schema.remote_func(f.col1) = 1;

In this concept, the CREATE ROUTINE MAPPING doesn't need to specify
the return type of function but must specify the existing function in
the local PostgreSQL. The mapped remote function is expected to have
the same properly(arguments, return type etc) as the local function. I
might be missing something, please give me feedback.

Please find a attached PoC patch of ROUTINE MAPPING feature. This
patch is missing many things such as the doc and the shippability
supports but this patch adds the new system catalog pg_routine_mapping
with three attributes: name, procid, serverid and enables FDWs to
refer this mapping and and to replace the function.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Вложения

Re: CREATE ROUTINE MAPPING

От
David Fetter
Дата:
On Fri, Aug 31, 2018 at 05:18:26PM +0900, Masahiko Sawada wrote:
> On Thu, Jan 25, 2018 at 2:13 PM, David Fetter <david@fetter.org> wrote:
> > On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
> >> >
> >> >
> >> > >
> >> > > But other situations seem un-handle-able to me:
> >> > >
> >> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
> >> >
> >> > Do we have any way, or any plan to make a way, to push the set (SELECT
> >> > x FROM local_table WHERE active = true) to the remote side for
> >> > execution there?  Obviously, there are foreign DBs that couldn't
> >> > support this, but I'm guessing they wouldn't have much by way of UDFs
> >> > either.
> >> >
> >>
> >> No. The remote query has to be generated at planning time, so it can't make
> >> predicates out of anything that can't be resolved into constants by the
> >> planner itself. The complexities of doing so would be excessive, far better
> >> to let the application developer split the queries up because they know
> >> better which parts have to resolve first.
> >
> > So Corey and I, with lots of inputs from Andrew Gierth and Matheus
> > Oliveira, have come up with a sketch of how to do this, to wit:
> >
> > - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
> >   LANGUAGE as parameters, but not both. This seems simpler, at least
> >   in a proof of concept, than creating SQL standard compliant grammar
> >   out of whole cloth.  The SQL standard grammar could be layered in
> >   later via the rewriter if this turns out to work.
> 
> I'm also interested in this feature. While studying this feature, I
> understood that this feature just pair a local function with a remote
> function, not means that creates a kind of virtual function that can
> be invoked on only foreign servers. For example, if we execute the
> following SQL the local_func() is invoked in local because the col1
> column of local_table is referenced by it.
> 
> SELECT * FROM local_table l WHERE local_func(l.col1) = 1;
> 
> On the other hand, suppose we have the following routine mapping,
> 
> CREATE ROUTINE MAPPING rmap FOR local_func(integer) OPTIONS
> (remote_func_schema = 'myschema', remote_func_name = 'remote_func');
> 
> and execute the similar SQL for a foreign table. We will get the
> following remote SQL.
> 
> - Local SQL
> SELECT * FROM foreign_table f WHERE local_func(f.col1) = 1;
> 
> - Remote SQL
> SELECT * FROM foreign_table f WHERE my_schema.remote_func(f.col1) = 1;
> 
> In this concept, the CREATE ROUTINE MAPPING doesn't need to specify
> the return type of function but must specify the existing function in
> the local PostgreSQL. The mapped remote function is expected to have
> the same properly(arguments, return type etc) as the local function. I
> might be missing something, please give me feedback.
> 
> Please find a attached PoC patch of ROUTINE MAPPING feature. This
> patch is missing many things such as the doc and the shippability
> supports but this patch adds the new system catalog pg_routine_mapping
> with three attributes: name, procid, serverid and enables FDWs to
> refer this mapping and and to replace the function.

Sawada-san,

Thanks very much for sending this. I just tried to compile it, and
got:

make
make -C ./src/backend generated-headers
make[1]: Entering directory '/home/shackle/pggit/postgresql/src/backend'
make -C catalog distprep generated-header-symlinks
make[2]: Entering directory '/home/shackle/pggit/postgresql/src/backend/catalog'
make[2]: *** No rule to make target '../../../src/include/catalog/pg_routine_mapping.h', needed by 'bki-stamp'.  Stop.
make[2]: Leaving directory '/home/shackle/pggit/postgresql/src/backend/catalog'
make[1]: *** [Makefile:141: submake-catalog-headers] Error 2
make[1]: Leaving directory '/home/shackle/pggit/postgresql/src/backend'
make: *** [src/Makefile.global:369: submake-generated-headers] Error 2

Was there a file missing from the patch?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

От
Masahiko Sawada
Дата:
On Tue, Sep 4, 2018 at 5:48 AM, David Fetter <david@fetter.org> wrote:
> On Fri, Aug 31, 2018 at 05:18:26PM +0900, Masahiko Sawada wrote:
>> On Thu, Jan 25, 2018 at 2:13 PM, David Fetter <david@fetter.org> wrote:
>> > On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
>> >> >
>> >> >
>> >> > >
>> >> > > But other situations seem un-handle-able to me:
>> >> > >
>> >> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
>> >> >
>> >> > Do we have any way, or any plan to make a way, to push the set (SELECT
>> >> > x FROM local_table WHERE active = true) to the remote side for
>> >> > execution there?  Obviously, there are foreign DBs that couldn't
>> >> > support this, but I'm guessing they wouldn't have much by way of UDFs
>> >> > either.
>> >> >
>> >>
>> >> No. The remote query has to be generated at planning time, so it can't make
>> >> predicates out of anything that can't be resolved into constants by the
>> >> planner itself. The complexities of doing so would be excessive, far better
>> >> to let the application developer split the queries up because they know
>> >> better which parts have to resolve first.
>> >
>> > So Corey and I, with lots of inputs from Andrew Gierth and Matheus
>> > Oliveira, have come up with a sketch of how to do this, to wit:
>> >
>> > - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
>> >   LANGUAGE as parameters, but not both. This seems simpler, at least
>> >   in a proof of concept, than creating SQL standard compliant grammar
>> >   out of whole cloth.  The SQL standard grammar could be layered in
>> >   later via the rewriter if this turns out to work.
>>
>> I'm also interested in this feature. While studying this feature, I
>> understood that this feature just pair a local function with a remote
>> function, not means that creates a kind of virtual function that can
>> be invoked on only foreign servers. For example, if we execute the
>> following SQL the local_func() is invoked in local because the col1
>> column of local_table is referenced by it.
>>
>> SELECT * FROM local_table l WHERE local_func(l.col1) = 1;
>>
>> On the other hand, suppose we have the following routine mapping,
>>
>> CREATE ROUTINE MAPPING rmap FOR local_func(integer) OPTIONS
>> (remote_func_schema = 'myschema', remote_func_name = 'remote_func');
>>
>> and execute the similar SQL for a foreign table. We will get the
>> following remote SQL.
>>
>> - Local SQL
>> SELECT * FROM foreign_table f WHERE local_func(f.col1) = 1;
>>
>> - Remote SQL
>> SELECT * FROM foreign_table f WHERE my_schema.remote_func(f.col1) = 1;
>>
>> In this concept, the CREATE ROUTINE MAPPING doesn't need to specify
>> the return type of function but must specify the existing function in
>> the local PostgreSQL. The mapped remote function is expected to have
>> the same properly(arguments, return type etc) as the local function. I
>> might be missing something, please give me feedback.
>>
>> Please find a attached PoC patch of ROUTINE MAPPING feature. This
>> patch is missing many things such as the doc and the shippability
>> supports but this patch adds the new system catalog pg_routine_mapping
>> with three attributes: name, procid, serverid and enables FDWs to
>> refer this mapping and and to replace the function.
>
> Sawada-san,
>
> Thanks very much for sending this. I just tried to compile it, and
> got:
>
> make
> make -C ./src/backend generated-headers
> make[1]: Entering directory '/home/shackle/pggit/postgresql/src/backend'
> make -C catalog distprep generated-header-symlinks
> make[2]: Entering directory '/home/shackle/pggit/postgresql/src/backend/catalog'
> make[2]: *** No rule to make target '../../../src/include/catalog/pg_routine_mapping.h', needed by 'bki-stamp'.
Stop.
> make[2]: Leaving directory '/home/shackle/pggit/postgresql/src/backend/catalog'
> make[1]: *** [Makefile:141: submake-catalog-headers] Error 2
> make[1]: Leaving directory '/home/shackle/pggit/postgresql/src/backend'
> make: *** [src/Makefile.global:369: submake-generated-headers] Error 2
>
> Was there a file missing from the patch?
>

Oops, sorry I missed the header file. Attached the PoC patches again.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Вложения

Re: CREATE ROUTINE MAPPING

От
Kyotaro HORIGUCHI
Дата:
Hello.

At Tue, 4 Sep 2018 09:34:21 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoCBfTKRFPwboss4xVEoVwUmi0gKBgwsWQijviJP3hScwQ@mail.gmail.com>
> On Tue, Sep 4, 2018 at 5:48 AM, David Fetter <david@fetter.org> wrote:
> > On Fri, Aug 31, 2018 at 05:18:26PM +0900, Masahiko Sawada wrote:
> >> On Thu, Jan 25, 2018 at 2:13 PM, David Fetter <david@fetter.org> wrote:
> >> > On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
> >> >> >
> >> >> >
> >> >> > >
> >> >> > > But other situations seem un-handle-able to me:
> >> >> > >
> >> >> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
> >> >> >
> >> >> > Do we have any way, or any plan to make a way, to push the set (SELECT
> >> >> > x FROM local_table WHERE active = true) to the remote side for
> >> >> > execution there?  Obviously, there are foreign DBs that couldn't
> >> >> > support this, but I'm guessing they wouldn't have much by way of UDFs
> >> >> > either.
> >> >> >
> >> >>
> >> >> No. The remote query has to be generated at planning time, so it can't make
> >> >> predicates out of anything that can't be resolved into constants by the
> >> >> planner itself. The complexities of doing so would be excessive, far better
> >> >> to let the application developer split the queries up because they know
> >> >> better which parts have to resolve first.
> >> >
> >> > So Corey and I, with lots of inputs from Andrew Gierth and Matheus
> >> > Oliveira, have come up with a sketch of how to do this, to wit:
> >> >
> >> > - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
> >> >   LANGUAGE as parameters, but not both. This seems simpler, at least
> >> >   in a proof of concept, than creating SQL standard compliant grammar
> >> >   out of whole cloth.  The SQL standard grammar could be layered in
> >> >   later via the rewriter if this turns out to work.
> >>
> >> I'm also interested in this feature. While studying this feature, I
> >> understood that this feature just pair a local function with a remote
> >> function, not means that creates a kind of virtual function that can
> >> be invoked on only foreign servers. For example, if we execute the
> >> following SQL the local_func() is invoked in local because the col1
> >> column of local_table is referenced by it.

Do you mean that ISO/IEC 9075-9:2016 (right?) is defining that
(and we must follow it)?  Or does it comes by referring to
something like [1]? As far as I see David's mail upthread,
OPTIONS is not precisely defined.

[1] http://cs.unibo.it/~montesi/CBD/Articoli/SQL-MED.pdf

Unfortunately I don't have access to the document nor concrete
use cases. With a rough idea of "remote mapping", I can guess the
followng four use cases.  Each example syntax is just a guess
without any consideration on implementability or other
restrictions. The patch looks currently covering B.

A. Just notify a function can be just pushed down.

  ex. SELECT foo(1, 'bar');  Remote: SELECT foo(1, 'bar');

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem;
   (or same as B)

B. Replace function name with the remote equivalent.

  ex. SELECT foo(1, 'bar');  Remote: SELECT hoge(1, 'bar');

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
   OPTIONS (remote_func_name 'hoge'));

C. Adjust function specification with remote.

  ex. SELECT foo(1, 'bar');  Remote: SELECT hoge('bar', 1, true);

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
   OPTIONS (remote_expression 'hoge($2,$1,true)');
  
D. Replace with an equivalent remote expression.

  ex. SELECT foo(1, 'bar');  Remote: SELECT ('bar' || to_char(1 % 10));

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
      OPTIONS (remote_expression '$2 || to_char($1 % 10)');


I haven't looked the patch in depth, but the core side looks
generic and the FDW side is extensible to A, C and D. I think B
is enough as a starter. I don't mean that we should implement all
of them. They are just possibilities.


I have some comments on the patch.

It doesn't seem working. Am I missing something?
====
create server sv1 foreign data wrapper postgres_fdw options (host '/tmp', port '5432', dbname 'postgres');
create table lt (a int);
create foreign table ft (a int) server sv1 options (table_name 'lt');
create function lhoge(int) returns int as 'begin return $1 * 2; end;' language plpgsql;
create routine mapping rm1 for function lhoge(int) server sv1 options (remote_func_name 'rhoge');
explain verbose select * from ft where a = lhoge(3);
                            QUERY PLAN                            
------------------------------------------------------------------
 Foreign Scan on public.ft  (cost=100.00..936.31 rows=15 width=4)
   Output: a
   Filter: (ft.a = lhoge(3))
   Remote SQL: SELECT a FROM public.lt
(4 rows)
====

Perhaps it cannot push down simple SQL local functions. (I'm not
sure we should do that.)

Can't we specify remote schema in remote_func_name just as
(remote_func_name "fooschema.funcname")?

Can't we provide the syntax without making MAPPING reserved?

 Chainging the syntax for ALTER/DROP ROUTINE MAPPING like USER
 MAPPING, specifically to ALTER/DROP ROUTINE MAPPING FOR FUNCTION
 fname SERVER sname would evade the conflict.

Can't we just push down the function itself with no option defined?

  CREATE ROUTINE MAPPING rm1 FOR FUNCTION pg_backend_pid() SERVER sv1;

funcid is not used in objectaddress.c.

(The patch is missing a LF in a comment in lsyscache.c)



regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: CREATE ROUTINE MAPPING

От
Hannu Krosing
Дата:
Hi Corey

Have you looked at pl/proxy ?

It does this and then some (sharding) 

It actually started out as a set of pl/pythonu functions, but then got formalized into a full extension language for defining remote (potentially sharded) function calls


Best Regards
Hannu Krosng



On Fri, 12 Jan 2018 at 03:38, Corey Huinker <corey.huinker@gmail.com> wrote:
A few months ago, I was researching ways for formalizing calling functions on one postgres instance from another. RPC, basically. In doing so, I stumbled across an obscure part of the the SQL Standard called ROUTINE MAPPING, which is exactly what I'm looking for.

The syntax specified is, roughly:

CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
SERVER my_server [ OPTIONS( ... ) ]

Which isn't too different from CREATE USER MAPPING.

The idea here is that if I had a local query:

SELECT t.x, remote_func1(),  remote_func2(t.y)
FROM remote_table t
WHERE t.active = true;

that would become this query on the remote side:

SELECT t.x, local_func1(), local_func2(t.y)
FROM local_table t
WHERE t.active = true;


That was probably the main intention of this feature, but I see a different possibility there. Consider the cases:

SELECT remote_func(1,'a');

and

SELECT * FROM remote_srf(10, true);

Now we could have written remote_func() and remote_srf() in plpythonu, and it could access whatever remote data that we wanted to see, but that exposes our local server to the untrusted pl/python module as well as python process overhead.

We could create a specialized foreign data wrapper that requires a WHERE clause to include all the require parameters as predicates, essentially making every function a table, but that's awkward and unclear to an end user.

Having the ability to import functions from other servers allows us to write foreign servers that expose functions to the local database, and those foreign servers handle the bloat and risks associated with accessing that remote data.

Moreover, it would allow hosted environments (AWS, etc) that restrict the extensions that can be added to the database to still connect to those foreign data sources.

I'm hoping to submit a patch for this someday, but it touches on several areas of the codebase where I have no familiarity, so I've put forth to spark interest in the feature, to see if any similar work is underway, or if anyone can offer guidance.

Thanks in advance.

Re: CREATE ROUTINE MAPPING

От
Corey Huinker
Дата:
On Mon, Sep 10, 2018 at 3:28 AM Hannu Krosing <hannu.krosing@2ndquadrant.com> wrote:
Hi Corey

Have you looked at pl/proxy ?

I have, a long while ago.
 
It does this and then some (sharding) 

PL/proxy isn't a part of the SQL Standard.
PL/proxy only connects to other libpq-speaking databases.
The hope with routine mapping is that other data sources that do not easily conform to a rows-and-columns metaphor can still expose their data to postgresql.

 

Re: CREATE ROUTINE MAPPING

От
Masahiko Sawada
Дата:
Thank you for the comment.

On Mon, Sep 10, 2018 at 4:16 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello.
>
> At Tue, 4 Sep 2018 09:34:21 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoCBfTKRFPwboss4xVEoVwUmi0gKBgwsWQijviJP3hScwQ@mail.gmail.com>
>> On Tue, Sep 4, 2018 at 5:48 AM, David Fetter <david@fetter.org> wrote:
>> > On Fri, Aug 31, 2018 at 05:18:26PM +0900, Masahiko Sawada wrote:
>> >> On Thu, Jan 25, 2018 at 2:13 PM, David Fetter <david@fetter.org> wrote:
>> >> > On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
>> >> >> >
>> >> >> >
>> >> >> > >
>> >> >> > > But other situations seem un-handle-able to me:
>> >> >> > >
>> >> >> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
>> >> >> >
>> >> >> > Do we have any way, or any plan to make a way, to push the set (SELECT
>> >> >> > x FROM local_table WHERE active = true) to the remote side for
>> >> >> > execution there?  Obviously, there are foreign DBs that couldn't
>> >> >> > support this, but I'm guessing they wouldn't have much by way of UDFs
>> >> >> > either.
>> >> >> >
>> >> >>
>> >> >> No. The remote query has to be generated at planning time, so it can't make
>> >> >> predicates out of anything that can't be resolved into constants by the
>> >> >> planner itself. The complexities of doing so would be excessive, far better
>> >> >> to let the application developer split the queries up because they know
>> >> >> better which parts have to resolve first.
>> >> >
>> >> > So Corey and I, with lots of inputs from Andrew Gierth and Matheus
>> >> > Oliveira, have come up with a sketch of how to do this, to wit:
>> >> >
>> >> > - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
>> >> >   LANGUAGE as parameters, but not both. This seems simpler, at least
>> >> >   in a proof of concept, than creating SQL standard compliant grammar
>> >> >   out of whole cloth.  The SQL standard grammar could be layered in
>> >> >   later via the rewriter if this turns out to work.
>> >>
>> >> I'm also interested in this feature. While studying this feature, I
>> >> understood that this feature just pair a local function with a remote
>> >> function, not means that creates a kind of virtual function that can
>> >> be invoked on only foreign servers. For example, if we execute the
>> >> following SQL the local_func() is invoked in local because the col1
>> >> column of local_table is referenced by it.
>
> Do you mean that ISO/IEC 9075-9:2016 (right?) is defining that
> (and we must follow it)?  Or does it comes by referring to
> something like [1]? As far as I see David's mail upthread,
> OPTIONS is not precisely defined.

Yeah, I read [1] and the final committee draft ISO/IEC 9075-9:2006, it
might be old though.

>
> [1] http://cs.unibo.it/~montesi/CBD/Articoli/SQL-MED.pdf
>
> Unfortunately I don't have access to the document nor concrete
> use cases. With a rough idea of "remote mapping", I can guess the
> followng four use cases.  Each example syntax is just a guess
> without any consideration on implementability or other
> restrictions. The patch looks currently covering B.

Thank you for summarizing.

>
> A. Just notify a function can be just pushed down.
>
>   ex. SELECT foo(1, 'bar');  Remote: SELECT foo(1, 'bar');
>
>    CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem;
>    (or same as B)
>
> B. Replace function name with the remote equivalent.
>
>   ex. SELECT foo(1, 'bar');  Remote: SELECT hoge(1, 'bar');
>
>    CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
>    OPTIONS (remote_func_name 'hoge'));
>
> C. Adjust function specification with remote.
>
>   ex. SELECT foo(1, 'bar');  Remote: SELECT hoge('bar', 1, true);
>
>    CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
>    OPTIONS (remote_expression 'hoge($2,$1,true)');
>
> D. Replace with an equivalent remote expression.
>
>   ex. SELECT foo(1, 'bar');  Remote: SELECT ('bar' || to_char(1 % 10));
>
>    CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
>       OPTIONS (remote_expression '$2 || to_char($1 % 10)');
>
> I haven't looked the patch in depth, but the core side looks
> generic and the FDW side is extensible to A, C and D. I think B
> is enough as a starter.
> I don't mean that we should implement all of them. They are just possibilities.

I agree that this feature covers A and B as the first step. But I'm
concerned that for D (and maybe for C?) the volatility of mapped
function could be changed. That is, currently we allow to push down
only immutable functions but they might break it. Also, can the
replacing a function with any expression be a risk of sql injections?

Also, according to the standard the routine mapping seems to work when
columns of referenced foreign table are passed to the function that is
mapped to the remote function. The functions in WHERE clause will
obviously be mapped but I'm not sure for function in target lists.

>
>
> I have some comments on the patch.
>
> It doesn't seem working. Am I missing something?
> ====
> create server sv1 foreign data wrapper postgres_fdw options (host '/tmp', port '5432', dbname 'postgres');
> create table lt (a int);
> create foreign table ft (a int) server sv1 options (table_name 'lt');
> create function lhoge(int) returns int as 'begin return $1 * 2; end;' language plpgsql;
> create routine mapping rm1 for function lhoge(int) server sv1 options (remote_func_name 'rhoge');
> explain verbose select * from ft where a = lhoge(3);
>                             QUERY PLAN
> ------------------------------------------------------------------
>  Foreign Scan on public.ft  (cost=100.00..936.31 rows=15 width=4)
>    Output: a
>    Filter: (ft.a = lhoge(3))
>    Remote SQL: SELECT a FROM public.lt
> (4 rows)
> ====
>
> Perhaps it cannot push down simple SQL local functions. (I'm not
> sure we should do that.)

Yeah, the current patch maps only when the function can be pushed down
to the foreign server. So maybe you can use either the built-in
functions or functions that are provided an extension that is listed
in 'extension' option.

>
> Can't we specify remote schema in remote_func_name just as
> (remote_func_name "fooschema.funcname")?

That's a possible option.

>
> Can't we provide the syntax without making MAPPING reserved?

I think it's possible but I haven't tried it hard yet. The problem I
faced is that, for example, without that the "DROP ROUTINE MAPPING
name" where the name can be the ColId conflicts with the "DROP ROUTINE
function_with_argtype_list [CASCADE | RESTRICT]" where
function_with_argtype_list can be the ColId as well.

>
>  Chainging the syntax for ALTER/DROP ROUTINE MAPPING like USER
>  MAPPING, specifically to ALTER/DROP ROUTINE MAPPING FOR FUNCTION
>  fname SERVER sname would evade the conflict.

Hmm, the syntax of ALTER/DROP ROUTINE MAPPING follow the standard.
Maybe we can discuss later.

>
> Can't we just push down the function itself with no option defined?
>
>   CREATE ROUTINE MAPPING rm1 FOR FUNCTION pg_backend_pid() SERVER sv1;

Yeah, it would be an good idea.

>
> funcid is not used in objectaddress.c.
>
> (The patch is missing a LF in a comment in lsyscache.c)
>

Thanks, I'll fix them in the next patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: CREATE ROUTINE MAPPING

От
David Fetter
Дата:
On Mon, Sep 10, 2018 at 09:28:31AM +0200, Hannu Krosing wrote:
> Hi Corey
> 
> Have you looked at pl/proxy ?

DBI-Link pre-dated PL/proxy by some years, and was a good bit more
flexible as to what types of functions it could send where. Neither
has a capability fundamentally similar to this because neither had any
way to interact with the planner, other quals, etc.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: CREATE ROUTINE MAPPING

От
Tom Lane
Дата:
[ I wasn't paying much attention to this thread at the time, but
  Kyotaro-san just drew my attention to it again ]

Masahiko Sawada <sawada.mshk@gmail.com> writes:
> I agree that this feature covers A and B as the first step. But I'm
> concerned that for D (and maybe for C?) the volatility of mapped
> function could be changed. That is, currently we allow to push down
> only immutable functions but they might break it. Also, can the
> replacing a function with any expression be a risk of sql injections?

Yes, I'm afraid that there's serious security issues here that
the SQL standard fails to address.  Let's assume that user A has
created a function F(), and user B has created a foreign server S.
What privileges should be required for user C to create a mapping
for F on S?  AFAICS, the spec only requires C to have USAGE on S.
This means that all C needs is USAGE on S to create a trojan horse
that will execute arbitrary code when any other user D executes
a query using F on S.  C doesn't need to have any privilege on F
at all, and USAGE is not exactly strong privilege on S --- you
certainly wouldn't expect that "USAGE" translates to "I can
backdoor anybody else's usage of this server".

I see that SQL:2011's access rules for "CREATE ROUTINE MAPPING" are

    1) The applicable privileges shall include the USAGE privilege on
    the foreign server identified by FSN.

    2) Additional privileges, if any, necessary to execute <routine
    mapping definition> are implementation-defined.

It seems to me that (2) should be read as "we know we blew it here,
but we're leaving it up to implementors to fix this".

Some of the alternatives that were discussed upthread basically replace
this whole idea with attaching properties to the original function F.
I think I like that a lot better from a security perspective.  If you are
calling F in your query, you are already placing trust in F's owner.
Better to let F's owner define how it maps to functions on remote servers
than to let random third parties define that.

            regards, tom lane