Обсуждение: Extension support for postgres_fdw
I would like to enhance the postgres_fdw to allow more complete support for user-defined types. Right now, postgres_fdw already does a good job of passing user-defined type data back and forth, which is pretty nice. However,it will not pass functions or operators that use user-defined types to the remote host. For a extension like PostGIS,that means that spatial filters cannot be executed on remote servers, which makes FDW not so useful for PostGIS. I think the postgres_fdw extension should pass user-defined functions and operators, but only when it knows those functionsand operators exist at the remote. One way would be to ask the remote what extensions it has, but the overhead ofdoing that is a bit high. A simpler way would be to just have the DBA declare what extensions the remote will have, whenshe creates the server definition, for example: CREATE SERVER fire_department_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS ( host 'fire.city.gov', dbname 'infrastructure', port '5432', extensions 'postgis, seg' ); Once the local host knows what extensions to expect on the remote side, it can retain functions and operators in those extensionsin the set of remote restrictions and deparse them for use in the query of the remote. Basically, everywhere thereis a call to is_builtin(Oid oid) now, there's also be a call to is_allowed_extension() (or somesuch) as well. There is a PostGIS-specific implementation of this concept here: https://github.com/pramsey/postgres/blob/9.4-postgres-fdw-postgis/contrib/postgres_fdw If the approach above sounds OK, I'll genericize my PostGIS specific code to hand arbitrary extensions and submit a patch. Thanks! Paul -- Paul Ramsey http://cleverelephant.ca http://postgis.net
On 06/20/2015 10:20 AM, Paul Ramsey wrote: > I would like to enhance the postgres_fdw to allow more complete support for user-defined types. > > Right now, postgres_fdw already does a good job of passing user-defined type data back and forth, which is pretty nice.However, it will not pass functions or operators that use user-defined types to the remote host. For a extension likePostGIS, that means that spatial filters cannot be executed on remote servers, which makes FDW not so useful for PostGIS. > > I think the postgres_fdw extension should pass user-defined functions and operators, but only when it knows those functionsand operators exist at the remote. One way would be to ask the remote what extensions it has, but the overhead ofdoing that is a bit high. A simpler way would be to just have the DBA declare what extensions the remote will have, whenshe creates the server definition, for example: > > CREATE SERVER fire_department_server > FOREIGN DATA WRAPPER postgres_fdw > OPTIONS ( > host 'fire.city.gov', > dbname 'infrastructure', > port '5432', > extensions 'postgis, seg' > ); > > Once the local host knows what extensions to expect on the remote side, it can retain functions and operators in thoseextensions in the set of remote restrictions and deparse them for use in the query of the remote. Basically, everywherethere is a call to is_builtin(Oid oid) now, there's also be a call to is_allowed_extension() (or somesuch) as well. The SQL/MED specification has a concept of routine mappings, for mapping functions. I'm not sure if it'd be better to implement CREATE ROUTINE MAPPING, and adapt it to operators too, or invent something entirely new and PostgreSQL-specific. But CREATE ROUTINE MAPPING at least deserves a look. - Heikki
Paul Ramsey <pramsey@cleverelephant.ca> writes: > ... I think the postgres_fdw extension should pass user-defined > functions and operators, but only when it knows those functions and > operators exist at the remote. One way would be to ask the remote what > extensions it has, but the overhead of doing that is a bit high. A > simpler way would be to just have the DBA declare what extensions the > remote will have, when she creates the server definition, for example: [ For the record, this idea was discussed a bit at PGCon. ] The key question here is whether filtering functions/operators at the level of extensions is a good design. It seems to me like a reasonable compromise between flexibility and ease of use, but others might see it differently. Josh Berkus indicated that he would also want a way to mark individual functions/operators as transmittable, but clearly that would not scale very well to large extensions like PostGIS. I don't feel a need to insist that Paul include such a feature in his patch (unless he wants to, of course) --- per-function marking seems like a separate, though related, feature. Note that no matter what the details are, something like this is putting the onus on the DBA to mark as transmittable only functions that actually are safe to transmit, ie they exist *and have identical semantics* on the remote. I think that's fine as long as it's clearly documented. (Presumably, only immutable functions would get transmitted, even if there are mutable functions present in a marked extension.) regards, tom lane
On 20 June 2015 at 18:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
--
The key question here is whether filtering functions/operators at the
level of extensions is a good design. It seems to me like a reasonable
compromise between flexibility and ease of use, but others might see it
differently.
I like that, but currently we handle things in terms of Schemas. It would be strange to have differing ways of specifying groups of objects. Maybe that's not a problem, but we'd probably need to analyse that to make sure it didn't make things more complex.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes: > On 20 June 2015 at 18:19, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The key question here is whether filtering functions/operators at the >> level of extensions is a good design. It seems to me like a reasonable >> compromise between flexibility and ease of use, but others might see it >> differently. > I like that, but currently we handle things in terms of Schemas. It would > be strange to have differing ways of specifying groups of objects. Maybe > that's not a problem, but we'd probably need to analyse that to make sure > it didn't make things more complex. Fair point, but I think making it schema-based would be pretty awkward for many common use-cases. By default, at least, all extensions get dropped into schema public. I doubt it would be a good idea to say "anything in public is transmittable". regards, tom lane
On 6/20/15 12:19 PM, Tom Lane wrote: > Note that no matter what the details are, something like this is putting > the onus on the DBA to mark as transmittable only functions that actually > are safe to transmit, ie they exist*and have identical semantics* on the > remote. I think that's fine as long as it's clearly documented. That seems like potentially a lot of extra work. We have the actual function body/definition for all but C functions, perhaps we could automatically map calls when the definitions are identical. I think that could operate safely in addition to manual specification though, so presumably this could be added later. > (Presumably, only immutable functions would get transmitted, even if there > are mutable functions present in a marked extension.) +1 -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > On 6/20/15 12:19 PM, Tom Lane wrote: >> Note that no matter what the details are, something like this is putting >> the onus on the DBA to mark as transmittable only functions that actually >> are safe to transmit, ie they exist*and have identical semantics* on the >> remote. I think that's fine as long as it's clearly documented. > That seems like potentially a lot of extra work. We have the actual > function body/definition for all but C functions, perhaps we could > automatically map calls when the definitions are identical. Huh? No, we don't have that, and even if we did, (1) 95% of the functions of interest *are* C functions. (2) It's probably unsafe ever to transmit non-C functions; there are too many ways in which PL-language functions might be environment sensitive. To take just one example, postgres_fdw runs the remote session with a restricted search_path, which may not be what user-defined functions are expecting. (3) In general, determining the behavior of a function is equivalent to solving the halting problem. SQL-language functions with sufficiently simple bodies might be an exception --- but those probably got inlined into the query before the FDW ever saw them, so it's irrelevant whether we could deduce that they're safe to transmit. It's possible that at some point we'd let the DBA override (2) and mark PL functions as transmittable anyway; but for certain, this will be on an "if you break it you get to keep both pieces" basis. We'd be nuts to do it automatically. regards, tom lane