Discussion: Handling changes to default type transformations in PLs

Handling changes to default type transformations in PLs

From
Jim Nasby
Date:
Some of our PLs have the unfortunate problem of making a weak effort 
with converting types to and from the PL and Postgres. For example, 
plpythonu will correctly transform a complex type to a dict and an array 
to a list, but it punts back to text for an array contained inside a 
complex type. I know plTCL has similar issues; presumably other PLs are 
affected as well.

While it's a SMOC to fix this, what's not simple is the backwards 
compatibility: users that are currently using these types are expecting 
to be handed strings created by the type's output function, so we can't 
just drop these changes in without breaking user code.
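
To make the compatibility hazard concrete, here is a minimal plain-Python sketch (not actual PL/Python glue code). It assumes a composite value arrives as a dict whose array field is still the text produced by the array output function, such as '{1,2,3}'. The names parse_int_array and sum_items are hypothetical, and the parser is deliberately naive.

```python
def parse_int_array(text):
    """Parse a one-dimensional integer array literal such as '{1,2,3}'.

    Deliberately naive: no NULLs, quoting, or nested dimensions.
    """
    inner = text.strip()[1:-1]  # drop the surrounding braces
    return [int(x) for x in inner.split(",")] if inner else []


def sum_items(row):
    """User code written against the assumed current behaviour:
    row['items'] is the array's text form, not a list."""
    return sum(parse_int_array(row["items"]))


# Assumed current behaviour: the nested array is handed over as a string.
old_style_row = {"id": 1, "items": "{1,2,3}"}
print(sum_items(old_style_row))

# After a hypothetical fix the same field would be a real list, and the
# string handling above no longer works.
new_style_row = {"id": 1, "items": [1, 2, 3]}
try:
    sum_items(new_style_row)
except AttributeError as exc:
    print("breaks on list input:", exc)
```

Existing functions that parse the text form themselves are exactly the user code that would break if the conversion changed silently.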

It might be possible to work around this with TRANSFORMs, but that's 
just ugly: first you'd have to write a bunch of non-trivial C code, then 
you'd need to forever remember to specify TRANSFORM FOR TYPE blah.

Some ways to handle this:

1) Use a PL-specific GUC for each case where we need backwards 
compatibility. For plpython we'd need 2. plTCL would need 1 or 2.

2) Use a single all-or-nothing GUC. Downside is that if we later decide 
to expand automatic conversion again we'd need yet another GUC.

3) Add the concept of PL API versions. This would allow users to specify 
what range of API versions they support. I think this would have been 
helpful with the plpython elog() patch.

4) Create a mechanism for specifying default TRANSFORMs for a PL, and 
essentially "solve" these issues by supplying a built-in transform.

I think default transforms (4) are worth doing no matter what. Having to 
manually remember to add potentially multiple TRANSFORMs is a PITA. But 
I'm not sure TRANSFORMS would actually fix all issues. For example, you 
can't specify a transform for an array type, so this probably wouldn't 
work for one of the plpython problems.

3 is interesting, but maybe it would be bad to tie multiple unrelated 
API changes together.

So I'm leaning towards 1. It means potentially adding a fair number of 
new GUCs, but these would all be custom GUCs, so maybe it's not that 
bad. The other downside to GUCs is I think it'd be nice to be able to 
set this at a schema level, which you can't currently do with GUCs.

Thoughts?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Handling changes to default type transformations in PLs

From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> Some of our PLs have the unfortunate problem of making a weak effort 
> with converting types to and from the PL and Postgres. For example, 
> plpythonu will correctly transform a complex type to a dict and an array 
> to a list, but it punts back to text for an array contained inside a 
> complex type. I know plTCL has similar issues; presumably other PLs are 
> affected as well.

> While it's a SMOC to fix this, what's not simple is the backwards 
> compatibility: users that are currently using these types are expecting 
> to be handed strings created by the type's output function, so we can't 
> just drop these changes in without breaking user code.

> It might be possible to work around this with TRANSFORMs, but that's 
> just ugly: first you'd have to write a bunch of non-trivial C code, then 
> you'd need to forever remember to specify TRANSFORM FOR TYPE blah.

> Some ways to handle this:

> 1) Use a PL-specific GUC for each case where we need backwards 
> compatibility. For plpython we'd need 2. plTCL would need 1 or 2.

> 2) Use a single all-or-nothing GUC. Downside is that if we later decide 
> to expand automatic conversion again we'd need yet another GUC.

> 3) Add the concept of PL API versions. This would allow users to specify 
> what range of API versions they support. I think this would have been 
> helpful with the plpython elog() patch.

> 4) Create a mechanism for specifying default TRANSFORMs for a PL, and 
> essentially "solve" these issues by supplying a built-in transform.

> I think default transforms (4) are worth doing no matter what. Having to 
> manually remember to add potentially multiple TRANSFORMs is a PITA. But 
> I'm not sure TRANSFORMS would actually fix all issues. For example, you 
> can't specify a transform for an array type, so this probably wouldn't 
> work for one of the plpython problems.

> 3 is interesting, but maybe it would be bad to tie multiple unrelated 
> API changes together.

> So I'm leaning towards 1. It means potentially adding a fair number of 
> new GUCs, but these would all be custom GUCs, so maybe it's not that 
> bad. The other downside to GUCs is I think it'd be nice to be able to 
> set this at a schema level, which you can't currently do with GUCs.

> Thoughts?

I think harsh experience has taught us to distrust GUCs that change
code semantics.  So I'm not very attracted by option #1, much less
option #2.  I'm not sure about option #4 --- it smells like it would
have the same problems as a GUC, namely that it would be
action-at-a-distance on the semantics of a PL function's arguments,
with insufficient ability to control the scope of the effects.

So that leaves #3, which doesn't seem all that unreasonable from here.
We don't have a problem with bundling a bunch of unrelated changes
into any one major PG revision.  The scripting languages we're talking
about calling do similar things.  So why not for the semantics of the
glue layer?

It seems like you really need to be able to specify this at the
per-function level, which makes me think that specifying
"LANGUAGE plpython_2" or "LANGUAGE plperl_3" might be the right
kind of API.
        regards, tom lane
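
As a rough illustration of per-function versioning, the plain-Python sketch below picks a conversion rule from the language name a function was declared with. It is not how any PL is implemented: "plpython_2" is taken from the message above, while "plpython_1", the converter functions, and call_function are hypothetical names, and the legacy rule (nested arrays rendered as text) is an assumption.

```python
def to_python_v1(row):
    """Assumed legacy rule: arrays nested in a composite stay as text."""
    converted = {}
    for key, value in row.items():
        if isinstance(value, list):
            converted[key] = "{" + ",".join(str(v) for v in value) + "}"
        else:
            converted[key] = value
    return converted


def to_python_v2(row):
    """Assumed newer rule: nested arrays are passed through as lists."""
    return dict(row)


# One converter per declared language name; the choice is fixed per function.
CONVERTERS = {
    "plpython_1": to_python_v1,  # hypothetical name for the old semantics
    "plpython_2": to_python_v2,
}


def call_function(language, func, row):
    """Convert the argument according to the function's declared language."""
    convert = CONVERTERS[language]
    return func(convert(row))


row = {"id": 1, "items": [1, 2, 3]}
print(call_function("plpython_1", lambda r: r["items"], row))
print(call_function("plpython_2", lambda r: r["items"], row))
```

Because the version is attached to the function rather than to the session, an old function keeps seeing text while a new one next to it sees lists.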



Re: Handling changes to default type transformations in PLs

From
Pavel Stehule
Date:
> > 3) Add the concept of PL API versions. This would allow users to specify
>
> So that leaves #3, which doesn't seem all that unreasonable from here.
> We don't have a problem with bundling a bunch of unrelated changes
> into any one major PG revision.  The scripting languages we're talking
> about calling do similar things.  So why not for the semantics of the
> glue layer?
>
> It seems like you really need to be able to specify this at the
> per-function level, which makes me think that specifying
> "LANGUAGE plpython_2" or "LANGUAGE plperl_3" might be the right
> kind of API.

I am not a big fan of this proposal. Users usually want to choose only some preferred features, and this design may be too coarse-grained for that.

Objections:

* The keyword REVISION is usually used for this, so the syntax could be LANGUAGE plpython REVISION 3, which is more readable. You would need to specify the preferred revision for any language, and the revision is persistent. The behavior is the same as in Tom's proposal, but I hope this form is easier to read and understand.

regards

Pavel


 


Re: Handling changes to default type transformations in PLs

From
Pavel Stehule
Date:




> 4) Create a mechanism for specifying default TRANSFORMs for a PL, and
> essentially "solve" these issues by supplying a built-in transform.
>
> I think default transforms (4) are worth doing no matter what. Having to
> manually remember to add potentially multiple TRANSFORMs is a PITA. But
> I'm not sure TRANSFORMS would actually fix all issues. For example, you
> can't specify a transform for an array type, so this probably wouldn't
> work for one of the plpython problems.



Yesterday I wrote some documentation about TRANSFORMs, and this is not possible. The transformation cannot be transparent to a PL function, because inside the function you have to know what your parameters are and whether they are original or transformed. This is not an issue for "smart" object types, but in general it cannot be ensured for basic types. So a default transformation is not a good idea. You can have a transform for the C language, and it is really different from the one for Python.

Maybe the concept of TRANSFORMs is too general, and some specific PL extensions would be better. That needs some concept of persistent options that can be used in these extensions.

CREATE OR REPLACE FUNCTION .... LANGUAGE plpython WITH OPTIONS ('transform_dictionary', ...)

Execution should be stopped if any option is not processed.
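
The "stop if any option is not processed" rule could be sketched in plain Python as below. Only the option name transform_dictionary comes from the example above. The process_options helper, the handler table, and the unknown option name are hypothetical, and nothing here reflects an existing PostgreSQL interface.

```python
def process_options(requested, supported):
    """Return the handlers for the requested options.

    Raises ValueError if any requested option has no handler, so a
    function never runs with an option silently ignored.
    """
    unknown = [name for name in requested if name not in supported]
    if unknown:
        raise ValueError("unprocessed function options: " + ", ".join(unknown))
    return [supported[name] for name in requested]


# Hypothetical handler table for one PL; each handler converts a value.
supported = {"transform_dictionary": lambda value: dict(value)}

handlers = process_options(["transform_dictionary"], supported)
print(handlers[0]([("a", 1)]))

try:
    process_options(["transform_dictionary", "no_such_option"], supported)
except ValueError as exc:
    print(exc)
```

Failing loudly on an unknown option keeps the options persistent and explicit per function, which is the property the proposal relies on.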

With these extensions you can do anything, and it can work (I hope). But the complexity of our PLs will be significantly higher, and then Tom's proposal may be better. It can help with faster adoption of new features, and it is a relatively simple solution.

Regards

Pavel