Обсуждение: Allow any[] as input arguments for sql/plpgsql functions to mimicformat()
Allow any[] as input arguments for sql/plpgsql functions to mimicformat()
Hey everyone, I am writing a plpgsql function that (to greatly simplify) raises an exception with a formatted* message. Ideally, I should be able to call it with raise_exception('The person %I has only %I bananas.', 'Fred', 8), which mimics the format(text, any[]) calling convention. Here is where I have encountered a limitation of PostgreSQL's design: https://www.postgresql.org/docs/11/datatype-pseudo.html mentions explicitly that, "At present most procedural languages forbid use of a pseudo-type as an argument type". My reasoning is that I should be able to accept a value of some type if all I do is passing it to a function that accepts exactly that type, such as format(text, any[]). Given the technical reality, I assume that I wouldn't be able to do anything else with that value, but that is fine, since I don't have to do anything with it regardless. BR Michał "phoe" Herda *I do not want to use the obvious solution of raise_exception(format(...)) because the argument to that function is the error ID that is then looked up in a table from which the error message and sqlstate are retrieved. My full code is in the attached SQL file. Once it is executed: SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural, SELECT gateway_error('user_does_not_exist', 2); -- is natural but doesn't work.
Вложения
Re: Allow any[] as input arguments for sql/plpgsql functions to mimic format()
Hey everyone,
I am writing a plpgsql function that (to greatly simplify) raises an
exception with a formatted* message. Ideally, I should be able to call
it with raise_exception('The person %I has only %I bananas.', 'Fred',
8), which mimics the format(text, any[]) calling convention.
Here is where I have encountered a limitation of PostgreSQL's design:
https://www.postgresql.org/docs/11/datatype-pseudo.html mentions
explicitly that, "At present most procedural languages forbid use of a
pseudo-type as an argument type".
My reasoning is that I should be able to accept a value of some type if
all I do is passing it to a function that accepts exactly that type,
such as format(text, any[]). Given the technical reality, I assume that
I wouldn't be able to do anything else with that value, but that is
fine, since I don't have to do anything with it regardless.
BR
Michał "phoe" Herda
*I do not want to use the obvious solution of
raise_exception(format(...)) because the argument to that function is
the error ID that is then looked up in a table from which the error
message and sqlstate are retrieved. My full code is in the attached SQL
file. Once it is executed:
SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural,
SELECT gateway_error('user_does_not_exist', 2); -- is natural but
doesn't work.
Re: Allow any[] as input arguments for sql/plpgsql functions to mimicformat()
Hey!
OK - thank you for the update and the explanation.
My reasoning in this case is - if we allow the any[] type to only be passed to other functions that accept any[], and disallow any kind of other operations on this array (such as retrieving its elements or modifying it), I do not yet see any places where it might introduce a performance regression. These arguments will literally be pass-only, and since we are unable to interact with them in any other way, there will be no possibility of type mismatches and therefore for performance penalties.
This approach puts all the heavy work on the plpgsql compiler - it will need to ensure that, if there is a any[] or VARIADIC any variable in a function arglist, it must NOT be accessed in any way, and can only be passed to other functions which accept any[] or VARIADIC any.
BR
~phoe
Hipo 22. 4. 2019 v 11:27 odesílatel Michał "phoe" Herda <phoe@disroot.org> napsal:Hey everyone,
I am writing a plpgsql function that (to greatly simplify) raises an
exception with a formatted* message. Ideally, I should be able to call
it with raise_exception('The person %I has only %I bananas.', 'Fred',
8), which mimics the format(text, any[]) calling convention.
Here is where I have encountered a limitation of PostgreSQL's design:
https://www.postgresql.org/docs/11/datatype-pseudo.html mentions
explicitly that, "At present most procedural languages forbid use of a
pseudo-type as an argument type".
My reasoning is that I should be able to accept a value of some type if
all I do is passing it to a function that accepts exactly that type,
such as format(text, any[]). Given the technical reality, I assume that
I wouldn't be able to do anything else with that value, but that is
fine, since I don't have to do anything with it regardless.
BR
Michał "phoe" Herda
*I do not want to use the obvious solution of
raise_exception(format(...)) because the argument to that function is
the error ID that is then looked up in a table from which the error
message and sqlstate are retrieved. My full code is in the attached SQL
file. Once it is executed:
SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural,
SELECT gateway_error('user_does_not_exist', 2); -- is natural but
doesn't work.It is known problem, and fix is not easy.Any expressions inside plpgsql are simple queries like SELECT expr, and they are executed same pipeline like queries.The plans of these queries are stored and reused. Originally these plans disallow any changes, now some changes are supported, but parameters should be same all time. This is ensured by disallowing "any" type.Other polymorphic types are very static, so there is not described risk.Probably some enhancement can be in this are. The plan can be re-planed after some change - but it can has lot of performance impacts. It is long open topic. Some changes in direction to dynamic languages can increase cost of some future optimization to higher performance :-(.RegardsPavel
Re: Allow any[] as input arguments for sql/plpgsql functions to mimic format()
Hey!
OK - thank you for the update and the explanation.
My reasoning in this case is - if we allow the any[] type to only be passed to other functions that accept any[], and disallow any kind of other operations on this array (such as retrieving its elements or modifying it), I do not yet see any places where it might introduce a performance regression. These arguments will literally be pass-only, and since we are unable to interact with them in any other way, there will be no possibility of type mismatches and therefore for performance penalties.
This approach puts all the heavy work on the plpgsql compiler - it will need to ensure that, if there is a any[] or VARIADIC any variable in a function arglist, it must NOT be accessed in any way, and can only be passed to other functions which accept any[] or VARIADIC any.
BR
~phoeOn 22.04.2019 12:09, Pavel Stehule wrote:Hipo 22. 4. 2019 v 11:27 odesílatel Michał "phoe" Herda <phoe@disroot.org> napsal:Hey everyone,
I am writing a plpgsql function that (to greatly simplify) raises an
exception with a formatted* message. Ideally, I should be able to call
it with raise_exception('The person %I has only %I bananas.', 'Fred',
8), which mimics the format(text, any[]) calling convention.
Here is where I have encountered a limitation of PostgreSQL's design:
https://www.postgresql.org/docs/11/datatype-pseudo.html mentions
explicitly that, "At present most procedural languages forbid use of a
pseudo-type as an argument type".
My reasoning is that I should be able to accept a value of some type if
all I do is passing it to a function that accepts exactly that type,
such as format(text, any[]). Given the technical reality, I assume that
I wouldn't be able to do anything else with that value, but that is
fine, since I don't have to do anything with it regardless.
BR
Michał "phoe" Herda
*I do not want to use the obvious solution of
raise_exception(format(...)) because the argument to that function is
the error ID that is then looked up in a table from which the error
message and sqlstate are retrieved. My full code is in the attached SQL
file. Once it is executed:
SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural,
SELECT gateway_error('user_does_not_exist', 2); -- is natural but
doesn't work.It is known problem, and fix is not easy.Any expressions inside plpgsql are simple queries like SELECT expr, and they are executed same pipeline like queries.The plans of these queries are stored and reused. Originally these plans disallow any changes, now some changes are supported, but parameters should be same all time. This is ensured by disallowing "any" type.Other polymorphic types are very static, so there is not described risk.Probably some enhancement can be in this are. The plan can be re-planed after some change - but it can has lot of performance impacts. It is long open topic. Some changes in direction to dynamic languages can increase cost of some future optimization to higher performance :-(.RegardsPavel
=?UTF-8?Q?Micha=c5=82_=22phoe=22_Herda?= <phoe@disroot.org> writes: > My reasoning in this case is - if we allow the any[] type to only be > passed to other functions that accept any[], and disallow any kind of > other operations on this array (such as retrieving its elements or > modifying it), I do not yet see any places where it might introduce a > performance regression. Performance regressions are not the question here --- or at least, there are a lot of other questions to get past first. * plpgsql doesn't have any mechanism for restricting the use of a parameter in the way you suggest. It's not clear if it'd be practical to add one, given the arms-length way in which plpgsql does expression evaluation, and it seems likely that any such thing would be messy and bug-prone. * There's not actually any such type as any[]. There's anyarray, which is not what you're wishing for here because it'd constrain all the actual arguments to be the same type (or at least coercible to the same array element type). This is related to the next point... * format() isn't declared as taking any[]. It's really regression=# \df format List of functions Schema | Name | Result data type | Argument data types | Type ------------+--------+------------------+----------------------+------ pg_catalog | format | text | text | func pg_catalog | format | text | text, VARIADIC "any" | func (2 rows) "VARIADIC any" is a very special hack, because unlike other VARIADIC cases, it doesn't result in collapsing the actual arguments into an array. (Again, it can't because they might not have a common type.) The called function has to have special logic for examining its arguments to find out how many there are and what their types are. format() can do that because it's written in C, but a plpgsql function, not so much. * We could imagine allowing a plpgsql function to be declared "VARIADIC any", and treating it as having N polymorphic arguments of independent types, but then what? plpgsql has no notation for accessing such arguments (they wouldn't have names, to start with), nor for finding out how many there are, and it certainly has no notation for passing the whole group of them on to some other function. I think the closest you'll be able to get here is to declare the plpgsql function as taking "variadic text[]" and then passing the text array to format() with a VARIADIC marker. That should work mostly okay, although calls might need explicit casts to text in some cases. FWIW, it'd likely be easier to get around these problems in plperl or pltcl than in plpgsql, as those are both much less concerned with the exact data types of their arguments than plpgsql, and more accustomed to dealing with functions with variable argument lists. I don't know Python well enough to say whether the same is true of plpython. regards, tom lane