Обсуждение: Allow any[] as input arguments for sql/plpgsql functions to mimicformat()

Поиск
Список
Период
Сортировка

Allow any[] as input arguments for sql/plpgsql functions to mimicformat()

От
Michał "phoe" Herda
Дата:
Hey everyone,

I am writing a plpgsql function that (to greatly simplify) raises an
exception with a formatted* message. Ideally, I should be able to call
it with raise_exception('The person %I has only %I bananas.', 'Fred',
8), which mimics the format(text, any[]) calling convention.

Here is where I have encountered a limitation of PostgreSQL's design:
https://www.postgresql.org/docs/11/datatype-pseudo.html mentions
explicitly that, "At present most procedural languages forbid use of a
pseudo-type as an argument type".

My reasoning is that I should be able to accept a value of some type if
all I do is passing it to a function that accepts exactly that type,
such as format(text, any[]). Given the technical reality, I assume that
I wouldn't be able to do anything else with that value, but that is
fine, since I don't have to do anything with it regardless.

BR
Michał "phoe" Herda

*I do not want to use the obvious solution of
raise_exception(format(...)) because the argument to that function is
the error ID that is then looked up in a table from which the error
message and sqlstate are retrieved. My full code is in the attached SQL
file. Once it is executed:

SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural,
SELECT gateway_error('user_does_not_exist', 2); -- is natural but
doesn't work.


Вложения

Re: Allow any[] as input arguments for sql/plpgsql functions to mimic format()

От
Pavel Stehule
Дата:
Hi

po 22. 4. 2019 v 11:27 odesílatel Michał "phoe" Herda <phoe@disroot.org> napsal:
Hey everyone,

I am writing a plpgsql function that (to greatly simplify) raises an
exception with a formatted* message. Ideally, I should be able to call
it with raise_exception('The person %I has only %I bananas.', 'Fred',
8), which mimics the format(text, any[]) calling convention.

Here is where I have encountered a limitation of PostgreSQL's design:
https://www.postgresql.org/docs/11/datatype-pseudo.html mentions
explicitly that, "At present most procedural languages forbid use of a
pseudo-type as an argument type".

My reasoning is that I should be able to accept a value of some type if
all I do is passing it to a function that accepts exactly that type,
such as format(text, any[]). Given the technical reality, I assume that
I wouldn't be able to do anything else with that value, but that is
fine, since I don't have to do anything with it regardless.

BR
Michał "phoe" Herda

*I do not want to use the obvious solution of
raise_exception(format(...)) because the argument to that function is
the error ID that is then looked up in a table from which the error
message and sqlstate are retrieved. My full code is in the attached SQL
file. Once it is executed:

SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural,
SELECT gateway_error('user_does_not_exist', 2); -- is natural but
doesn't work.

It is known problem, and fix is not easy.

Any expressions inside plpgsql are simple queries like SELECT expr, and they are executed same pipeline like queries.

The plans of these queries are stored and reused. Originally these plans disallow any changes, now some changes are supported, but parameters should be same all time. This is ensured by disallowing "any" type.

Other polymorphic types are very static, so there is not described risk.

Probably some enhancement can be in this are. The plan can be re-planed after some change - but it can has lot of performance impacts. It is long open topic. Some changes in direction to dynamic languages can increase cost of  some future optimization to higher performance :-(.

Regards

Pavel



 

Re: Allow any[] as input arguments for sql/plpgsql functions to mimicformat()

От
Michał "phoe" Herda
Дата:

Hey!

OK - thank you for the update and the explanation.

My reasoning in this case is - if we allow the any[] type to only be passed to other functions that accept any[], and disallow any kind of other operations on this array (such as retrieving its elements or modifying it), I do not yet see any places where it might introduce a performance regression. These arguments will literally be pass-only, and since we are unable to interact with them in any other way, there will be no possibility of type mismatches and therefore for performance penalties.

This approach puts all the heavy work on the plpgsql compiler - it will need to ensure that, if there is a any[] or VARIADIC any variable in a function arglist, it must NOT be accessed in any way, and can only be passed to other functions which accept any[] or VARIADIC any.

BR
~phoe

On 22.04.2019 12:09, Pavel Stehule wrote:
Hi

po 22. 4. 2019 v 11:27 odesílatel Michał "phoe" Herda <phoe@disroot.org> napsal:
Hey everyone,

I am writing a plpgsql function that (to greatly simplify) raises an
exception with a formatted* message. Ideally, I should be able to call
it with raise_exception('The person %I has only %I bananas.', 'Fred',
8), which mimics the format(text, any[]) calling convention.

Here is where I have encountered a limitation of PostgreSQL's design:
https://www.postgresql.org/docs/11/datatype-pseudo.html mentions
explicitly that, "At present most procedural languages forbid use of a
pseudo-type as an argument type".

My reasoning is that I should be able to accept a value of some type if
all I do is passing it to a function that accepts exactly that type,
such as format(text, any[]). Given the technical reality, I assume that
I wouldn't be able to do anything else with that value, but that is
fine, since I don't have to do anything with it regardless.

BR
Michał "phoe" Herda

*I do not want to use the obvious solution of
raise_exception(format(...)) because the argument to that function is
the error ID that is then looked up in a table from which the error
message and sqlstate are retrieved. My full code is in the attached SQL
file. Once it is executed:

SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural,
SELECT gateway_error('user_does_not_exist', 2); -- is natural but
doesn't work.

It is known problem, and fix is not easy.

Any expressions inside plpgsql are simple queries like SELECT expr, and they are executed same pipeline like queries.

The plans of these queries are stored and reused. Originally these plans disallow any changes, now some changes are supported, but parameters should be same all time. This is ensured by disallowing "any" type.

Other polymorphic types are very static, so there is not described risk.

Probably some enhancement can be in this are. The plan can be re-planed after some change - but it can has lot of performance impacts. It is long open topic. Some changes in direction to dynamic languages can increase cost of  some future optimization to higher performance :-(.

Regards

Pavel



 

Re: Allow any[] as input arguments for sql/plpgsql functions to mimic format()

От
Pavel Stehule
Дата:
Hi

po 22. 4. 2019 v 19:20 odesílatel Michał "phoe" Herda <phoe@disroot.org> napsal:

Hey!

OK - thank you for the update and the explanation.

My reasoning in this case is - if we allow the any[] type to only be passed to other functions that accept any[], and disallow any kind of other operations on this array (such as retrieving its elements or modifying it), I do not yet see any places where it might introduce a performance regression. These arguments will literally be pass-only, and since we are unable to interact with them in any other way, there will be no possibility of type mismatches and therefore for performance penalties.

This approach puts all the heavy work on the plpgsql compiler - it will need to ensure that, if there is a any[] or VARIADIC any variable in a function arglist, it must NOT be accessed in any way, and can only be passed to other functions which accept any[] or VARIADIC any.

PLpgSQL compiler knows nothing about a expressions - the compiler process only plpgsql statements. Expressions are processed at runtime only by SQL parser and executor.


you can see there, so plpgsql is very different from other compilers. It just glue of SQL expressions or queries, that are black box for PLpgSQL compiler and executor.

Just any[] is not plpgsql way. For your case you should to use a overloading

create or replace function fx(fmt text, par text)
returns void as $$
begin
  raise notice '%', format(fmt, par);
end;
$$ language plpgsql;

create or replace function fx(fmt text, par numeric)
returns void as $$
begin
  raise notice '%', format(fmt, par);
end;
$$ language plpgsql;

There is another limit, you cannot to declare function parameter type that enforce explicit casting

can be nice (but it is strange idea) to have some other flags for arguments

CREATE OR REPLACE FUNCTION gateway_error(fmt text, par text FORCE EXPLICIT CAST)
...

Regards

Pavel
 

BR
~phoe

On 22.04.2019 12:09, Pavel Stehule wrote:
Hi

po 22. 4. 2019 v 11:27 odesílatel Michał "phoe" Herda <phoe@disroot.org> napsal:
Hey everyone,

I am writing a plpgsql function that (to greatly simplify) raises an
exception with a formatted* message. Ideally, I should be able to call
it with raise_exception('The person %I has only %I bananas.', 'Fred',
8), which mimics the format(text, any[]) calling convention.

Here is where I have encountered a limitation of PostgreSQL's design:
https://www.postgresql.org/docs/11/datatype-pseudo.html mentions
explicitly that, "At present most procedural languages forbid use of a
pseudo-type as an argument type".

My reasoning is that I should be able to accept a value of some type if
all I do is passing it to a function that accepts exactly that type,
such as format(text, any[]). Given the technical reality, I assume that
I wouldn't be able to do anything else with that value, but that is
fine, since I don't have to do anything with it regardless.

BR
Michał "phoe" Herda

*I do not want to use the obvious solution of
raise_exception(format(...)) because the argument to that function is
the error ID that is then looked up in a table from which the error
message and sqlstate are retrieved. My full code is in the attached SQL
file. Once it is executed:

SELECT gateway_error('user_does_not_exist', '2'); -- works but is unnatural,
SELECT gateway_error('user_does_not_exist', 2); -- is natural but
doesn't work.

It is known problem, and fix is not easy.

Any expressions inside plpgsql are simple queries like SELECT expr, and they are executed same pipeline like queries.

The plans of these queries are stored and reused. Originally these plans disallow any changes, now some changes are supported, but parameters should be same all time. This is ensured by disallowing "any" type.

Other polymorphic types are very static, so there is not described risk.

Probably some enhancement can be in this are. The plan can be re-planed after some change - but it can has lot of performance impacts. It is long open topic. Some changes in direction to dynamic languages can increase cost of  some future optimization to higher performance :-(.

Regards

Pavel



 

Re: Allow any[] as input arguments for sql/plpgsql functions to mimic format()

От
Tom Lane
Дата:
=?UTF-8?Q?Micha=c5=82_=22phoe=22_Herda?= <phoe@disroot.org> writes:
> My reasoning in this case is - if we allow the any[] type to only be
> passed to other functions that accept any[], and disallow any kind of
> other operations on this array (such as retrieving its elements or
> modifying it), I do not yet see any places where it might introduce a
> performance regression.

Performance regressions are not the question here --- or at least, there
are a lot of other questions to get past first.

* plpgsql doesn't have any mechanism for restricting the use of a
parameter in the way you suggest.  It's not clear if it'd be practical
to add one, given the arms-length way in which plpgsql does expression
evaluation, and it seems likely that any such thing would be messy
and bug-prone.

* There's not actually any such type as any[].  There's anyarray,
which is not what you're wishing for here because it'd constrain
all the actual arguments to be the same type (or at least coercible
to the same array element type).  This is related to the next point...

* format() isn't declared as taking any[].  It's really

regression=# \df format
                          List of functions
   Schema   |  Name  | Result data type | Argument data types  | Type 
------------+--------+------------------+----------------------+------
 pg_catalog | format | text             | text                 | func
 pg_catalog | format | text             | text, VARIADIC "any" | func
(2 rows)

"VARIADIC any" is a very special hack, because unlike other VARIADIC
cases, it doesn't result in collapsing the actual arguments into an
array.  (Again, it can't because they might not have a common type.)
The called function has to have special logic for examining its
arguments to find out how many there are and what their types are.
format() can do that because it's written in C, but a plpgsql function,
not so much.

* We could imagine allowing a plpgsql function to be declared
"VARIADIC any", and treating it as having N polymorphic arguments of
independent types, but then what?  plpgsql has no notation for
accessing such arguments (they wouldn't have names, to start with),
nor for finding out how many there are, and it certainly has no
notation for passing the whole group of them on to some other function.


I think the closest you'll be able to get here is to declare the
plpgsql function as taking "variadic text[]" and then passing the
text array to format() with a VARIADIC marker.  That should work
mostly okay, although calls might need explicit casts to text
in some cases.

FWIW, it'd likely be easier to get around these problems in plperl
or pltcl than in plpgsql, as those are both much less concerned
with the exact data types of their arguments than plpgsql, and
more accustomed to dealing with functions with variable argument
lists.  I don't know Python well enough to say whether the same
is true of plpython.

            regards, tom lane