Обсуждение: BUG #17360: array_to_string should be immutable instead of stable

Поиск
Список
Период
Сортировка

BUG #17360: array_to_string should be immutable instead of stable

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17360
Logged by:          Gergely Czuczy
Email address:      gergely.czuczy@harmless.hu
PostgreSQL version: 13.3
Operating system:   FreeBSD13
Description:

Hello,

The array_to_string function should have a volatility of immutable, it
perfectly fits the requirements in the documentation.

Where this comes to play is generated columns, when one wishes to have a
field something like GENERATED ALWAYS AS (array_to_string(ARRAY[selector,
'_domainkey', subdomain], '.')) STORED.

Reading the documentation I haven't found anything that would prevent the
function from being immutable, and enable this kind of usecases.

Best regards,
Gergely


Re: BUG #17360: array_to_string should be immutable instead of stable

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> The array_to_string function should have a volatility of immutable,

Nope.  It invokes an arbitrary datatype I/O function,
which might only be stable.  As an example:

regression=# begin;
BEGIN
regression=*# select array_to_string(array[now()], ',');
        array_to_string        
-------------------------------
 2022-01-10 09:47:06.605304-05
(1 row)

regression=*# set timezone to UTC;
SET
regression=*# select array_to_string(array[now()], ',');
        array_to_string        
-------------------------------
 2022-01-10 14:47:06.605304+00
(1 row)

Actually, in principle somebody could make a datatype
whose output function is volatile.  It's not clear
what the use-case is, though, so we've established a
project policy that code that invokes some arbitrary
I/O function may assume that function is at least stable.
Thus, array_to_string() is marked stable, and likewise
for some other polymorphic functions such as format().

Ideally, perhaps, we'd determine the volatility level of
polymorphic functions like array_to_string() based on the
actual input data types --- but it's hard to see how to do
that.

            regards, tom lane



Re: BUG #17360: array_to_string should be immutable instead of stable

От
Gergely Czuczy
Дата:
Hello,

Thanks for explaining Tom, I wasn't aware these things on the internals.

Wouldn't it possible to add polymorphs for immutable types, like 
func(text[]), func(int[]), etc, mark those immutable, and keep the 
generic for all the other not-explicitly-dealt-with types as stable? 
Again, I don't really know the implementation details here, so it's just 
my tuppence.

Thanks again for explaining it.

Regards,
Gergely

On 2022. 01. 10. 15:54, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> The array_to_string function should have a volatility of immutable,
> Nope.  It invokes an arbitrary datatype I/O function,
> which might only be stable.  As an example:
>
> regression=# begin;
> BEGIN
> regression=*# select array_to_string(array[now()], ',');
>          array_to_string
> -------------------------------
>   2022-01-10 09:47:06.605304-05
> (1 row)
>
> regression=*# set timezone to UTC;
> SET
> regression=*# select array_to_string(array[now()], ',');
>          array_to_string
> -------------------------------
>   2022-01-10 14:47:06.605304+00
> (1 row)
>
> Actually, in principle somebody could make a datatype
> whose output function is volatile.  It's not clear
> what the use-case is, though, so we've established a
> project policy that code that invokes some arbitrary
> I/O function may assume that function is at least stable.
> Thus, array_to_string() is marked stable, and likewise
> for some other polymorphic functions such as format().
>
> Ideally, perhaps, we'd determine the volatility level of
> polymorphic functions like array_to_string() based on the
> actual input data types --- but it's hard to see how to do
> that.
>
>             regards, tom lane



Re: BUG #17360: array_to_string should be immutable instead of stable

От
"David G. Johnston"
Дата:
On Mon, Jan 10, 2022 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> The array_to_string function should have a volatility of immutable,

Nope.  It invokes an arbitrary datatype I/O function,
which might only be stable.  As an example:

regression=# begin;
BEGIN
regression=*# select array_to_string(array[now()], ',');

That feels wrong.  It's not like we are passing the "now()" function to the function and invoking it later.  So far as array_to_string is concerned it is being given a literal value.

By your argument, isfinite(timestamp) should be stable but it is immutable (it's the first example I stumbled across, I suspect there are quite a few more I could contrive).

select isfinite(now());

 
        array_to_string       
-------------------------------
 2022-01-10 09:47:06.605304-05
(1 row)

regression=*# set timezone to UTC;
SET
regression=*# select array_to_string(array[now()], ',');
        array_to_string       
-------------------------------
 2022-01-10 14:47:06.605304+00
(1 row)

[...]
Ideally, perhaps, we'd determine the volatility level of
polymorphic functions like array_to_string() based on the
actual input data types --- but it's hard to see how to do
that.

In short, that doesn't make sense.  The volatility level of a function is only determined by the implementation code of said function.  The function invoking expression volatility level depends upon the most volatile behavior of all functions used in the expression.  That we should be doing if we aren't already.

David J.

Re: BUG #17360: array_to_string should be immutable instead of stable

От
"David G. Johnston"
Дата:


On Mon, Jan 10, 2022, 08:54 David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Jan 10, 2022 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> The array_to_string function should have a volatility of immutable,

Nope.  It invokes an arbitrary datatype I/O function,
which might only be stable.  As an example:

regression=# begin;
BEGIN
regression=*# select array_to_string(array[now()], ',');

That feels wrong.  It's not like we are passing the "now()" function to the function and invoking it later.  So far as array_to_string is concerned it is being given a literal value.


Nevermind.  It's the internal timestampt to text cast I'm forgetting.

David k


Re: BUG #17360: array_to_string should be immutable instead of stable

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Mon, Jan 10, 2022 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Nope.  It invokes an arbitrary datatype I/O function,
>> which might only be stable.  As an example:

> That feels wrong.  It's not like we are passing the "now()" function to the
> function and invoking it later.  So far as array_to_string is concerned it
> is being given a literal value.

now() is not the problem; it's the datatype output function that's
the problem.  I was perhaps being too thrifty with keystrokes in
my example, so here's another one:

regression=# show timezone;
     TimeZone
------------------
 America/New_York
(1 row)

regression=# select '{2022-01-10 00:00-05}'::timestamptz[];
        timestamptz
----------------------------
 {"2022-01-10 00:00:00-05"}
(1 row)

regression=# select array_to_string('{2022-01-10 00:00-05}'::timestamptz[], ',');
    array_to_string
------------------------
 2022-01-10 00:00:00-05
(1 row)

regression=# set timezone = UTC;
SET
regression=# select '{2022-01-10 00:00-05}'::timestamptz[];
        timestamptz
----------------------------
 {"2022-01-10 05:00:00+00"}
(1 row)

regression=# select array_to_string('{2022-01-10 00:00-05}'::timestamptz[], ',');
    array_to_string
------------------------
 2022-01-10 05:00:00+00
(1 row)

Now do you see the issue?  The input datum is identical in all four
queries, but the resulting strings are not, so these functions
cannot be considered immutable.

> In short, that doesn't make sense.  The volatility level of a function is
> only determined by the implementation code of said function.

array_to_string is invoking timestamptz_out along the way to
creating its result.  Although array_to_string's own behavior
is immutable, timestamptz_out's is not.

> The function
> invoking expression volatility level depends upon the most volatile
> behavior of all functions used in the expression.  That we should be doing
> if we aren't already.

The core of the difficulty is that although timestamptz_out
is getting called, that's nowhere visible in the parse tree.
I suppose we could decide that it's illegal to allow
array_to_string() or format() to exist, but I don't think
anybody will like that answer.

I did just have a thought about this though --- now that we've
invented planner support functions [1], maybe we could define
a support function request that is "tell me the true volatility
of this function call".  Then array_to_string() could have a
support function that looks at the output function for its
input array's element type, and format()'s could determine the
most volatile of the output functions of any of its inputs, etc.

            regards, tom lane

[1] https://www.postgresql.org/docs/devel/xfunc-optimization.html



Re: BUG #17360: array_to_string should be immutable instead of stable

От
"David G. Johnston"
Дата:
On Mon, Jan 10, 2022 at 9:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The core of the difficulty is that although timestamptz_out
is getting called, that's nowhere visible in the parse tree.
I suppose we could decide that it's illegal to allow
array_to_string() or format() to exist, but I don't think
anybody will like that answer.

Yeah.  The user expectation (and arguably architectural ideal) that the output representation of data types is immutable being violated is the bigger issue here but one that we are basically stuck with.  As you note below, however, since real complaints mostly manifest in polymorphic situations, coming up with a solution at that scope, one that makes function volatility a planning time property instead of a create function time one, makes the most sense.  Since the function is being defined for planning time type resolution, extending that to volatility is internally consistent.

David J.