Обсуждение: Is a function to a 1-component record type undeclarable?

Поиск
Список
Период
Сортировка

Is a function to a 1-component record type undeclarable?

От
Chapman Flack
Дата:
Hi,

This example in the docs declares a function returning an anonymous
2-component record:

CREATE FUNCTION dup(in int, out f1 int, out f2 text)
    AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$
    LANGUAGE SQL;

The same declaration can be changed to have just one OUT parameter:

CREATE FUNCTION dup(in int, out f text)
    AS $$ SELECT CAST($1 AS text) || ' is text' $$
    LANGUAGE SQL;

But it then behaves as a function returning a text scalar, not a record.
It is distinguishable in the catalog though; it has prorettype text,
proallargtypes {int,text}, proargmodes {i,o}, proargnames {"",f}.

The first declaration can have RETURNS RECORD explicitly added (which
doesn't change its meaning any).

If RETURNS RECORD is added to the second, this error results:

ERROR:  function result type must be text because of OUT parameters

Is that a better outcome than saying "ah, the human has said what he means,
and intends a record type here"? It seems the case could easily be
distinguished in the catalog by storing record as prorettype.

Perhaps more surprisingly, the RETURNS TABLE syntax for the set-returning
case has the same quirk; RETURNS TABLE (f text) behaves as setof text
rather than setof record. Again it's distinguishable in the catalog,
this time with t in place of o in proargmodes.

In this case, clearly the meaning of RETURNS TABLE with one component
can't be changed, as it's already established the way it is, but the
equivalent syntax with one OUT parameter and RETURNS RECORD is currently
rejected with an error just as in the non-SETOF case, so would it not
be equally feasible to just allow that syntax and let it mean what it says?

Regards,
-Chap


In passing, I also noticed RETURNS TABLE () is a syntax error. I have
no use case in mind for a function returning an empty composite result,
but I note that we do allow zero-column tables and empty composite types.
And it still has 1 bit of entropy: you can tell an empty composite value
from a null.



Re: Is a function to a 1-component record type undeclarable?

От
Tom Lane
Дата:
Chapman Flack <chap@anastigmatix.net> writes:
> The same declaration can be changed to have just one OUT parameter:

> CREATE FUNCTION dup(in int, out f text)
>     AS $$ SELECT CAST($1 AS text) || ' is text' $$
>     LANGUAGE SQL;

> But it then behaves as a function returning a text scalar, not a record.

Yup, that's intentional, and documented.  It seems more useful to allow
you to declare a scalar-returning function in this style, if you wish,
than to make it mean a one-component record.  If you really want a
one-component composite type, you can declare such a type explicitly
and return that.

            regards, tom lane



Re: Is a function to a 1-component record type undeclarable?

От
Chapman Flack
Дата:
On 11/22/21 11:15, Tom Lane wrote:
> Yup, that's intentional, and documented.

I think I found where it's documented; nothing under argmode/column_type
/column_name, but just enough under rettype to entail the current behavior.

> It seems more useful to allow you to declare a scalar-returning function
> in this style, if you wish, than to make it mean a one-component record.

Would that usefulness be diminished any by allowing the currently-rejected
explicit RECORD syntax to be accepted and explicitly mean record?

Regards,
-Chap



Re: Is a function to a 1-component record type undeclarable?

От
Pavel Stehule
Дата:
Hi

po 22. 11. 2021 v 17:00 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
Hi,

This example in the docs declares a function returning an anonymous
2-component record:

CREATE FUNCTION dup(in int, out f1 int, out f2 text)
    AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$
    LANGUAGE SQL;

The same declaration can be changed to have just one OUT parameter:

CREATE FUNCTION dup(in int, out f text)
    AS $$ SELECT CAST($1 AS text) || ' is text' $$
    LANGUAGE SQL;

But it then behaves as a function returning a text scalar, not a record.
It is distinguishable in the catalog though; it has prorettype text,
proallargtypes {int,text}, proargmodes {i,o}, proargnames {"",f}.

The first declaration can have RETURNS RECORD explicitly added (which
doesn't change its meaning any).

If RETURNS RECORD is added to the second, this error results:

ERROR:  function result type must be text because of OUT parameters

Is that a better outcome than saying "ah, the human has said what he means,
and intends a record type here"? It seems the case could easily be
distinguished in the catalog by storing record as prorettype.

Perhaps more surprisingly, the RETURNS TABLE syntax for the set-returning
case has the same quirk; RETURNS TABLE (f text) behaves as setof text
rather than setof record. Again it's distinguishable in the catalog,
this time with t in place of o in proargmodes.

In this case, clearly the meaning of RETURNS TABLE with one component
can't be changed, as it's already established the way it is, but the
equivalent syntax with one OUT parameter and RETURNS RECORD is currently
rejected with an error just as in the non-SETOF case, so would it not
be equally feasible to just allow that syntax and let it mean what it says?

I agree so this is not consistent, but changing semantics of RETURNS TABLE after 12 years is not good (and possible), and implemented behaviour has some logic based on the possibility to work with scalars in scalar and tabular contexts in Postgres. In this time lateral join was not implemented and sometimes the tabular functions had to be used in scalar context. Moreover, the overhead of work with composite types had much more impact than now, and there was strong preference to use composite types only when it was really necessary.
 
My opinions about allowing RETURNS RECORD for one column table is neutral. It can be relatively natural syntax for enforcing output composite type. On second hand, usually in usual usage, there is not a big difference if functions return a scalar or one column table (after implicit expansion), and the composite type should be slower. But if you need to make a composite, you can use just the ROW constructor.

postgres=# select row(g) from generate_series(1, 3) g(v);
┌─────┐
│ row │
╞═════╡
│ (1) │
│ (2) │
│ (3) │
└─────┘
(3 rows)

postgres=# select (row(g)).* from generate_series(1, 3) g(v);
┌────┐
│ f1 │
╞════╡
│  1 │
│  2 │
│  3 │
└────┘
(3 rows)

And if we allow RETURNS RECORD, then there will be new inconsistency between OUT variables and RETURNS TABLE, so at the end I don't see stronger benefits than negatives.

Do you have some real use cases, where proposed functionality will carry some benefit?

I remember, when I started with Postgres, and when I started hacking Postgres I had a lot of problems with implicit unpacking of composite types somewhere and with necessity to support scalar and composite values as separate classes. I agree, so this is a real issue, but the beginnings of this issue are 20 years ago, maybe 40 years ago, when composite types were introduced, and when composite types were mapped to SQL tables (when QEL was replaced by SQL). I don't think so now it is possible to fix it.



Regards,
-Chap


In passing, I also noticed RETURNS TABLE () is a syntax error. I have
no use case in mind for a function returning an empty composite result,
but I note that we do allow zero-column tables and empty composite types.
And it still has 1 bit of entropy: you can tell an empty composite value
from a null.

If RETURNS TABLE (1col) returns scalar, then RETURNS TABLE () does not make any sense. The sequence 0 .. empty record, 1 scalar, 2+ composite looks scary too.

Regards

Pavel

Re: Is a function to a 1-component record type undeclarable?

От
Chapman Flack
Дата:
On 11/22/21 11:59, Pavel Stehule wrote:
> And if we allow RETURNS RECORD, then there will be new inconsistency
> between OUT variables and RETURNS TABLE

I don't really see it as a new inconsistency, so much as the same old
inconsistency but with an escape hatch if you really mean the other thing.

I take the consistency you speak of here to be "anything you can say with
OUT parameters is equivalent to something you can say with RETURNS TABLE."
That is tidy, but I don't think it suffers much if it becomes "everything
you can say with RETURNS TABLE is something you can equivalently say with
OUT parameters, but there is one thing for historical reasons you can't say
with RETURNS TABLE, and if you need to say that, with OUT params you can."

> Do you have some real use cases, where proposed functionality will carry
> some benefit?

The most general might be a refactoring situation where you start with
something producing a two-component record and one of those goes away, and
you want to make the minimally invasive changes. Going through containing
queries to add or remove row() or .foo would be more invasive.

I often am coming from the position of a PL maintainer, where my aim
is to present an accurate picture of what is going on in PostgreSQL
to people who are thinking in Java, and to support them with language
constructs that will do what they expect. I happened to notice today
that I am generating SQL that won't succeed if a Java function declares
a one-component record result. Ok, so that's a bug I have to fix, and
document what the real behavior is. Beyond that, if I could also say
"if a one component record is really what you want, then write *this*",
I think that would be good.


It seems like something that would entail a very easy change in the docs.
The paragraph that now says

  When there are OUT or INOUT parameters, the RETURNS clause can be
  omitted. If present, it must agree with the result type implied by
  the output parameters: RECORD if there are multiple output parameters,
  or the same type as the single output parameter.

it could simply say

  When there are OUT or INOUT parameters, the RETURNS clause can be
  omitted. If present, it must agree with the result type implied by
  the output parameters: always RECORD if there are multiple output
  parameters. For exactly one output parameter, there is a choice:
  the same type as the single output parameter (which is the default
  if the clause is omitted), or RECORD if the function should really
  return a composite type with one component.

Regards,
-Chap



Re: Is a function to a 1-component record type undeclarable?

От
Pavel Stehule
Дата:


po 22. 11. 2021 v 18:43 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 11/22/21 11:59, Pavel Stehule wrote:
> And if we allow RETURNS RECORD, then there will be new inconsistency
> between OUT variables and RETURNS TABLE

I don't really see it as a new inconsistency, so much as the same old
inconsistency but with an escape hatch if you really mean the other thing.

I take the consistency you speak of here to be "anything you can say with
OUT parameters is equivalent to something you can say with RETURNS TABLE."
That is tidy, but I don't think it suffers much if it becomes "everything
you can say with RETURNS TABLE is something you can equivalently say with
OUT parameters, but there is one thing for historical reasons you can't say
with RETURNS TABLE, and if you need to say that, with OUT params you can."

> Do you have some real use cases, where proposed functionality will carry
> some benefit?

The most general might be a refactoring situation where you start with
something producing a two-component record and one of those goes away, and
you want to make the minimally invasive changes. Going through containing
queries to add or remove row() or .foo would be more invasive.

I think so on SQL level, the difference should be minimal.


I often am coming from the position of a PL maintainer, where my aim
is to present an accurate picture of what is going on in PostgreSQL
to people who are thinking in Java, and to support them with language
constructs that will do what they expect. I happened to notice today
that I am generating SQL that won't succeed if a Java function declares
a one-component record result. Ok, so that's a bug I have to fix, and
document what the real behavior is. Beyond that, if I could also say
"if a one component record is really what you want, then write *this*",
I think that would be good.

I understand. PLpgSQL does this magic implicitly, so users don't need to solve it, but the complexity of PLpgSQL (some routines) are significantly higher and some features cannot be implemented, because some semantics are ambiguous. Unfortunately, there can be performance differences although the behaviour on SQL level will be the same. And still it is an open question how much slower it is to work with one column composite than with scalar.



It seems like something that would entail a very easy change in the docs.
The paragraph that now says

  When there are OUT or INOUT parameters, the RETURNS clause can be
  omitted. If present, it must agree with the result type implied by
  the output parameters: RECORD if there are multiple output parameters,
  or the same type as the single output parameter.

it could simply say

  When there are OUT or INOUT parameters, the RETURNS clause can be
  omitted. If present, it must agree with the result type implied by
  the output parameters: always RECORD if there are multiple output
  parameters. For exactly one output parameter, there is a choice:
  the same type as the single output parameter (which is the default
  if the clause is omitted), or RECORD if the function should really
  return a composite type with one component.

+1

Regards

Pavel

Regards,
-Chap