Thread: RowDescription for a function does not include table OID

RowDescription for a function does not include table OID

From
Maxwell Dreytser
Date:
Hello,

I am working on a meta-programming use-case where I need to scrape some detailed information about the results of a function that "RETURNS TABLE (LIKE physical_table)", which ends up with prorettype = 'physical_table'::regtype.
The problem is that for the query "SELECT * FROM my_function()" the RowDescription that is sent back shows 0 for Table OID and Column Index.

From Wireshark:
PostgreSQL
    Type: Row description
    Length: 219
    Field count: 7
        Column name: table_id
            Table OID: 0
            Column index: 0
            Type OID: 20
            Column length: 8
            Type modifier: -1
            Format: Binary (1)
<snipped>

I would expect that the Table OID contains the relation OID of this table, as it would do for a typical statement like "SELECT * FROM my_table". It would seem there is a bug here that is preventing PostgreSQL from connecting the dots.

Regards,
Maxwell.
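
For reference, the Wireshark dump above follows the RowDescription ('T') layout from the protocol documentation: a field count, then per field a NUL-terminated name followed by table OID, column attribute number, type OID, type length, type modifier, and format code. Below is a minimal Python sketch of decoding such a message. The names `Field` and `parse_row_description` are hypothetical, the sample bytes are hand-built to mirror the first field of the dump, and UTF-8 is assumed for the column name.

```python
import struct
from typing import List, NamedTuple


class Field(NamedTuple):
    name: str
    table_oid: int      # 0 when the column is not tied to a table
    column_index: int   # attribute number; 0 when not tied to a table
    type_oid: int
    type_len: int
    type_mod: int
    format_code: int    # 0 = text, 1 = binary


def parse_row_description(msg: bytes) -> List[Field]:
    """Parse one complete RowDescription message, including its 'T' type byte."""
    if msg[:1] != b"T":
        raise ValueError("not a RowDescription message")
    # Int32 message length (excludes the type byte), then Int16 field count.
    _length, count = struct.unpack_from("!iH", msg, 1)
    pos = 7  # 1 type byte + 4 length bytes + 2 field-count bytes
    fields = []
    for _ in range(count):
        end = msg.index(b"\x00", pos)  # the column name is NUL-terminated
        name = msg[pos:end].decode("utf-8")  # assumption: UTF-8 client encoding
        # table OID, attribute number, type OID, type length, typmod, format
        fixed = struct.unpack_from("!IhIhih", msg, end + 1)
        fields.append(Field(name, *fixed))
        pos = end + 1 + 18  # 18 bytes of fixed-width data follow each name
    return fields


# Hand-built sample mirroring the first field of the dump above.
body = (
    struct.pack("!H", 1)
    + b"table_id\x00"
    + struct.pack("!IhIhih", 0, 0, 20, 8, -1, 1)
)
sample = b"T" + struct.pack("!i", len(body) + 4) + body
fields = parse_row_description(sample)
print(fields[0])
```

A client can then treat `table_oid == 0` as "no table origin reported" for that column, which is the case being discussed in this thread.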

Re: RowDescription for a function does not include table OID

From
"David G. Johnston"
Date:
On Fri, Jun 21, 2024 at 7:42 AM Maxwell Dreytser <Maxwell.Dreytser@assistek.com> wrote:
> I am working on a meta-programming use-case where I need to scrape some detailed information about the results of a function that "RETURNS TABLE (LIKE physical_table)"

Yes, but the bug is yours.  The definition you want is:  RETURNS SETOF physical_table (not tested though)

What you did was produce a one-column table whose column type is a composite (and whose name is like - what with case-folding of unquoted identifiers).  Since that table doesn't exist anywhere in the catalogs it has no TableOID.

David J.

Re: RowDescription for a function does not include table OID

From
Maxwell Dreytser
Date:
On Friday, June 21, 2024 10:48 AM David G. Johnston <david.g.johnston@gmail.com> wrote:

> Yes, but the bug is yours.  The definition you want is:  RETURNS SETOF physical_table (not tested though)
> What you did was produce a one-column table whose column type is a composite (and whose name is like - what with case-folding of unquoted identifiers).  Since that table doesn't exist anywhere in the catalogs it has no TableOID.

SETOF also does not return correct RowDescription data. Table OID and column number are still both 0.
Both versions have the exact same pg_proc.prorettype. If I join this onto pg_type, the pg_type.typrelid =
'physical_table'::regclass.

Regards,
Maxwell
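
The catalog chain described above (pg_proc.prorettype to pg_type.typrelid, and from there to pg_attribute) can be followed on the client side when RowDescription reports zeros. The following is only a sketch under stated assumptions: the helper name `columns_of_return_type` is hypothetical, the `%s` placeholder assumes a psycopg-style DB-API driver, and the caller supplies an open cursor. Note that typrelid also points at a pg_class entry for standalone composite types, so this does not by itself prove that a table is involved.

```python
from typing import Any, List, Tuple

# Follows pg_proc.prorettype -> pg_type.typrelid -> pg_attribute.
# Assumption: %s is the driver's placeholder style (as in psycopg).
COLUMNS_OF_RETURN_TYPE = """
SELECT t.typrelid, a.attnum, a.attname
FROM pg_proc p
JOIN pg_type t ON t.oid = p.prorettype
JOIN pg_attribute a ON a.attrelid = t.typrelid
WHERE p.oid = %s::regprocedure
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY a.attnum
"""


def columns_of_return_type(cursor: Any, signature: str) -> List[Tuple[int, int, str]]:
    """Return (relation OID, attribute number, column name) for each column
    of the composite type that the function returns.

    `signature` is a regprocedure string such as "my_function()".
    """
    cursor.execute(COLUMNS_OF_RETURN_TYPE, (signature,))
    return [
        (int(relid), int(attnum), str(name))
        for relid, attnum, name in cursor.fetchall()
    ]
```

Matching the returned column names and attribute numbers against the fields of a RowDescription gives the association by hand; it is a client-side workaround and says nothing about what the server itself reports.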


Re: RowDescription for a function does not include table OID

From
"David G. Johnston"
Date:
On Fri, Jun 21, 2024 at 8:04 AM Maxwell Dreytser <Maxwell.Dreytser@assistek.com> wrote:
> On Friday, June 21, 2024 10:48 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
>
>> Yes, but the bug is yours.  The definition you want is:  RETURNS SETOF physical_table (not tested though)
>> What you did was produce a one-column table whose column type is a composite (and whose name is like - what with case-folding of unquoted identifiers).  Since that table doesn't exist anywhere in the catalogs it has no TableOID.
>
> SETOF also does not return correct RowDescription data. Table OID and column number are still both 0.
> Both versions have the exact same pg_proc.prorettype. If I join this onto pg_type, the pg_type.typrelid = 'physical_table'::regclass.


Interesting, then I suppose it is semantics.  There is no table involved - you are referencing the type of that name, not the table - so no TableOID.  There is no guarantee the row you are holding came from a table - and I'd interpret the current behavior as conveying that fact.  Though the current wording: "If the field can be identified as a column of a specific table, the object ID of the table; otherwise zero."; and the observation that at least a human "can identify" a related column, leads one to reasonably infer the system should be able to make such an identification as well.

I would expect you'd be able to find the pg_type.oid value somewhere in the RowDescription given those specifications, but not the pg_type.typrelid value.  But since the header has no allowance for a row type oid this information does seem to be missing.

In short, the system doesn't generate the information you need, where you need it, to tie these pieces together.  Modifying existing elements of the backend protocol is not presently in the cards.

David J.

Re: RowDescription for a function does not include table OID

From
Maxwell Dreytser
Date:
On Friday, June 21, 2024 11:28 AM David G. Johnston <david.g.johnston@gmail.com> wrote:

> Interesting, then I suppose it is semantics.  There is no table involved - you are referencing the type of that name, not the table - so no TableOID.  There is no guarantee the row you are holding came from a table - and I'd interpret the current behavior as conveying that fact.  Though the current wording: "If the field can be identified as a column of a specific table, the object ID of the table; otherwise zero."; and the observation that at least a human "can identify" a related column, leads one to reasonably infer the system should be able to make such an identification as well.

This is exactly my point. If the return type of the function is strongly linked (directly in the function schema) to the table according to pg_catalog, the field can obviously be tied to a specific column of that specific table. The RowDescription not having that value filled in is a violation of that promise.

> In short, the system doesn't generate the information you need, where you need it, to tie these pieces together.  Modifying existing elements of the backend protocol is not presently in the cards.

From my perspective this is clearly a bug as there is no way to define a function in a way that provides enough data to the reader.

Regards,
Maxwell.


Re: RowDescription for a function does not include table OID

From
Tom Lane
Date:
Maxwell Dreytser <Maxwell.Dreytser@assistek.com> writes:
> I am working on a meta-programming use-case where I need to scrape some detailed information about the results of a function that "RETURNS TABLE (LIKE physical_table)", which ends up with prorettype = 'physical_table'::regtype.
> The problem is that for the query "SELECT * FROM my_function()" the RowDescription that is sent back shows 0 for Table OID and Column Index.

Yes, that's expected.  You're selecting from a function, not a table.

> I would expect that the Table OID contains the relation OID of this
> table, as it would do for a typical statement like "SELECT * FROM
> my_table".

The PG wire protocol specification [1] defines these fields thus:

    If the field can be identified as a column of a specific
    table, the object ID of the table; otherwise zero.

    If the field can be identified as a column of a specific
    table, the attribute number of the column; otherwise zero.

My reading of that is that we should populate these fields only for
the case of direct selection from a table.  If you go further than
that, then first off you have a ton of definitional issues (should it
"look through" views, for example?), and second you probably break
applications that are expecting the existing, longstanding definition.

            regards, tom lane

[1] https://www.postgresql.org/docs/current/protocol-message-formats.html



Re: RowDescription for a function does not include table OID

From
"David G. Johnston"
Date:
On Fri, Jun 21, 2024 at 8:41 AM Maxwell Dreytser <Maxwell.Dreytser@assistek.com> wrote:
> On Friday, June 21, 2024 11:28 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
>
>> In short, the system doesn't generate the information you need, where you need it, to tie these pieces together.  Modifying existing elements of the backend protocol is not presently in the cards.
>
> From my perspective this is clearly a bug as there is no way to define a function in a way that provides enough data to the reader.

Quick search turned up this prior thread:


Based upon that unargued point the only bug here is in the documentation, leaving the reader to assume that some effort will be made to chain together a function returns clause to a physical table through that table's automatically-generated composite type.  We don't and never will modify the existing protocol message semantics in that respect.

David J.

Re: RowDescription for a function does not include table OID

From
"David G. Johnston"
Date:
On Fri, Jun 21, 2024 at 8:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> The PG wire protocol specification [1] defines these fields thus:
>
>     If the field can be identified as a column of a specific
>     table, the object ID of the table; otherwise zero.
>
>     If the field can be identified as a column of a specific
>     table, the attribute number of the column; otherwise zero.
>
> My reading of that is that we should populate these fields only for
> the case of direct selection from a table.

s/can be identified as/is/g  ?

Experience shows people are inferring a lot from "can be identified" so we should remove it.  "is" maybe over-simplifies a bit but in the correct direction.

David J.

Re: RowDescription for a function does not include table OID

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Based upon that unargued point the only bug here is in the documentation,
> leaving the reader to assume that some effort will be made to chain
> together a function returns clause to a physical table through that table's
> automatically-generated composite type.

Hmm, I read the documentation as making minimal promises about how
much effort will be expended, not maximal ones.

But in any case, I repeat the point that you can't open this can of
worms without having a lot of definitional slipperiness wriggle out.
Here is an example:

regression=# create table foo(a int, b int);
CREATE TABLE
regression=# create table bar(x int, y int, z int);
CREATE TABLE
regression=# create function f(int) returns setof foo stable
begin atomic select y, z from bar where x = $1; end;
CREATE FUNCTION

What labeling would you expect for "select * from f(...)",
and on what grounds?  It is by no stretch of the imagination a
select from table foo.  Moreover, the system has fully enough
information to perceive the query as a select from bar after
inlining the function call:

regression=# explain verbose select * from f(42);
                         QUERY PLAN                         
------------------------------------------------------------
 Seq Scan on public.bar  (cost=0.00..35.50 rows=10 width=8)
   Output: bar.y, bar.z
   Filter: (bar.x = 42)
(3 rows)

In fact, if we implemented this labeling at the tail end of
planning rather than early in parsing, it'd be fairly hard
to avoid labeling the output columns as bar.* rather than
foo.*.  But we don't, and I'm not seeing an upside to
redefining how that works.

I've long forgotten the alleged JDBC connection that David
mentions, but it's surely just the tip of the iceberg of
client-side code that we could break if we change how this
works.

            regards, tom lane



Re: RowDescription for a function does not include table OID

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Fri, Jun 21, 2024 at 8:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The PG wire protocol specification [1] defines these fields thus:
>>     If the field can be identified as a column of a specific
>>     table, the object ID of the table; otherwise zero.

> s/can be identified as/is/g  ?

> Experience shows people are inferring a lot from "can be identified" so we
> should remove it.  "is" maybe over-simplifies a bit but in the correct
> direction.

I dunno, that seems to me to be just as open to argument if not
more so.  Perhaps some phrasing like "can be directly identified"?

The real point IMV is that it's based purely on parse analysis,
without looking into the behavior of views or functions (which
could change between parsing and execution, anyway).

            regards, tom lane