Обсуждение: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

Поиск
Список
Период
Сортировка

[WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
Hi folks,

Prompted originally by a post by Roman Pekar [1], I wanted to share a revised version of a patch that allows REFCURSOR results to be consumed as data in a regular SQL query as well as my thoughts on how to improve the area as a whole.

In order to be clear about the purpose and how I see it fitting into a broader context, I’ve started a new thread and I’d welcome discussion about it.


Background
----------

The ambition of this contribution is to make PostgreSQL able to efficiently support procedural language functions that either produce or consume sets (or both).

PostgreSQL already has some support for this in functions that return SETOFs. However, as my review [3] identified, there are some gaps in PostgreSQL’s current capability, as well as scope for extension to improve its overall capability.

This first patch addresses only a small part of the overall ambition, but I wanted to share both the patch and the overall ambition as work in progress, and I’d welcome comments on both. (The patch is still based on 12beta2.)


Problems to be solved
---------------------

1. Allow procedural languages (PLs) to efficiently consume sets of records

PostgreSQL does allow PL functions to consume sets, however it does so be feeding records to the function, one row per function invocation. REFCURSORs, however can be supplied as input parameters and their content consumed by the function as it wishes, but only if the PL supports the REFCURSOR concept.

Typically, PLs do allow SQL queries to be executed within the PL function [5, 6, 7]. However REFCURSOR results cannot be effectively consumed in a regular SQL SELECT, significantly limiting their use.


2. Allow plpgsql functions to efficiently return sets of records

By ‘efficiently’, I mean that a large result set should not be required to be staged before the executor is able to process it. Staging is not an issue for small sets, but for large sets and especially if they are subject to further processing, intermediate staging it is a performance disadvantage.

PostgreSQL already has some support for this functions that return SETOFs. At present, plpgsql cannot take advantage of this support while also achieving the efficiency criteria because, as the documentation [4] notes, all returned data is staged before it is retuned.

Addressing this limitation could also of benefit to other PLs, however a quick survey finds at least PL Python is already well-adapted to efficiently return SETOFs.


3. Allow optimisation of a returned query

plpgsql offers a syntactic shortcut to return the results of a SQL query directly. Despite appearing to return a query, the RETURN QUERY syntax actually returns the /results/ of the query [4]. This means the optimiser has no opportunity to influence its execution, such as by pushing down expressions into its WHERE clause, or taking advantage of alternative indexes to modify its sort order.

Other PLs are similarly constrained. Most PLs lack plpgsql’s syntactic sugar, but even though some PLs are better able to efficiently return SETOFs row-by-row, the optimiser cannot see “inside” the query that the PL executes even if its intent is to return the results directly.

Only SQL language functions are afforded the luxury of integration into then outer statement’s plan [8], but even SQL language functions are somewhat constrained in the types of dynamism that are permitted.


4. Allow a set-returning query to be supplied as an input parameter

It is possible to supply a scalar value, or function call that returns a scalar value as an input parameter. And, with problems 1 & 2 addressed, sets can be supplied as input parameters. However, a literal set-returning SQL query cannot be supplied as a parameter (not without PostgreSQL invoking the ‘calling’ function for each row in the set). REFCURSORs cannot be constructed natively from SQL.

A simplistic response would provide a trivial constructor for REFCURSORs, accepting the query as a text parameter. However it is quite unnatural to supply SQL in textual form, more especially to do so safely. So the challenge is to allow a set-returning subquery to be provided as a parameter in literal form.


Why are these problems important?
---------------------------------

My personal wish is for PostgreSQL to offer a feature set that is consistent with itself and without arbitrary limitation. A better argument might be that that it is desirable to match the features of other RDBMSs [9], or for reason of the use cases they address [10] that are new or push the boundaries of what PostgreSQL can do [1], or that are important such as fulfilling a DW/ETL need [11], or to more directly address approaches touted of NoSQL such as Map Reduce [12].


Design and implementation
-------------------------

1. Set returning function (SRF) for REFCURSOR

Tackling first problems 1 and (part of) 2, it seems easy and obvious to allow that REFCURSORs can be consumed in a SELECT query.

PostgreSQL already allows an array to be consumed one record per entry via the UNNEST(anyarray) built-in function [13]. Overloading UNNEST() to accept a REFCURSOR argument can be done easily, and the executor’s SRF machinery allows the result set to be consumed efficiently.

With such an implementation, and given a REFCURSOR-returning function, kv() the following syntax illustrates trivial usage:

SELECT * 
  FROM UNNEST (kv ('A')) 
      AS (key TEXT, val NUMERIC);

With this UNNEST() construct, it is possible to consume a returned REFCURSOR inline in a single SQL statement.

To complete the example, the function kv() might trivially be defined as:

CREATE FUNCTION kv (suffix text) 
  RETURNS REFCURSOR 
  STABLE LANGUAGE plpgsql 
  AS $$ 
DECLARE 
  cur REFCURSOR;
BEGIN 
  OPEN cur FOR EXECUTE 
    'SELECT * FROM kv_table_' || suffix;
  RETURN cur;
END;
$$;

Other obvious example setup includes:

create table kv_table_a (key text, value numeric);
insert into kv_table_a select 'ITEM_A', generate_series (0, 99);

It is also possible to accept a REFCURSOR as an input parameter:

CREATE FUNCTION limit_val (input_refcur text, val_limit numeric) 
  RETURNS REFCURSOR 
  STABLE LANGUAGE plpgsql 
  AS $$ 
DECLARE 
  cur REFCURSOR;
BEGIN 
  OPEN cur FOR SELECT * FROM UNNEST (input_refcur::REFCURSOR) as (key text, value numeric) WHERE value < val_limit;
  RETURN cur;
END;
$$;

SELECT * 
  FROM UNNEST (limit_val (kv ('A')::text, 10))
      AS (key TEXT, val NUMERIC);

Having this construct, it is possible for plpgsql FUNCTION’s to both accept and return REFCURSOR variables. In plpgsql, is would be unnecessary to cast the REFCURSOR to and from text, but other PLs, presumably lacking first class knowledge of the REFCURSOR type, probably need to do so. In above example, limit_val() illustrates how a REFCURSOR can be accepted in text form.

In my patch, I’ve used the SPI APIs to access the Portal which lies behind the REFCURSOR. Although SPI seems to offer an easy interface, and it’s also what plpgsql uses internally, I’m not sure it wouldn’t be better to access the Portal directly.

It is interesting to note that Oracle names its similar construct TABLE() [9], rather than UNNEST(), and in later releases, its use is optional. TABLE is a reserved word and it would be unusual to overload it, although we could educate the parser to treat it specially. Oracle compatibility is an important consideration, but this is a niche area.

If we continue to use REFCURSOR, it is difficult to make some function call-like construct optional because it is already syntactically possible to select FROM a REFCURSOR-returning function. For example, SELECT * FROM kv (‘A’), is a valid and effective expression, despite being of questionable construction and utility.

An alternative might build on top of existing support for returning SETOFs, which already requires no UNNEST()-like construct. This is attractive in principle, but it makes some of the further extensions discussed below more awkward (in my opinion). 


2. Query-bound REFCURSORs

Problem 3 could be addressed by educating the planner in how to extract the query inside the Portal behind the REFCURSOR.

At present, REFCURSORs are only bound to a query once they are OPENed, but when they are OPENed, the query also is fully planned, ready for immediate execution. The ambition is to allow the REFCURSOR’s query to be inlined within the outer query’s plan, so it seems wasteful to expend planner cycles, only for the plan to be thrown away.

The proposed implementation would (1) create an intermediate BOUND state for REFCURSORs, and (2) educate plpgsql about how to BIND unbound_cursorvar FOR query.

My first idea was to modify the REFCURSOR type itself, creating a new state, and adding storage for the BOUND query, but this seems unfeasible without extensive hackery. The REFCURSOR type is a trivial veneer atop a C string (which contains the Portal name), so there is no internal structure to extend.

So my plan is to retain the direct coupling of REFCURSOR<->Portal, and to allow plpgsql to set the query text at BIND time via PortalDefineQuery(). Existing plpgsql code should be unaffected as it need know nothing about the new BOUND state.

In order for any of this to work, the planner has to be able to extract the query from the returned Portal. It seems inline_set_returning_function() is the right place to make this extraction. Adding specific recognition for a function call to UNNEST() with single argument of type REFCURSOR is easy, and existing eval_const_expressions() semantics determine whether the single argument expression can be evaluated at plan time. (Of course, if it cannot, then it falls through to be processed at execution time by the REFCURSOR set returning function (SRF) described above.)

It feels slightly uncomfortable to have UNNEST(REFCURSOR) operate as a regular function, and also have specific recognition for UNNEST() elsewhere in the planner machinery. Arguably, this is already a kind of specific knowledge that inline_set_returning_function() has for SQL language FUNCTIONs, but the recognition I propose for UNNEST(REFCURSOR) is much narrower. An alternative might be to introduce a new type that inline_set_returning_function() can recognise (for example, INLINEABLE_QUERY), or to entirely separate the SRF case from the inlining case at a syntax level (for example, retaining UNNEST(REFCURSOR) as the SRF, but using INLINE(REFCURSOR) for the case at hand).

I’d welcome input here. Although the implementation seems quite feasible, the SQL and plpgsql syntax is less obvious.


3. Literal subquery type

Problem 4 could be addressed by educating the parser specially about the REFCURSOR type when faced with a literal query.

Consider this example:

SELECT * 
  FROM UNNEST (
      limit_val (
        REFCURSOR (
          SELECT key || '_COPY', value FROM kv_table_a
        ), 25)
    ) AS (key TEXT, val NUMERIC);

The REFCURSOR(literal_query) construct could be made to result in a BOUND REFCURSOR, in this case, with SELECT key || '_COPY', value FROM kv_table_a, and then passed as a constant parameter to limit_val().

Usefully, at present, the construct yields a syntax error: although REFCURSOR(string) is an already valid construct (being equivalent to CAST (string AS REFCURSOR)), it’s not acceptable to provide a literal subquery without parenthesis. So, while REFCURSOR ((SELECT 'some_cursor')) is roughly equivalent to CAST ('some_cursor' AS REFCURSOR), the near-alternative of REFCURSOR (SELECT 'some_cursor') is quite simply not valid.

If I’m right, the task is simply a matter of ‘plumbing through’ special knowledge of a REFCURSOR(literal_subquery) construct through the parser. It seems tricky as there are many affected code sites, but the work seems uncomplicated.

Educating the parser about special types seems, again, slightly uncomfortable. An alternative might be to create an intermediate construct: for example, QUERY(literal_subquery) might be made to return the parse tree as a pg_node_tree (similar to how VIEWs are exposed in the system catalogue), and REFCURSOR(pg_node_tree) could consume it, yielding a joined-up construct of REFCURSOR(QUERY(literal_subquery)). However, we might also simply accept REFCURSOR(literal_subquery) to be special, and if/when other need is found for a literal subquery as a parameter, then this becomes the way to supply it.

For point of reference, Oracle seems to have gone the way of making CURSOR(literal_subquery) do something similar, yielding a REF CURSOR, which allows the CURSOR to be passed by reference [2].


Other problems
--------------

1. This contribution does not actually address the limitation in plpgsql, that the “current implementation of RETURN NEXT and RETURN QUERY stores the entire result set before returning from the function” [4]. My original investigation [3] presumed this limitation to apply generally to all PLs, but I now realise this is not the case: at least the Python PL allows efficient return of SETOFs [5]. With this in mind, I see the plpgsql limitation as less encumbering (as plpython is presumably broadly available) but I’d be interested to know if this view is shared.

2. A perhaps more significant problem is the apparent duplication of plpgsql’s RETURN QUERY syntax. One could perhaps conceive that plpgsql supported an additional marker, for example, RETURN [INLINEABLE] QUERY. It is difficult to see this fitting well with other PLs.

3. The current proposal also requires to declare the expected record with an AS (...) construction. This is rather inconvenient, but it is difficult to see how it could be avoided.

4. Other PLs can use REFCURSORS by virtue of REFCURSOR being a thin veneer atop string. It is coherent if one understands how the PostgreSQL type system works, but quite strange otherwise. Better integration into other PLs and their type systems might be important.

5. As mentioned, SETOF is a near-equivalent for some use cases. There’s no way to cast the results of a function RETURNING SETOF to a REFCURSOR without something like REFCURSOR (SELECT * from <some function>(...)). It might be useful to offer a little syntactic sugar. Perhaps we could invent NEST(SETOF <some RECORD type>) could return REFCURSOR.


References
----------


Вложения

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
Roman Pekar
Дата:
Hi John,

Thanks for pushing this, for me it looks like promising start! I need a bit more time to go through the code (and I'm not an expert in Postgres internals in any way) but I really appreciate you doing this.

Roman

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
Hi folks,

I’ve made a revision of this patch. 

The significant change is to access the Portal using Portal APIs rather than through SPI. It seems the persisted state necessary to survive being used to retrieve a row at a time inside an SRF just isn’t a good fit for SPI. 

It turned out there was upstream machinery in the FunctionScan node that prevented Postgres being able to pipeline SRFs, even if they return ValuePerCall. So, in practice, this patch is of limited benefit without another patch that changes that behaviour (see [1]). Nevertheless, the code is independent so I’m submitting the two changes separately. 

I’ll push this into the Jan commit fest.

denty. 


Вложения

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
"Daniel Verite"
Дата:
    Dent John wrote:

> I’ve made a revision of this patch.

Some comments:

* the commitfest app did not extract up the patch from the mail,
possibly because it's buried in the MIME structure of the mail
(using plain text instead of HTML messages might help with that).
The patch has no status in http://commitfest.cputube.org/
probably because of this too.


* unnest-refcursor-v3.patch needs a slight rebase because this chunk
in the Makefile fails
-    regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \
+    refcursor.o regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \


* I'm under the impression that refcursor_unnest() is functionally
equivalent to a plpgsql implementation like this:

create function unnest(x refcursor) returns setof record as $$
declare
 r record;
begin
  loop
    fetch x into r;
    exit when not found;
    return next r;
  end loop;
end $$ language plpgsql;

but it would differ in performance, because internally a materialization step
could be avoided, but only when the other patch "Allow FunctionScans to
pipeline results"  gets in?
Or are performance benefits expected right away with this patch?

*
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -5941,7 +5941,7 @@ array_fill_internal(ArrayType *dims, ArrayType *lbs,


 /*
- * UNNEST
+ * UNNEST (array)
  */

This chunk looks unnecessary?

* some user-facing doc would be needed.

* Is it good to overload "unnest" rather than choosing a specific
function name?


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
> On 9 Jan 2020, at 17:43, Daniel Verite <daniel@manitou-mail.org> wrote:
>
> […]
> (using plain text instead of HTML messages might help with that).

Thanks. I’ll do that next time.

> […]
> * unnest-refcursor-v3.patch needs a slight rebase because this chunk
> in the Makefile fails
> -    regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \
> +    refcursor.o regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \

Likewise I’ll make that rebase in the next version.

> * I'm under the impression that refcursor_unnest() is functionally
> equivalent to a plpgsql implementation like this:
>
> […]
>
> but it would differ in performance, because internally a materialization step
> could be avoided, but only when the other patch "Allow FunctionScans to
> pipeline results"  gets in?

Yes. That’s at least true if unnest(x) is used in the FROM. If it’s used in the SELECT, actually it can get the
performancebenefit right away. However, in the SELECT case, there’s a bit of a gotcha because anonymous records can’t
easilybe manipulated because they have no type information available. So to make a useful performance contribution, it
doesneed to be combined with another change — either to make it FROM pipeline as in my other patch, or perhaps enabling
anonymousrecord types to be cast or otherwise manipulated. 

> […]
> /*
> - * UNNEST
> + * UNNEST (array)
>  */
>
> This chunk looks unnecessary?

It was for purpose of disambiguating. But indeed it is unnecessary. Choosing a different name would avoid need for it.

> * some user-facing doc would be needed.

Indeed. I fully intend that. I figured I’d get the concept on a firmer footing first.

> * Is it good to overload "unnest" rather than choosing a specific
> function name?

Yeah. I wondered about that. A couple of syntactically obvious ideas were:

SELECT … FROM TABLE (x) (which is what I think Oracle does, but is reserved)

SELECT … FROM CURSOR (x) (which seems likely to confuse, but, surprisingly, isn’t actually reserved)

SELECT … FROM FETCH (x) (which I quite like, but is reserved)

SELECT … FROM ROWS_FROM (x) (is okay, but conflicts with our ROWS FROM construct)

So I kind of landed on UNNEST(x) out of lack of better idea. EXPAND(x) could be an option. Actually ROWS_IN(x) or
ROWS_OF(x)might work. 

Do you have any preference or suggestion?

Thanks a lot for the feedback.

denty.


Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
"Daniel Verite"
Дата:
    Dent John wrote:

> Yes. That’s at least true if unnest(x) is used in the FROM. If it’s used in
> the SELECT, actually it can get the performance benefit right away

At a quick glance, I don't see it called in the select-list  in any
of the regression tests. When trying it, it appears to crash (segfault):

postgres=# begin;
BEGIN

postgres=# declare c cursor for select oid::int as i, relname::text as r from
pg_class;
DECLARE CURSOR

postgres=# select unnest('c'::refcursor);
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

The build is configured with:
./configure --enable-debug --with-icu --with-perl --enable-depend


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
> On 10 Jan 2020, at 15:45, Daniel Verite <daniel@manitou-mail.org> wrote:
>
> At a quick glance, I don't see it called in the select-list  in any
> of the regression tests. […]

Yep. I didn’t test it because I figured it wasn’t particularly useful in that context. I’ll add some tests for that too
onceI get to the root of the problem. 

> postgres=# select unnest('c'::refcursor);
> server closed the connection unexpectedly
>     This probably means the server terminated abnormally
>     before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.

Okay. That’s pretty bad, isn’t it.

It’s crashing when it’s checking that the returned tuple matches the declared return type in rsinfo->setDesc. Seems
rsinfo->setDescgets overwritten. So I think I have a memory management problem. 

To be honest, I wasn’t fully sure I’d got a clear understanding of what is in what memory context, but things seemed to
workso I figured it was close. Seems I was wrong. I need a bit of time to review. Leave it with me, but I guess it’ll
taketo next weekend before I get more time. 

denty.


Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
"Daniel Verite"
Дата:
    Dent John wrote:

> It’s crashing when it’s checking that the returned tuple matches the
> declared return type in rsinfo->setDesc. Seems rsinfo->setDesc gets
> overwritten. So I think I have a memory management problem.

What is the expected result anyway? A single column with a "record"
type? FWIW I notice that with plpgsql, this is not allowed to happen:

CREATE FUNCTION cursor_unnest(x refcursor) returns setof record
as $$
declare
 r record;
begin
  loop
    fetch x into r;
    exit when not found;
    return next r;
  end loop;
end $$ language plpgsql;

begin;

declare c cursor for select oid::int as i, relname::text as r from pg_class;

select cursor_unnest('c');

ERROR:    set-valued function called in context that cannot accept a set
CONTEXT:  PL/pgSQL function cursor_unnest(refcursor) line 8 at RETURN NEXT


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
> On 14 Jan 2020, at 14:53, Daniel Verite <daniel@manitou-mail.org> wrote:
>
> What is the expected result anyway? A single column with a "record"
> type? FWIW I notice that with plpgsql, this is not allowed to happen:

Hmm. How interesting.

I had not really investigated what happens in the case of a function returning SETOF (untyped) RECORD in a SELECT
clausebecause, whatever the result, there’s no mechanism to access the individual fields. 

As you highlight, it doesn’t work at all in plpgsql, and plperl is the same.

However, SQL language functions get away with it. For example, inspired by _pg_expandarray():

CREATE OR REPLACE FUNCTION public.my_pg_expandarray(anyarray)
RETURNS SETOF record
LANGUAGE sql
IMMUTABLE PARALLEL SAFE STRICT
AS $function$
    select $1[s], s - pg_catalog.array_lower($1,1) + 1
        from pg_catalog.generate_series(pg_catalog.array_lower($1,1),
            pg_catalog.array_upper($1,1), 1) as g(s)
$function$

postgres=# select my_pg_expandarray (array[0, 1, 2, 3, 4]);
 my_pg_expandarray
-------------------
 (0,1)
 (1,2)
 (2,3)
 (3,4)
 (4,5)
(5 rows)

Back in the FROM clause, it’s possible to manipulate the individual fields:

postgres=# select b, a from my_pg_expandarray (array[0, 1, 2, 3, 4]) as r(a int, b int);
 b | a
---+---
 1 | 0
 2 | 1
 3 | 2
 4 | 3
 5 | 4
(5 rows)

It’s quite interesting. All the other PLs make explicit checks for rsinfo.expectedDesc being non-NULL, but fmgr_sql()
explicitlycalls out the contrary: “[…] note we do not require caller to provide an expectedDesc.” So I guess either
there’ssomething special about the SQL PL, or perhaps the other PLs are just inheriting a pattern of being cautious. 

Either way, though, there’s no way that I can see to "get at” the fields inside the anonymous record that is returned
whenthe function is in the SELECT list. 

But back to the failure, I still need to make it not crash. I guess it doesn’t matter whether I simply refuse to work
ifcalled from the SELECT list, or just return an anonymous record, like fmgr_sql() does. 

d.


Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
> On 11 Jan 2020, at 12:04, Dent John <denty@QQdd.eu> wrote:
>
>> On 10 Jan 2020, at 15:45, Daniel Verite <daniel@manitou-mail.org> wrote:
>>
>> postgres=# select unnest('c'::refcursor);
>> server closed the connection unexpectedly
>>     This probably means the server terminated abnormally
>>     before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>
> Okay. That’s pretty bad, isn’t it.

I’ve addressed the issue, which was due to me allocating the TupleDesc in the multi_call_memory_ctx, which seemed quite
reasonable,but it actually needs to be in ecxt_per_query_memory. It seems tSRF-mode queries are much more sensitive to
themisstep. 

A v4 patch is attached, which also renames UNNEST(REFCURSOR) to ROWS_IN(REFCURSOR), adds a test case for use in tSRF
mode,and makes some minor fixes to the support function. 

I have not yet made steps towards documentation, nor yet rebased, so the Makefile chunk will probably still fail.

denty.

Вложения

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
> On 19 Jan 2020, at 22:30, Dent John <denty@QQdd.eu> wrote:
>
> I have not yet made steps towards documentation, nor yet rebased, so the Makefile chunk will probably still fail.

Attached patch addresses these points, so should now apply cleanly agains dev.

I also changed the OID assigned to ROWS_IN and its support function.

In passing, I noticed there is one existing function that can consume and make good use of ROWS_IN’s result when used
inthe target list, which is row_to_json. This is good, as it makes ROWS_IN useful even outside of a change to allow
resultsin the FROM to be pipelined. I’ve called out row_to_json specifically in the documentation change. 

denty.


Вложения

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
Thomas Munro
Дата:
On Sat, Jan 25, 2020 at 11:59 PM Dent John <denty@qqdd.eu> wrote:
> Attached patch addresses these points, so should now apply cleanly agains dev.

From the trivialities department, I see a bunch of warnings about
local declaration placement (we're still using C90 rules for those by
project policy):

refcursor.c:138:3: error: ISO C90 forbids mixed declarations and code
[-Werror=declaration-after-statement]
   MemoryContext oldcontext =
MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
   ^



Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
> On 18 Feb 2020, at 03:03, Thomas Munro <thomas.munro@gmail.com> wrote:
> 
> From the trivialities department, I see a bunch of warnings about
> local declaration placement (we're still using C90 rules for those by
> project policy):
> 
> refcursor.c:138:3: error: ISO C90 forbids mixed declarations and code
> [-Werror=declaration-after-statement]
>   MemoryContext oldcontext =
> MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
>   ^

Thanks for pointing that out.

I have updated the patch.

denty.


Вложения

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from aREFCURSOR

От
Dent John
Дата:
> On 22 Feb 2020, at 10:38, Dent John <denty@QQdd.eu> wrote:
>
>> On 18 Feb 2020, at 03:03, Thomas Munro <thomas.munro@gmail.com> wrote:
>>
>> From the trivialities department, I see a bunch of warnings about
>> local declaration placement (we're still using C90 rules for those by
>> project policy):
>>
>> […]
>
> […]

My bad. I missed on declaration.

Another patch attached.

d.

Вложения

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
Massimo Fidanza
Дата:
This is an interesting feature, but it seems that the author has abandoned development, what happens now? Will this be postponed from commitfest to commitfest and never be taken over by anyone?

Massimo.

Il giorno ven 6 mar 2020 alle ore 22:36 Dent John <denty@qqdd.eu> ha scritto:
> On 22 Feb 2020, at 10:38, Dent John <denty@QQdd.eu> wrote:
>
>> On 18 Feb 2020, at 03:03, Thomas Munro <thomas.munro@gmail.com> wrote:
>>
>> From the trivialities department, I see a bunch of warnings about
>> local declaration placement (we're still using C90 rules for those by
>> project policy):
>>
>> […]
>
> […]

My bad. I missed on declaration.

Another patch attached.

d.

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
Dent John
Дата:
Hi Massimo,

Thanks for the interest, and my apologies for the late reply.

I’m not particularly abandoning it, but I don’t have particular reason to make further changes at the moment. Far as I’m concerned it works, and the main question is whether it is acceptable and useful.

I’d be happy if you have feedback that evolves it or might push it up the queue for commitfest review.

d.

On 18 Jan 2021, at 23:09, Massimo Fidanza <malix0@gmail.com> wrote:

This is an interesting feature, but it seems that the author has abandoned development, what happens now? Will this be postponed from commitfest to commitfest and never be taken over by anyone?

Massimo.

Il giorno ven 6 mar 2020 alle ore 22:36 Dent John <denty@qqdd.eu> ha scritto:
> On 22 Feb 2020, at 10:38, Dent John <denty@QQdd.eu> wrote:
>
>> On 18 Feb 2020, at 03:03, Thomas Munro <thomas.munro@gmail.com> wrote:
>>
>> From the trivialities department, I see a bunch of warnings about
>> local declaration placement (we're still using C90 rules for those by
>> project policy):
>>
>> […]
>
> […]

My bad. I missed on declaration.

Another patch attached.

d.

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
Massimo Fidanza
Дата:
Hi John,

I never build postgresql from source, so I must get some information on how to apply your patch and do some test. I can't review your code because I know nothing about Postgresql internals and just basic C. I am mainly a PL/SQL programmer, with experience with PHP, Python and Javascript. If I can give some contribution I will be happy, but I need some help.

Massimo

Il giorno dom 7 feb 2021 alle ore 22:35 Dent John <denty@qqdd.co.uk> ha scritto:
Hi Massimo,

Thanks for the interest, and my apologies for the late reply.

I’m not particularly abandoning it, but I don’t have particular reason to make further changes at the moment. Far as I’m concerned it works, and the main question is whether it is acceptable and useful.

I’d be happy if you have feedback that evolves it or might push it up the queue for commitfest review.

d.

On 18 Jan 2021, at 23:09, Massimo Fidanza <malix0@gmail.com> wrote:

This is an interesting feature, but it seems that the author has abandoned development, what happens now? Will this be postponed from commitfest to commitfest and never be taken over by anyone?

Massimo.

Il giorno ven 6 mar 2020 alle ore 22:36 Dent John <denty@qqdd.eu> ha scritto:
> On 22 Feb 2020, at 10:38, Dent John <denty@QQdd.eu> wrote:
>
>> On 18 Feb 2020, at 03:03, Thomas Munro <thomas.munro@gmail.com> wrote:
>>
>> From the trivialities department, I see a bunch of warnings about
>> local declaration placement (we're still using C90 rules for those by
>> project policy):
>>
>> […]
>
> […]

My bad. I missed on declaration.

Another patch attached.

d.

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
Dent John
Дата:
Hi Massimo,

Happy to help. And actually, end user (i.e., developer) feedback on the feature’s usefulness is probably one of the more important contributions.

d.

On 10 Feb 2021, at 08:57, Massimo Fidanza <malix0@gmail.com> wrote:

Hi John,

I never build postgresql from source, so I must get some information on how to apply your patch and do some test. I can't review your code because I know nothing about Postgresql internals and just basic C. I am mainly a PL/SQL programmer, with experience with PHP, Python and Javascript. If I can give some contribution I will be happy, but I need some help.

Massimo

Il giorno dom 7 feb 2021 alle ore 22:35 Dent John <denty@qqdd.co.uk> ha scritto:
Hi Massimo,

Thanks for the interest, and my apologies for the late reply.

I’m not particularly abandoning it, but I don’t have particular reason to make further changes at the moment. Far as I’m concerned it works, and the main question is whether it is acceptable and useful.

I’d be happy if you have feedback that evolves it or might push it up the queue for commitfest review.

d.

On 18 Jan 2021, at 23:09, Massimo Fidanza <malix0@gmail.com> wrote:

This is an interesting feature, but it seems that the author has abandoned development, what happens now? Will this be postponed from commitfest to commitfest and never be taken over by anyone?

Massimo.

Il giorno ven 6 mar 2020 alle ore 22:36 Dent John <denty@qqdd.eu> ha scritto:
> On 22 Feb 2020, at 10:38, Dent John <denty@QQdd.eu> wrote:
>
>> On 18 Feb 2020, at 03:03, Thomas Munro <thomas.munro@gmail.com> wrote:
>>
>> From the trivialities department, I see a bunch of warnings about
>> local declaration placement (we're still using C90 rules for those by
>> project policy):
>>
>> […]
>
> […]

My bad. I missed on declaration.

Another patch attached.

d.


Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
"Daniel Verite"
Дата:
   Hi,

Trying the v7a patch, here are a few comments:

* SIGSEGV with ON HOLD cursors.

Reproducer:

declare c cursor with hold for select oid,relname
  from pg_class order by 1 limit 10;

select * from rows_in('c') as x(f1 oid,f2 name);

consumes a bit of time, then crashes and generates a 13 GB core file
without a usable stacktrace:

Core was generated by `postgres: daniel postgres [local] SELECT      '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f4c5b2f3dc9 in ?? ()
(gdb) bt
#0  0x00007f4c5b2f3dc9 in ?? ()
#1  0x0000564567efc505 in ?? ()
#2  0x0000000000000001 in ?? ()
#3  0x000056456a4b28f8 in ?? ()
#4  0x000056456a4b2908 in ?? ()
#5  0x000056456a4b2774 in ?? ()
#6  0x000056456a4ad218 in ?? ()
#7  0x000056456a4b1590 in ?? ()
#8  0x0000000000000010 in ?? ()
#9  0x0000000000000000 in ?? ()


* rows_in() does not fetch from the current position of the cursor,
but from the start. For instance, I would expect that if doing
FETCH FROM cursor followed by SELECT * FROM rows_in('cursor'), the first
row would be ignored by rows_in(). That seems more convenient and more
principled.


*
+  <para>
+   This section describes functions that cursors to be manipulated
+   in normal <command>SELECT</command> queries.
+  </para>

A verb seems to be missing.
It should be "function that *allow* cursors to be..." or something
like that?

*
+   The <type>REFCURSOR</type> must be open, and the query must be a
+   <command>SELECT</command> statement. If the <type>REFCURSOR</type>’s
+   output does not

After </type> there is a fancy quote (codepoint U+2019). There is
currently no codepoint outside of US-ASCII in *.sgml ref/*.sgml, so
they're probably not welcome.


* Also: does the community wants it as a built-in function in core?
As mentioned in a previous round of review, a function like this in
plpgsql comes close:

create function rows_in(x refcursor) returns setof record as $$
declare
 r record;
begin
  loop
    fetch x into r;
    exit when not found;
    return next r;
  end loop;
end $$ language plpgsql;



Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite



Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

От
Daniel Gustafsson
Дата:
> On 29 Jul 2021, at 16:45, Daniel Verite <daniel@manitou-mail.org> wrote:

> Trying the v7a patch, here are a few comments:

This thread has stalled with no update or response to the above, and the patch
errors out on make check for the plpgsql suite.  I'm marking this Returned with
Feedback, please resubmit an updated patch if you would like to pursue this
further.

--
Daniel Gustafsson        https://vmware.com/