Обсуждение: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

Поиск
Список
Период
Сортировка

BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18077
Logged by:          Jingzhou Fu
Email address:      fuboat@outlook.com
PostgreSQL version: 15.4
Operating system:   Ubuntu 20.04 x64
Description:

PostgreSQL server subprocess crashed by a SELECT statement with WITH clause.
It did not affect the main process. It can be reproduced on PostgreSQL
15.4.

PoC:
```sql
WITH x ( x ) AS ( SELECT ( 1 , 'x' ) ) SELECT FROM x WHERE ( SELECT FROM (
SELECT x ) x WHERE ( SELECT x ( x ) ) )
```

Backtrace of the crashed subprocess:
```
#0 0x957879 (GetRTEByRangeTablePosn+0x209)
#1 0x96ef5c (expandRecordVariable+0x16c)
#2 0x96f2d0 (expandRecordVariable+0x4e0)
#3 0x96f2d0 (expandRecordVariable+0x4e0)
#4 0x9468bc (ParseComplexProjection+0xbc)
#5 0x943823 (ParseFuncOrColumn+0x1123)
#6 0x93866a (transformExprRecurse+0x38ba)
#7 0x934d5b (transformExpr+0x4b)
#8 0x96a439 (transformTargetList+0x519)
#9 0x8c5835 (transformStmt+0x4b45)
#10 0x8c0cb0 (parse_sub_analyze+0xa0)
#11 0x936713 (transformExprRecurse+0x1963)
#12 0x934d5b (transformExpr+0x4b)
#13 0x913509 (transformWhereClause+0x49)
#14 0x8c589a (transformStmt+0x4baa)
#15 0x8c0cb0 (parse_sub_analyze+0xa0)
#16 0x936713 (transformExprRecurse+0x1963)
#17 0x934d5b (transformExpr+0x4b)
#18 0x913509 (transformWhereClause+0x49)
#19 0x8c589a (transformStmt+0x4baa)
#20 0x8bfa85 (parse_analyze_fixedparams+0x305)
#21 0x11c3f00 (exec_simple_query+0xd40)
#22 0x11bdfb4 (PostgresMain+0x2d94)
#23 0xf91d9e (BackendRun+0x7e)
#24 0xf9b7be (ServerLoop+0x20ae)
#25 0xf94094 (PostmasterMain+0x2264)
#26 0xd04462 (main+0x452)
#27 0x7f3ab3637083 (__libc_start_main+0xf3)
#28 0x4a0c4e (_start+0x2e)
```


Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Richard Guo
Дата:

On Wed, Aug 30, 2023 at 4:06 PM PG Bug reporting form <noreply@postgresql.org> wrote:
PostgreSQL server subprocess crashed by a SELECT statement with WITH clause.
It did not affect the main process. It can be reproduced on PostgreSQL
15.4.

PoC:
```sql
WITH x ( x ) AS ( SELECT ( 1 , 'x' ) ) SELECT FROM x WHERE ( SELECT FROM (
SELECT x ) x WHERE ( SELECT x ( x ) ) )
```

Thanks for the report!  Reproduced here on HEAD.  I looked into it a
little bit and it seems that when we expand a Var of type RECORD from a
RTE_SUBQUERY, we mess up with the level of ParseState.  For example,

select * from (SELECT(1, 'a')) as t(c)
WHERE (SELECT * FROM (SELECT c as c1) s
       WHERE (select * from func(c1) f));

When we expand Var 'c1' from func(c1), we figure out that it comes from
subquery 's'.  When we recurse into subquery 's', we just build an
additional level of ParseState atop the current ParseState, which seems
not correct.  Shouldn't we climb up by the nesting depth first before we
build the additional level of ParseState?  Something like

--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1591,6 +1591,12 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
                     */
                    ParseState  mypstate = {0};

+                   for (int i = 0; i < netlevelsup; i++)
+                   {
+                       pstate = pstate->parentParseState;
+                       Assert(pstate != NULL);
+                   }
+
                    mypstate.parentParseState = pstate;
                    mypstate.p_rtable = rte->subquery->rtable;

Thanks
Richard

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Richard Guo
Дата:

On Wed, Aug 30, 2023 at 7:42 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Wed, Aug 30, 2023 at 4:06 PM PG Bug reporting form <noreply@postgresql.org> wrote:
PostgreSQL server subprocess crashed by a SELECT statement with WITH clause.
It did not affect the main process. It can be reproduced on PostgreSQL
15.4.

PoC:
```sql
WITH x ( x ) AS ( SELECT ( 1 , 'x' ) ) SELECT FROM x WHERE ( SELECT FROM (
SELECT x ) x WHERE ( SELECT x ( x ) ) )
```

Thanks for the report!  Reproduced here on HEAD.  I looked into it a
little bit and it seems that when we expand a Var of type RECORD from a
RTE_SUBQUERY, we mess up with the level of ParseState.  For example,

select * from (SELECT(1, 'a')) as t(c)
WHERE (SELECT * FROM (SELECT c as c1) s
       WHERE (select * from func(c1) f));

When we expand Var 'c1' from func(c1), we figure out that it comes from
subquery 's'.  When we recurse into subquery 's', we just build an
additional level of ParseState atop the current ParseState, which seems
not correct.  Shouldn't we climb up by the nesting depth first before we
build the additional level of ParseState?  Something like

--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1591,6 +1591,12 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
                     */
                    ParseState  mypstate = {0};

+                   for (int i = 0; i < netlevelsup; i++)
+                   {
+                       pstate = pstate->parentParseState;
+                       Assert(pstate != NULL);
+                   }
+
                    mypstate.parentParseState = pstate;
                    mypstate.p_rtable = rte->subquery->rtable;

Here is the patch.

Thanks
Richard
Вложения

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Aug 30, 2023 at 7:42 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> When we expand Var 'c1' from func(c1), we figure out that it comes from
>> subquery 's'.  When we recurse into subquery 's', we just build an
>> additional level of ParseState atop the current ParseState, which seems
>> not correct.  Shouldn't we climb up by the nesting depth first before we
>> build the additional level of ParseState?  Something like
>> ...

> Here is the patch.

Yeah, I think your diagnosis is correct.  The existing regression tests
reach this code path, but not with netlevelsup different from zero.
I noted from the code coverage report that the same is true of the
nearby RTE_CTE code path: that does have a loop to crawl up the pstate
stack, but it isn't getting iterated.  The attached improved patch
extends the test case so it also covers that.

I would have liked to also cover the RTE_JOIN case, which the code
coverage report shows to be completely untested.  However, I failed
to make a test case that reached that.  I think it might be a lot
harder to reach in the wake of 9ce77d75c, which narrowed the cases
in which join alias Vars are created.

I also spent a little bit of effort on improving the comments and
removing cosmetic differences between the SUBQUERY and CTE cases.

            regards, tom lane

diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index 57247de363..3bc62ac3ba 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1499,7 +1499,8 @@ ExpandRowReference(ParseState *pstate, Node *expr,
  * drill down to find the ultimate defining expression and attempt to infer
  * the tupdesc from it.  We ereport if we can't determine the tupdesc.
  *
- * levelsup is an extra offset to interpret the Var's varlevelsup correctly.
+ * levelsup is an extra offset to interpret the Var's varlevelsup correctly
+ * when recursing.  Outside callers should pass zero.
  */
 TupleDesc
 expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
@@ -1587,10 +1588,17 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
                     /*
                      * Recurse into the sub-select to see what its Var refers
                      * to.  We have to build an additional level of ParseState
-                     * to keep in step with varlevelsup in the subselect.
+                     * to keep in step with varlevelsup in the subselect;
+                     * furthermore, the subquery RTE might be from an outer
+                     * query level, in which case the ParseState for the
+                     * subselect must have that outer level as parent.
                      */
                     ParseState    mypstate = {0};
+                    Index        levelsup;

+                    /* this loop must work, since GetRTEByRangeTablePosn did */
+                    for (levelsup = 0; levelsup < netlevelsup; levelsup++)
+                        pstate = pstate->parentParseState;
                     mypstate.parentParseState = pstate;
                     mypstate.p_rtable = rte->subquery->rtable;
                     /* don't bother filling the rest of the fake pstate */
@@ -1641,12 +1649,11 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
                      * Recurse into the CTE to see what its Var refers to. We
                      * have to build an additional level of ParseState to keep
                      * in step with varlevelsup in the CTE; furthermore it
-                     * could be an outer CTE.
+                     * could be an outer CTE (compare SUBQUERY case above).
                      */
-                    ParseState    mypstate;
+                    ParseState    mypstate = {0};
                     Index        levelsup;

-                    MemSet(&mypstate, 0, sizeof(mypstate));
                     /* this loop must work, since GetCTEForRTE did */
                     for (levelsup = 0;
                          levelsup < rte->ctelevelsup + netlevelsup;
diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index 981ee0811a..d735a95bdc 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -1240,6 +1240,42 @@ select r, r is null as isnull, r is not null as isnotnull from r;
  (,)         | t      | f
 (6 rows)

+--
+-- Check parsing of indirect references to composite values (bug #18077)
+--
+explain (verbose, costs off)
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+                 QUERY PLAN
+--------------------------------------------
+ CTE Scan on cte
+   Output: cte.c
+   Filter: ((SubPlan 3) IS NOT NULL)
+   CTE cte
+     ->  Result
+           Output: '(1,2)'::record
+   SubPlan 3
+     ->  Result
+           Output: cte.c
+           One-Time Filter: $2
+           InitPlan 2 (returns $2)
+             ->  Result
+                   Output: ((cte.c).f1 > 0)
+(13 rows)
+
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+   c
+-------
+ (1,2)
+(1 row)
+
 --
 -- Tests for component access / FieldSelect
 --
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index 565e6249d5..11bfcdee3a 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -494,6 +494,21 @@ with r(a,b) as materialized
           (null,row(1,2)), (null,row(null,null)), (null,null) )
 select r, r is null as isnull, r is not null as isnotnull from r;

+--
+-- Check parsing of indirect references to composite values (bug #18077)
+--
+explain (verbose, costs off)
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;

 --
 -- Tests for component access / FieldSelect

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Richard Guo
Дата:

On Sat, Sep 2, 2023 at 4:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Aug 30, 2023 at 7:42 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> When we expand Var 'c1' from func(c1), we figure out that it comes from
>> subquery 's'.  When we recurse into subquery 's', we just build an
>> additional level of ParseState atop the current ParseState, which seems
>> not correct.  Shouldn't we climb up by the nesting depth first before we
>> build the additional level of ParseState?  Something like
>> ...

> Here is the patch.

Yeah, I think your diagnosis is correct.  The existing regression tests
reach this code path, but not with netlevelsup different from zero.
I noted from the code coverage report that the same is true of the
nearby RTE_CTE code path: that does have a loop to crawl up the pstate
stack, but it isn't getting iterated.  The attached improved patch
extends the test case so it also covers that.

+1 to the v2 patch.

BTW, do you think get_name_for_var_field() has similar problem for
RTE_SUBQUERY case?  The RTE_CTE code path in that function crawls up the
namespace stack before recursing into the CTE while the RTE_SUBQUERY
code patch does not, which looks like an oversight.  I tried to find a
test case to show it's indeed a problem but with no luck.

Thanks
Richard

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> BTW, do you think get_name_for_var_field() has similar problem for
> RTE_SUBQUERY case?  The RTE_CTE code path in that function crawls up the
> namespace stack before recursing into the CTE while the RTE_SUBQUERY
> code patch does not, which looks like an oversight.

Hmm, seems suspicious ...

> I tried to find a
> test case to show it's indeed a problem but with no luck.

Note that any test case here would be of the form "dump a view
or rule definition", not "EXPLAIN".  What did you try?

            regards, tom lane



Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Richard Guo
Дата:

On Tue, Sep 5, 2023 at 10:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> BTW, do you think get_name_for_var_field() has similar problem for
> RTE_SUBQUERY case?  The RTE_CTE code path in that function crawls up the
> namespace stack before recursing into the CTE while the RTE_SUBQUERY
> code patch does not, which looks like an oversight.

Hmm, seems suspicious ...

> I tried to find a
> test case to show it's indeed a problem but with no luck.

Note that any test case here would be of the form "dump a view
or rule definition", not "EXPLAIN".  What did you try?

Ah, thanks.  I got one of the form "dump a view" leveraging your test
case from the v2 patch (with a minor tweak).

create view composite_v as
with cte(c) as materialized (select row(1, 2)),
     cte2(c) as (select * from cte)
select 1 from cte2 as t
where (select * from (select c as c1) s
       where (select (c1).f1 > 0)) is not null;

select pg_get_viewdef('composite_v', true);
ERROR:  bogus varno: 1

So it is indeed a problem!

Here is v3 patch which is v2 + fix for this issue.

Thanks
Richard
Вложения

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
"Lepikhov Andrei"
Дата:
Hi,

I am writing here just because you change this specific part of code.
Designing a custom node I found the problem with CTE and Subqueries. The reproduction sample looks quite similar to
yours:

create view tt24v as
with cte as materialized (select r from (values(1,2),(3,4)) r)
select (r).column2 as col_a, (rr).column2 as col_b from
  cte join (select rr from (values(1,7),(3,8)) rr limit 2) ss
  on (r).column1 = (rr).column1;
explain (verbose, costs off) select * from tt24v;

but fails with the error "failed to find plan for CTE ..." with a custom node over a JOIN. Doing a trick like in
trick.diffin attachment, I can obtain the next plan: 

 Result
   Output: (cte.r).column2, (ss.rr).column2
   CTE cte
     ->  Values Scan on "*VALUES*_2"
           Output: ROW("*VALUES*_2".column1, "*VALUES*_2".column2)
   ->  Custom Scan (XXX)
         Output: cte.r, ss.rr
         ->  Hash Join
               Output: cte.r, (ROW("*VALUES*".column1, "*VALUES*".column2))
               Hash Cond: ((cte.r).column1 = ((ROW("*VALUES*".column1, "*VALUES*".column2))).column1)
               ->  CTE Scan on cte
                     Output: cte.r
               ->  Hash
                     Output: (ROW("*VALUES*".column1, "*VALUES*".column2))
                     ->  Limit
                           Output: (ROW("*VALUES*".column1, "*VALUES*".column2))
                           ->  Values Scan on "*VALUES*"
                                 Output: ROW("*VALUES*".column1, "*VALUES*".column2)

The result node in attempt to deparse it's targetlist goes into OUTER_VAR - Custom node. After that it goes through the
INDEX_VARref to custom_scan_tlist, finds reference to the RangeTableEntry CTE, empty dpns->inner_plan and throws the
error.

As you can see, the problem here is in wrong assumption: custom_scan_tlist can contain direct references to CTEs and
Subqueriesas well as WorkTableScan or CteScan. 
Maybe to solve this problem too?

--
Regards,
Andrei Lepikhov

On Tue, Sep 5, 2023, at 9:37 AM, Richard Guo wrote:
> On Tue, Sep 5, 2023 at 10:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Richard Guo <guofenglinux@gmail.com> writes:
>> > BTW, do you think get_name_for_var_field() has similar problem for
>> > RTE_SUBQUERY case?  The RTE_CTE code path in that function crawls up the
>> > namespace stack before recursing into the CTE while the RTE_SUBQUERY
>> > code patch does not, which looks like an oversight.
>>
>> Hmm, seems suspicious ...
>>
>> > I tried to find a
>> > test case to show it's indeed a problem but with no luck.
>>
>> Note that any test case here would be of the form "dump a view
>> or rule definition", not "EXPLAIN".  What did you try?
>
> Ah, thanks.  I got one of the form "dump a view" leveraging your test
> case from the v2 patch (with a minor tweak).
>
> create view composite_v as
> with cte(c) as materialized (select row(1, 2)),
>      cte2(c) as (select * from cte)
> select 1 from cte2 as t
> where (select * from (select c as c1) s
>        where (select (c1).f1 > 0)) is not null;
>
> select pg_get_viewdef('composite_v', true);
> ERROR:  bogus varno: 1
>
> So it is indeed a problem!
>
> Here is v3 patch which is v2 + fix for this issue.

Вложения

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Richard Guo
Дата:

On Wed, Sep 6, 2023 at 11:40 AM Lepikhov Andrei <a.lepikhov@postgrespro.ru> wrote:
Hi,

I am writing here just because you change this specific part of code.
Designing a custom node I found the problem with CTE and Subqueries. The reproduction sample looks quite similar to yours:

create view tt24v as
with cte as materialized (select r from (values(1,2),(3,4)) r)
select (r).column2 as col_a, (rr).column2 as col_b from
  cte join (select rr from (values(1,7),(3,8)) rr limit 2) ss
  on (r).column1 = (rr).column1;
explain (verbose, costs off) select * from tt24v;

but fails with the error "failed to find plan for CTE ..." with a custom node over a JOIN. 

The error message indicates that something must have gone wrong.  I
don't know well enough about custom scan, but I cannot reproduce this
error with your query.  Am I missing something?

Thanks
Richard

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
"Lepikhov Andrei"
Дата:

On Wed, Sep 6, 2023, at 1:55 PM, Richard Guo wrote:
> On Wed, Sep 6, 2023 at 11:40 AM Lepikhov Andrei
> <a.lepikhov@postgrespro.ru> wrote:
>> Hi,
>>
>> I am writing here just because you change this specific part of code.
>> Designing a custom node I found the problem with CTE and Subqueries. The reproduction sample looks quite similar to
yours:
>>
>> create view tt24v as
>> with cte as materialized (select r from (values(1,2),(3,4)) r)
>> select (r).column2 as col_a, (rr).column2 as col_b from
>>   cte join (select rr from (values(1,7),(3,8)) rr limit 2) ss
>>   on (r).column1 = (rr).column1;
>> explain (verbose, costs off) select * from tt24v;
>>
>> but fails with the error "failed to find plan for CTE ..." with a custom node over a JOIN.
>
> The error message indicates that something must have gone wrong.  I
> don't know well enough about custom scan, but I cannot reproduce this
> error with your query.  Am I missing something?
Yeah, you should design cusom node to reproduce it. I can't publish my current code, but will try to invent a simple
example.

--
Regards,
Andrei Lepikhov



Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
"Lepikhov Andrei"
Дата:

On Wed, Sep 6, 2023, at 1:55 PM, Richard Guo wrote:
> On Wed, Sep 6, 2023 at 11:40 AM Lepikhov Andrei
> <a.lepikhov@postgrespro.ru> wrote:
>> Hi,
>>
>> I am writing here just because you change this specific part of code.
>> Designing a custom node I found the problem with CTE and Subqueries. The reproduction sample looks quite similar to
yours:
>>
>> create view tt24v as
>> with cte as materialized (select r from (values(1,2),(3,4)) r)
>> select (r).column2 as col_a, (rr).column2 as col_b from
>>   cte join (select rr from (values(1,7),(3,8)) rr limit 2) ss
>>   on (r).column1 = (rr).column1;
>> explain (verbose, costs off) select * from tt24v;
>>
>> but fails with the error "failed to find plan for CTE ..." with a custom node over a JOIN.
>
> The error message indicates that something must have gone wrong.  I
> don't know well enough about custom scan, but I cannot reproduce this
> error with your query.  Am I missing something?
I invented a dummy extension "pg_extension" [1], commit 4199a0c, which adds  CustomScan over the first
non-parameterizedHashJoin at the pathlist. 
The example presented in my letter earlier causes the ERROR on CTE. Moreover, if you remove the word 'materialized',
youwill find the same error on Subquery. 

[1] https://github.com/danolivo/pg_extension/tree/main

--
Regards,
Andrei Lepikhov



Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Michael Paquier
Дата:
On Thu, Sep 07, 2023 at 03:25:36PM +0700, Lepikhov Andrei wrote:
> I invented a dummy extension "pg_extension" [1], commit 4199a0c,
> which adds  CustomScan over the first non-parameterized HashJoin at
> the pathlist.
> The example presented in my letter earlier causes the ERROR on
> CTE. Moreover, if you remove the word 'materialized', you will find
> the same error on Subquery.
>
> [1] https://github.com/danolivo/pg_extension/tree/main

Digressing a bit here about this point..  In the long-term I think
that it would be a good idea to have a template module in
src/test/modules/ that shows how to use a CustomScan so as it is able
to demonstrate how this stuff works, and to check if it is works as
intended.  With regression tests, of course.
--
Michael

Вложения

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Richard Guo
Дата:

On Thu, Sep 7, 2023 at 5:46 PM Michael Paquier <michael@paquier.xyz> wrote:
Digressing a bit here about this point..  In the long-term I think
that it would be a good idea to have a template module in
src/test/modules/ that shows how to use a CustomScan so as it is able
to demonstrate how this stuff works, and to check if it is works as
intended.  With regression tests, of course.

Agreed.  It would be very useful.

Thanks
Richard

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
"Lepikhov Andrei"
Дата:

On Thu, Sep 7, 2023, at 4:46 PM, Michael Paquier wrote:
> On Thu, Sep 07, 2023 at 03:25:36PM +0700, Lepikhov Andrei wrote:
>> I invented a dummy extension "pg_extension" [1], commit 4199a0c,
>> which adds  CustomScan over the first non-parameterized HashJoin at
>> the pathlist. 
>> The example presented in my letter earlier causes the ERROR on
>> CTE. Moreover, if you remove the word 'materialized', you will find
>> the same error on Subquery.
>> 
>> [1] https://github.com/danolivo/pg_extension/tree/main
>
> Digressing a bit here about this point..  In the long-term I think
> that it would be a good idea to have a template module in
> src/test/modules/ that shows how to use a CustomScan so as it is able
> to demonstrate how this stuff works, and to check if it is works as
> intended.  With regression tests, of course.

I agree. I often use it in different situations: for scan, join purposes, as a stat gathering tool and others. Having
sometemplates in the code base would be comfortable. Also, We should remember to add some examples of extensible node
usage...
 
In the attachment - rewritten code of the CustomScan node, as mentioned earlier, as a test module with one regression
test.It shows both CTE and Subquery problems (they have different sources of error). Also, I have attached a patch that
fixesthe problem's symptoms - not the origins of the problem, just for demonstration.
 

-- 
Regards,
Andrei Lepikhov
Вложения

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> Here is v3 patch which is v2 + fix for this issue.

This seems not quite right yet: we need to pass the correct
parent-namespaces list to set_deparse_for_query, else set_rtable_names
might make unexpected choices.  I think the net effect of what you
have would only be to make generated table-alias names more unique
than necessary (i.e, avoiding collisions with names that are not
really in scope), but still this could be confusingly inconsistent.
So we should do more like the attached.

I set about back-patching this, and discovered that your deparse
test case exposes additional problems in the back branches.  We
get "record type has not been registered" failures in deparsing,
or even in trying to parse the view to begin with, unless we
back-patch d57534740 into pre-v14 branches and also 8b7a0f1d1
into pre-v13 branches.  At the time I'd thought d57534740's bug
could not be exposed without SEARCH BREADTH FIRST, but that was
clearly a failure of imagination.  As for 8b7a0f1d1, I'd judged
it too narrow of a corner case to be worth back-patching, and
maybe it still is: I don't think it's reachable without attempting
to fetch a ".fN" field out of an anonymous record type.  Still,
we do document that ".fN" is what the generated names are, so
it seems like people ought to be able to use them.  On balance,
therefore, I'm inclined to back-patch both of those.

Thoughts?

            regards, tom lane

diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index 57247de363..3bc62ac3ba 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1499,7 +1499,8 @@ ExpandRowReference(ParseState *pstate, Node *expr,
  * drill down to find the ultimate defining expression and attempt to infer
  * the tupdesc from it.  We ereport if we can't determine the tupdesc.
  *
- * levelsup is an extra offset to interpret the Var's varlevelsup correctly.
+ * levelsup is an extra offset to interpret the Var's varlevelsup correctly
+ * when recursing.  Outside callers should pass zero.
  */
 TupleDesc
 expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
@@ -1587,10 +1588,17 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
                     /*
                      * Recurse into the sub-select to see what its Var refers
                      * to.  We have to build an additional level of ParseState
-                     * to keep in step with varlevelsup in the subselect.
+                     * to keep in step with varlevelsup in the subselect;
+                     * furthermore, the subquery RTE might be from an outer
+                     * query level, in which case the ParseState for the
+                     * subselect must have that outer level as parent.
                      */
                     ParseState    mypstate = {0};
+                    Index        levelsup;

+                    /* this loop must work, since GetRTEByRangeTablePosn did */
+                    for (levelsup = 0; levelsup < netlevelsup; levelsup++)
+                        pstate = pstate->parentParseState;
                     mypstate.parentParseState = pstate;
                     mypstate.p_rtable = rte->subquery->rtable;
                     /* don't bother filling the rest of the fake pstate */
@@ -1641,12 +1649,11 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
                      * Recurse into the CTE to see what its Var refers to. We
                      * have to build an additional level of ParseState to keep
                      * in step with varlevelsup in the CTE; furthermore it
-                     * could be an outer CTE.
+                     * could be an outer CTE (compare SUBQUERY case above).
                      */
-                    ParseState    mypstate;
+                    ParseState    mypstate = {0};
                     Index        levelsup;

-                    MemSet(&mypstate, 0, sizeof(mypstate));
                     /* this loop must work, since GetCTEForRTE did */
                     for (levelsup = 0;
                          levelsup < rte->ctelevelsup + netlevelsup;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 97b0ef22ac..68f301484e 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -7820,22 +7820,28 @@ get_name_for_var_field(Var *var, int fieldno,
                          * Recurse into the sub-select to see what its Var
                          * refers to. We have to build an additional level of
                          * namespace to keep in step with varlevelsup in the
-                         * subselect.
+                         * subselect; furthermore, the subquery RTE might be
+                         * from an outer query level, in which case the
+                         * namespace for the subselect must have that outer
+                         * level as parent namespace.
                          */
+                        List       *save_nslist = context->namespaces;
+                        List       *parent_namespaces;
                         deparse_namespace mydpns;
                         const char *result;

+                        parent_namespaces = list_copy_tail(context->namespaces,
+                                                           netlevelsup);
+
                         set_deparse_for_query(&mydpns, rte->subquery,
-                                              context->namespaces);
+                                              parent_namespaces);

-                        context->namespaces = lcons(&mydpns,
-                                                    context->namespaces);
+                        context->namespaces = lcons(&mydpns, parent_namespaces);

                         result = get_name_for_var_field((Var *) expr, fieldno,
                                                         0, context);

-                        context->namespaces =
-                            list_delete_first(context->namespaces);
+                        context->namespaces = save_nslist;

                         return result;
                     }
@@ -7927,7 +7933,7 @@ get_name_for_var_field(Var *var, int fieldno,
                                                         attnum);

                     if (ste == NULL || ste->resjunk)
-                        elog(ERROR, "subquery %s does not have attribute %d",
+                        elog(ERROR, "CTE %s does not have attribute %d",
                              rte->eref->aliasname, attnum);
                     expr = (Node *) ste->expr;
                     if (IsA(expr, Var))
@@ -7935,21 +7941,22 @@ get_name_for_var_field(Var *var, int fieldno,
                         /*
                          * Recurse into the CTE to see what its Var refers to.
                          * We have to build an additional level of namespace
-                         * to keep in step with varlevelsup in the CTE.
-                         * Furthermore it could be an outer CTE, so we may
-                         * have to delete some levels of namespace.
+                         * to keep in step with varlevelsup in the CTE;
+                         * furthermore it could be an outer CTE (compare
+                         * SUBQUERY case above).
                          */
                         List       *save_nslist = context->namespaces;
-                        List       *new_nslist;
+                        List       *parent_namespaces;
                         deparse_namespace mydpns;
                         const char *result;

+                        parent_namespaces = list_copy_tail(context->namespaces,
+                                                           ctelevelsup);
+
                         set_deparse_for_query(&mydpns, ctequery,
-                                              context->namespaces);
+                                              parent_namespaces);

-                        new_nslist = list_copy_tail(context->namespaces,
-                                                    ctelevelsup);
-                        context->namespaces = lcons(&mydpns, new_nslist);
+                        context->namespaces = lcons(&mydpns, parent_namespaces);

                         result = get_name_for_var_field((Var *) expr, fieldno,
                                                         0, context);
diff --git a/src/test/regress/expected/rowtypes.out b/src/test/regress/expected/rowtypes.out
index 981ee0811a..8f3c153bac 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -1240,6 +1240,66 @@ select r, r is null as isnull, r is not null as isnotnull from r;
  (,)         | t      | f
 (6 rows)

+--
+-- Check parsing of indirect references to composite values (bug #18077)
+--
+explain (verbose, costs off)
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+                 QUERY PLAN
+--------------------------------------------
+ CTE Scan on cte
+   Output: cte.c
+   Filter: ((SubPlan 3) IS NOT NULL)
+   CTE cte
+     ->  Result
+           Output: '(1,2)'::record
+   SubPlan 3
+     ->  Result
+           Output: cte.c
+           One-Time Filter: $2
+           InitPlan 2 (returns $2)
+             ->  Result
+                   Output: ((cte.c).f1 > 0)
+(13 rows)
+
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+   c
+-------
+ (1,2)
+(1 row)
+
+-- Also check deparsing of such cases
+create view composite_v as
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select 1 as one from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+select pg_get_viewdef('composite_v', true);
+                     pg_get_viewdef
+--------------------------------------------------------
+  WITH cte(c) AS MATERIALIZED (                        +
+          SELECT ROW(1, 2) AS "row"                    +
+         ), cte2(c) AS (                               +
+          SELECT cte.c                                 +
+            FROM cte                                   +
+         )                                             +
+  SELECT 1 AS one                                      +
+    FROM cte2 t                                        +
+   WHERE (( SELECT s.c1                                +
+            FROM ( SELECT t.c AS c1) s                 +
+           WHERE ( SELECT (s.c1).f1 > 0))) IS NOT NULL;
+(1 row)
+
+drop view composite_v;
 --
 -- Tests for component access / FieldSelect
 --
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index 565e6249d5..fd47dc9e0f 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -494,6 +494,31 @@ with r(a,b) as materialized
           (null,row(1,2)), (null,row(null,null)), (null,null) )
 select r, r is null as isnull, r is not null as isnotnull from r;

+--
+-- Check parsing of indirect references to composite values (bug #18077)
+--
+explain (verbose, costs off)
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select * from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+
+-- Also check deparsing of such cases
+create view composite_v as
+with cte(c) as materialized (select row(1, 2)),
+     cte2(c) as (select * from cte)
+select 1 as one from cte2 as t
+where (select * from (select c as c1) s
+       where (select (c1).f1 > 0)) is not null;
+select pg_get_viewdef('composite_v', true);
+drop view composite_v;

 --
 -- Tests for component access / FieldSelect

Re: BUG #18077: PostgreSQL server subprocess crashed by a SELECT statement with WITH clause

От
Richard Guo
Дата:

On Sat, Sep 16, 2023 at 2:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> Here is v3 patch which is v2 + fix for this issue.

This seems not quite right yet: we need to pass the correct
parent-namespaces list to set_deparse_for_query, else set_rtable_names
might make unexpected choices.  I think the net effect of what you
have would only be to make generated table-alias names more unique
than necessary (i.e, avoiding collisions with names that are not
really in scope), but still this could be confusingly inconsistent.
So we should do more like the attached.

Yes, you're right.  And we need to do the same for the RTE_CTE case.
 
I set about back-patching this, and discovered that your deparse
test case exposes additional problems in the back branches.  We
get "record type has not been registered" failures in deparsing,
or even in trying to parse the view to begin with, unless we
back-patch d57534740 into pre-v14 branches and also 8b7a0f1d1
into pre-v13 branches.  At the time I'd thought d57534740's bug
could not be exposed without SEARCH BREADTH FIRST, but that was
clearly a failure of imagination.  As for 8b7a0f1d1, I'd judged
it too narrow of a corner case to be worth back-patching, and
maybe it still is: I don't think it's reachable without attempting
to fetch a ".fN" field out of an anonymous record type.  Still,
we do document that ".fN" is what the generated names are, so
it seems like people ought to be able to use them.  On balance,
therefore, I'm inclined to back-patch both of those.

Agreed.  Thanks for pushing and back-patching this.

Thanks
Richard