Обсуждение: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

Поиск
Список
Период
Сортировка

BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17800
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 15.2
Operating system:   Ubuntu 22.04
Description:

When executing the following queries:
CREATE TABLE t (a INT PRIMARY KEY, b TEXT) PARTITION BY LIST (a);

CREATE TABLE tp1 (b TEXT, a INT PRIMARY KEY);
ALTER TABLE t ATTACH PARTITION tp1 FOR VALUES IN (1);

CREATE TABLE tp2 (a INT PRIMARY KEY, b TEXT);
ALTER TABLE t ATTACH PARTITION tp2 FOR VALUES IN (2);

INSERT INTO t VALUES (1), (2);

INSERT INTO t VALUES (1), (2) ON CONFLICT(a)
  DO UPDATE SET (a, b) = (SELECT t.a, t.b || '+');

I get the server crashed with the coredump:
Core was generated by `postgres: law regression [local] INSERT
                        '.
Program terminated with signal SIGSEGV, Segmentation fault.

warning: Section `.reg-xstate/1121242' in core file too small.
#0  0x000056173f49d91d in pg_detoast_datum_packed (datum=0x2) at
fmgr.c:1853
1853            if (VARATT_IS_COMPRESSED(datum) ||
VARATT_IS_EXTERNAL(datum))
(gdb) bt
#0  0x000056173f49d91d in pg_detoast_datum_packed (datum=0x2) at
fmgr.c:1853
#1  0x000056173f44bb3c in textcat (fcinfo=0x561740c0a4d8) at varlena.c:750
#2  0x000056173efebd6e in ExecInterpExpr (state=0x561740c09ff0,
econtext=0x561740c09d18, 
    isnull=0x7ffc5f595d2f) at execExprInterp.c:752
#3  0x000056173f04a86a in ExecEvalExprSwitchContext (state=0x561740c09ff0,
econtext=0x561740c09d18, 
    isNull=0x7ffc5f595d2f) at ../../../src/include/executor/executor.h:344
#4  0x000056173f04a8e2 in ExecProject (projInfo=0x561740c09fe8)
    at ../../../src/include/executor/executor.h:378
#5  0x000056173f04ab19 in ExecResult (pstate=0x561740c09c08) at
nodeResult.c:136
#6  0x000056173f04e40d in ExecProcNode (node=0x561740c09c08) at
../../../src/include/executor/executor.h:262
#7  0x000056173f0508fe in ExecSetParamPlan (node=0x561740c112c0,
econtext=0x561740c0f8c8)
    at nodeSubplan.c:1133
#8  0x000056173efef225 in ExecEvalParamExec (state=0x561740c0fa80,
op=0x561740c0fb48, econtext=0x561740c0f8c8)
    at execExprInterp.c:2428
#9  0x000056173efec5e8 in ExecInterpExpr (state=0x561740c0fa80,
econtext=0x561740c0f8c8, 
    isnull=0x7ffc5f5961cf) at execExprInterp.c:1065
#10 0x000056173efee179 in ExecInterpExprStillValid (state=0x561740c0fa80,
econtext=0x561740c0f8c8, 
    isNull=0x7ffc5f5961cf) at execExprInterp.c:1838
#11 0x000056173f040b4a in ExecEvalExprSwitchContext (state=0x561740c0fa80,
econtext=0x561740c0f8c8, 
    isNull=0x7ffc5f5961cf) at ../../../src/include/executor/executor.h:344
#12 0x000056173f040bc2 in ExecProject (projInfo=0x561740c0fa78)
    at ../../../src/include/executor/executor.h:378
#13 0x000056173f044def in ExecOnConflictUpdate (context=0x7ffc5f596420,
resultRelInfo=0x561740c2c418, 
    conflictTid=0x7ffc5f5962e2, excludedSlot=0x561740c0b138, canSetTag=true,
returning=0x7ffc5f5962e8)
    at nodeModifyTable.c:2659
#14 0x000056173f0426d8 in ExecInsert (context=0x7ffc5f596420,
resultRelInfo=0x561740c2c418, 
    slot=0x561740c0b138, canSetTag=true, inserted_tuple=0x0,
insert_destrel=0x0) at nodeModifyTable.c:1059
#15 0x000056173f046dc6 in ExecModifyTable (pstate=0x561740c0a578) at
nodeModifyTable.c:3810
#16 0x000056173f004dd3 in ExecProcNodeFirst (node=0x561740c0a578) at
execProcnode.c:464
#17 0x000056173eff7ff6 in ExecProcNode (node=0x561740c0a578) at
../../../src/include/executor/executor.h:262
#18 0x000056173effae2b in ExecutePlan (estate=0x561740c09938,
planstate=0x561740c0a578, 
    use_parallel_mode=false, operation=CMD_INSERT, sendTuples=false,
numberTuples=0, 
    direction=ForwardScanDirection, dest=0x561740c0f310, execute_once=true)
at execMain.c:1633
#19 0x000056173eff86de in standard_ExecutorRun (queryDesc=0x561740c12c78,
direction=ForwardScanDirection, 
    count=0, execute_once=true) at execMain.c:364
#20 0x000056173eff84e7 in ExecutorRun (queryDesc=0x561740c12c78,
direction=ForwardScanDirection, count=0, 
    execute_once=true) at execMain.c:308
#21 0x000056173f2aa948 in ProcessQuery (plan=0x561740c20158, 
    sourceText=0x561740b1a178 "INSERT INTO t VALUES (1), (2) ON
CONFLICT(a)\n  DO UPDATE SET (a, b) = (SELECT t.a, t.b || '+');",
params=0x0, queryEnv=0x0, dest=0x561740c0f310, qc=0x7ffc5f596880) at
pquery.c:160
#22 0x000056173f2ac4ec in PortalRunMulti (portal=0x561740b926c8,
isTopLevel=true, setHoldSnapshot=false, 
    dest=0x561740c0f310, altdest=0x561740c0f310, qc=0x7ffc5f596880) at
pquery.c:1277
#23 0x000056173f2ab9e0 in PortalRun (portal=0x561740b926c8,
count=9223372036854775807, isTopLevel=true, 
    run_once=true, dest=0x561740c0f310, altdest=0x561740c0f310,
qc=0x7ffc5f596880) at pquery.c:791
#24 0x000056173f2a4743 in exec_simple_query (
    query_string=0x561740b1a178 "INSERT INTO t VALUES (1), (2) ON
CONFLICT(a)\n  DO UPDATE SET (a, b) = (SELECT t.a, t.b || '+');") at
postgres.c:1237
#25 0x000056173f2a9754 in PostgresMain (dbname=0x561740b52578 "regression",
username=0x561740b17708 "law")
    at postgres.c:4565
#26 0x000056173f1cd4bb in BackendRun (port=0x561740b42b10) at
postmaster.c:4461
#27 0x000056173f1ccd47 in BackendStartup (port=0x561740b42b10) at
postmaster.c:4189
#28 0x000056173f1c908c in ServerLoop () at postmaster.c:1779
#29 0x000056173f1c8936 in PostmasterMain (argc=3, argv=0x561740b15640) at
postmaster.c:1463
#30 0x000056173f082cfa in main (argc=3, argv=0x561740b15640) at main.c:200

But when executing:
UPDATE t SET (a, b) = (SELECT t.a, t.b || '+');
I get:
ERROR:  attribute 1 of type tp1 has wrong type
DETAIL:  Table has type text, but query expects integer.

By comparing two callstacks I can see that in the second case
ExecInterpExprStillValid() is executed after the latest
ExecEvalExprSwitchContext().
The ExecInterpExprStillValid() function contains:
        /* skip the check during further executions */
        state->evalfunc = (ExprStateEvalFunc) state->evalfunc_private;

If just call evalfunc_private() here, the first case ends with the error as
expected.

Observed on REL_11_STABLE..master.


Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Michael Paquier
Дата:
On Sat, Feb 18, 2023 at 08:00:00AM +0000, PG Bug reporting form wrote:
> INSERT INTO t VALUES (1), (2) ON CONFLICT(a)
>   DO UPDATE SET (a, b) = (SELECT t.a, t.b || '+');
>
> But when executing:
> UPDATE t SET (a, b) = (SELECT t.a, t.b || '+');
> I get:
> ERROR:  attribute 1 of type tp1 has wrong type
> DETAIL:  Table has type text, but query expects integer.

Reproduced here.

> By comparing two callstacks I can see that in the second case
> ExecInterpExprStillValid() is executed after the latest
> ExecEvalExprSwitchContext().
> The ExecInterpExprStillValid() function contains:
>         /* skip the check during further executions */
>         state->evalfunc = (ExprStateEvalFunc) state->evalfunc_private;
>
> If just call evalfunc_private() here, the first case ends with the error as
> expected.

Yeah, it would sound logic to me to have consistency with the
ExecEvalExprSwitchContext() checks here, so it seems like the executor
has missed the call for a long time.  Would you like to write a patch,
perhaps?  Did you bisect the origin of that?
--
Michael

Вложения
Michael Paquier <michael@paquier.xyz> writes:
> On Sat, Feb 18, 2023 at 08:00:00AM +0000, PG Bug reporting form wrote:
>> But when executing:
>> UPDATE t SET (a, b) = (SELECT t.a, t.b || '+');
>> I get:
>> ERROR:  attribute 1 of type tp1 has wrong type
>> DETAIL:  Table has type text, but query expects integer.

> Reproduced here.

I think there may be two different bugs here.  The coredump in the
ON CONFLICT case goes back to v11 for me (and v10 doesn't support
the primary-key-on-partition case at all, so this error may be
aboriginal to the feature).  But I only see this "wrong type" failure
in v14 and later.  I didn't try bisecting yet.

            regards, tom lane



I wrote:
> I think there may be two different bugs here.  The coredump in the
> ON CONFLICT case goes back to v11 for me (and v10 doesn't support
> the primary-key-on-partition case at all, so this error may be
> aboriginal to the feature).  But I only see this "wrong type" failure
> in v14 and later.  I didn't try bisecting yet.

Bisecting produced some interesting results: the "wrong type"
failure has existed for a pretty long time on HEAD.  The reason
I don't see it on v13 etc is that commits 3f7323cbb et al fixed
it in those branches.  That's probably accidental, but I'm too
tired to poke at it more tonight.

            regards, tom lane



I wrote:
> Bisecting produced some interesting results: the "wrong type"
> failure has existed for a pretty long time on HEAD.  The reason
> I don't see it on v13 etc is that commits 3f7323cbb et al fixed
> it in those branches.  That's probably accidental, but I'm too
> tired to poke at it more tonight.

Sigh ... no, it's not accidental: this is exactly the same problem
3f7323cbb fixed, but popping up in two new places.

In the first place, the test cases we had that led to 3f7323cbb
were using traditionally-inherited tables, in which an update
of this sort leads to a plan like

regression=# create table ti (a int, b text);
CREATE TABLE
regression=# create table tichild() inherits(ti);
CREATE TABLE
regression=# explain (verbose, costs off) UPDATE ti SET (a, b) = (SELECT ti.a, ti.b || '+');
                                QUERY PLAN
---------------------------------------------------------------------------
 Update on public.ti
   Update on public.ti ti_1
   Update on public.tichild ti_2
   ->  Result
         Output: $2, $3, (SubPlan 1 (returns $2,$3)), ti.tableoid, ti.ctid
         ->  Append
               ->  Seq Scan on public.ti ti_1
                     Output: ti_1.a, ti_1.b, ti_1.tableoid, ti_1.ctid
               ->  Seq Scan on public.tichild ti_2
                     Output: ti_2.a, ti_2.b, ti_2.tableoid, ti_2.ctid
         SubPlan 1 (returns $2,$3)
           ->  Result
                 Output: ti.a, (ti.b || '+'::text)
(13 rows)

But if the target is a partitioned table, as in Alexander's example:

regression=# explain (verbose, costs off) UPDATE t SET (a, b) = (SELECT t.a, t.b || '+');
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Update on public.t
   Update on public.tp1 t_1
   Update on public.tp2 t_2
   ->  Append
         ->  Seq Scan on public.tp1 t_1
               Output: $2, $3, (SubPlan 1 (returns $2,$3)), t_1.tableoid, t_1.ctid
               SubPlan 1 (returns $2,$3)
                 ->  Result
                       Output: t_1.a, (t_1.b || '+'::text)
         ->  Seq Scan on public.tp2 t_2
               Output: $2, $3, (SubPlan 1 (returns $2,$3)), t_2.tableoid, t_2.ctid
(11 rows)

I'm not real sure why this discrepancy exists, or whether it's a good
idea.  But at any rate, the reason why I thought 3f7323cbb would only
be needed in the back branches is that I thought we had eliminated all
cases of making multiple clones of an UPDATE's target list when we
nuked inheritance_planner.  But here we have multiple clones of a
MULTIEXPR_SUBLINK SubPlan, and they're all sharing the same output
parameters ($2 and $3, here), and so the confusion about which args list
has to be executed during a recomputation comes right back.

If that were the only problem then I'd seriously think about fixing it
by disallowing this sort of push-down of the UPDATE targetlist when
MULTIEXPR_SUBLINKs are present.  However, the reason we're seeing a
comparable problem in ON CONFLICT is that ExecInitPartitionInfo *also*
makes clones of an UPDATE targetlist, and it is far too naive to fix
this problem.

Not sure about a good fix.  We could imagine porting v13's
SS_make_multiexprs_unique logic forward into the newer branches,
but that depends on a lot of planner infrastructure that won't
be available when ExecInitPartitionInfo runs.  I think in the
back branches we might be forced into disallowing MULTIEXPR_SUBLINKs
in ON CONFLICT targetlists, because supporting them properly is
looking like a mess.

At this point I'm seriously regretting the entire MULTIEXPR_SUBLINK
design, and am wondering if we could find a different solution for
that.  That couldn't lead to a back-patchable fix, obviously.

Another angle is that having ExecInitPartitionInfo doing this sort
of work is a big misallocation of responsibility.  If these tlists
were getting built in the planner it'd be far easier to fix.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-21 07:49:25 +0900, Michael Paquier wrote:
> On Sat, Feb 18, 2023 at 08:00:00AM +0000, PG Bug reporting form wrote:
> > By comparing two callstacks I can see that in the second case
> > ExecInterpExprStillValid() is executed after the latest
> > ExecEvalExprSwitchContext().
> > The ExecInterpExprStillValid() function contains:
> >         /* skip the check during further executions */
> >         state->evalfunc = (ExprStateEvalFunc) state->evalfunc_private;
> > 
> > If just call evalfunc_private() here, the first case ends with the error as
> > expected.
> 
> Yeah, it would sound logic to me to have consistency with the
> ExecEvalExprSwitchContext() checks here, so it seems like the executor
> has missed the call for a long time.  Would you like to write a patch,
> perhaps?  Did you bisect the origin of that?

What inconsistency / missed call? We use ExecInterpExprStillValid() on the
first execution, but not on later executions. I don't see cases where we omit
calls to ExecInterpExprStillValid().

Afaict this is a problem of a wrongly generated target list, which isn't what
ExecInterpExprStillValid() guards against:
    /*
     * First time through, check whether attribute matches Var.  Might not be
     * ok anymore, due to schema changes. We do that by setting up a callback
     * that does checking on the first call, which then sets the evalfunc
     * callback to the actual method of execution.
     */
    state->evalfunc = ExecInterpExprStillValid;

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> Afaict this is a problem of a wrongly generated target list, which isn't what
> ExecInterpExprStillValid() guards against:

The targetlists are okay, really.  The core problem is that each
targetlist has an instance of the MULTIEXPR_SUBLINK SubPlan with a
differently-mutated "args" list, and it looks to me like we correctly
mutated that for the associated child table.  But because all the
instances share the same output Params, the ExecSetParamPlan mechanism
gets confused about which version of the SubPlan it ought to invoke
to recompute the output Params.

It occurs to me that one possible fix is to make MULTIEXPR_SUBLINK
and the associated output Params use a separate ParamExecData array;
instead of the query-wide es_param_exec_vals array, use one that
is local to the specific targetlist's ExprState.  I'm not sure how
much violence that does to the current notion of an ExprState ---
do we think that is read-only during execution?

If we did have a local-to-the-expression ParamExecData array, maybe that
could be used to get a cleaner fix for things like the domain VALUES
and case-test-expression hacks.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-21 12:27:54 -0500, Tom Lane wrote:
> If that were the only problem then I'd seriously think about fixing it
> by disallowing this sort of push-down of the UPDATE targetlist when
> MULTIEXPR_SUBLINKs are present.

For a moment I was wondering whether we could do the opposite. Force the
subplan references to be below the Append node. But leaving aside that that
doesn't look trivial, it wouldn't do anything for the insert [oc] case.


> However, the reason we're seeing a comparable problem in ON CONFLICT is that
> ExecInitPartitionInfo *also* makes clones of an UPDATE targetlist, and it is
> far too naive to fix this problem.
> 
> Not sure about a good fix.  We could imagine porting v13's
> SS_make_multiexprs_unique logic forward into the newer branches,
> but that depends on a lot of planner infrastructure that won't
> be available when ExecInitPartitionInfo runs.

And we really can't change the plan shape much at that point, given that we're
deferring ExecInitPartitionInfo to happen as late as possible.


> I think in the
> back branches we might be forced into disallowing MULTIEXPR_SUBLINKs
> in ON CONFLICT targetlists, because supporting them properly is
> looking like a mess.

Are you thinking of doing that in general, or only for partitioned tables? I'm
worried that this would break a lot of working queries.

Perhaps we could restrict cases of erroring out to when there's a mismatch in
column order between the partitioned table and the partition? IIUC we'll,
somewhat accidentally, do the right thing if the column order matches?

I continue to maintain that it was a seriously bad idea to support this level
of difference between partitioned table and partitions. It adds ugliness and
inefficiencies all over.


> Another angle is that having ExecInitPartitionInfo doing this sort
> of work is a big misallocation of responsibility.  If these tlists
> were getting built in the planner it'd be far easier to fix.

Agreed, that doesn't seem quite right. Generally it feels a bunch of
responsibilities that should be more on the planner level have been pushed
down into execPartition.c - but it's not obvious how to fix that in all cases
without increasing the overhead when using runtime partition pruning.

Greetings,

Andres Freund



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-21 15:16:11 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Afaict this is a problem of a wrongly generated target list, which isn't what
> > ExecInterpExprStillValid() guards against:
>
> The targetlists are okay, really.  The core problem is that each
> targetlist has an instance of the MULTIEXPR_SUBLINK SubPlan with a
> differently-mutated "args" list, and it looks to me like we correctly
> mutated that for the associated child table.  But because all the
> instances share the same output Params, the ExecSetParamPlan mechanism
> gets confused about which version of the SubPlan it ought to invoke
> to recompute the output Params.

It doesn't seem crazy to describe that as a wrong targetlist :). But anyway,
all I meant was that the problem isn't one that ExecInterpExprStillValid() can
handle.


> It occurs to me that one possible fix is to make MULTIEXPR_SUBLINK
> and the associated output Params use a separate ParamExecData array;
> instead of the query-wide es_param_exec_vals array, use one that
> is local to the specific targetlist's ExprState.  I'm not sure how
> much violence that does to the current notion of an ExprState ---
> do we think that is read-only during execution?

I don't think you would need to modify ExprState - the information about
params etc comes from the ExprContext, right?  So we'd need to build a
different ExprContext for partitions, and use that when evaluating the
expressions.

Oh, I guess you might be referencing ExecInitExprWithParams(), from
6719b238e8f0? I don't like that much, it seems like the wrong level - all
other similar info comes from the ExprContext, why not here as well? Either
way, it looks to me like it'd not be used here, as that's just used for
PARAM_EXTERN, but we're dealing with PARAM_EXEC. Just changing the
->ext_params at runtime wouldn't work, it seems to be resolved when the
expression is built.

We don't currently have infrastructure for setting
econtext->ecxt_param_exec_vals to something else, but that shouldn't be too
hard to add.


> If we did have a local-to-the-expression ParamExecData array, maybe that
> could be used to get a cleaner fix for things like the domain VALUES
> and case-test-expression hacks.

Hm, I'm not quite following along here.

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2023-02-21 15:16:11 -0500, Tom Lane wrote:
>> It occurs to me that one possible fix is to make MULTIEXPR_SUBLINK
>> and the associated output Params use a separate ParamExecData array;
>> instead of the query-wide es_param_exec_vals array, use one that
>> is local to the specific targetlist's ExprState.  I'm not sure how
>> much violence that does to the current notion of an ExprState ---
>> do we think that is read-only during execution?

> I don't think you would need to modify ExprState - the information about
> params etc comes from the ExprContext, right?  So we'd need to build a
> different ExprContext for partitions, and use that when evaluating the
> expressions.
> ...
> We don't currently have infrastructure for setting
> econtext->ecxt_param_exec_vals to something else, but that shouldn't be too
> hard to add.

No, that won't work, because many usages of PARAM_EXEC Params are
specifically intended to transmit datums from one expression (plan node)
to another.  That's why that array was query-global to begin with.
What I'm wondering about is adding a separate array, and likely a separate
ParamKind, that would have a less-than-query-wide scope.  We might be able
to get away with having that be plan-node-wide, but making it local to the
specific compiled expression feels safer and easier to reason about.

>> If we did have a local-to-the-expression ParamExecData array, maybe that
>> could be used to get a cleaner fix for things like the domain VALUES
>> and case-test-expression hacks.

> Hm, I'm not quite following along here.

I'm just arm-waving at this point, it's not real clear to me either.
But I do remember that we have some ugly hacks centered around the
fact that domain VALUES and CaseTestExpr are implemented with a single
datum slot per EContext.  I'd rather convert them into something like
PARAM_EXEC with no sharing.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-21 15:55:15 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-02-21 15:16:11 -0500, Tom Lane wrote:
> >> It occurs to me that one possible fix is to make MULTIEXPR_SUBLINK
> >> and the associated output Params use a separate ParamExecData array;
> >> instead of the query-wide es_param_exec_vals array, use one that
> >> is local to the specific targetlist's ExprState.  I'm not sure how
> >> much violence that does to the current notion of an ExprState ---
> >> do we think that is read-only during execution?
> 
> > I don't think you would need to modify ExprState - the information about
> > params etc comes from the ExprContext, right?  So we'd need to build a
> > different ExprContext for partitions, and use that when evaluating the
> > expressions.
> > ...
> > We don't currently have infrastructure for setting
> > econtext->ecxt_param_exec_vals to something else, but that shouldn't be too
> > hard to add.
> 
> No, that won't work, because many usages of PARAM_EXEC Params are
> specifically intended to transmit datums from one expression (plan node)
> to another.  That's why that array was query-global to begin with.
> What I'm wondering about is adding a separate array, and likely a separate
> ParamKind, that would have a less-than-query-wide scope.  We might be able
> to get away with having that be plan-node-wide, but making it local to the
> specific compiled expression feels safer and easier to reason about.

What I was trying to suggest is that you could have a dedicated ExprContext
that'd point to such a separate array. That'd allow you to choose the the
separate array on a per-expression-evaluation basis (not even per ExprState).
We already have multiple ExprContexts in some nodes, so this wouldn't break
new ground.

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2023-02-21 15:55:15 -0500, Tom Lane wrote:
>> What I'm wondering about is adding a separate array, and likely a separate
>> ParamKind, that would have a less-than-query-wide scope.  We might be able
>> to get away with having that be plan-node-wide, but making it local to the
>> specific compiled expression feels safer and easier to reason about.

> What I was trying to suggest is that you could have a dedicated ExprContext
> that'd point to such a separate array. That'd allow you to choose the the
> separate array on a per-expression-evaluation basis (not even per ExprState).
> We already have multiple ExprContexts in some nodes, so this wouldn't break
> new ground.

I'd really like to *not* need the surrounding plan node to know about
this.  The tlist push-down behavior shown upthread would result in
that requirement propagating to just about every plan node type,
certainly every one that allows projection.

If we're certain that we'll only need this for MULTIEXPR_SUBLINK and
thus only in tlists, we could conceivably put the support into
ExecProject and friends rather than directly in the ExprState
infrastructure.  But that feels like a rather strange compromise,
and it'd foreclose using the concept for other short-lifespan
Param-like nodes.

Another idea I'm toying with is that the expression compiler could
allocate some space when it sees a MULTIEXPR_SUBLINK, and then
connect up the multiexec Params to that.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-21 17:34:30 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-02-21 15:55:15 -0500, Tom Lane wrote:
> >> What I'm wondering about is adding a separate array, and likely a separate
> >> ParamKind, that would have a less-than-query-wide scope.  We might be able
> >> to get away with having that be plan-node-wide, but making it local to the
> >> specific compiled expression feels safer and easier to reason about.
> 
> > What I was trying to suggest is that you could have a dedicated ExprContext
> > that'd point to such a separate array. That'd allow you to choose the the
> > separate array on a per-expression-evaluation basis (not even per ExprState).
> > We already have multiple ExprContexts in some nodes, so this wouldn't break
> > new ground.
> 
> I'd really like to *not* need the surrounding plan node to know about
> this.  The tlist push-down behavior shown upthread would result in
> that requirement propagating to just about every plan node type,
> certainly every one that allows projection.

Hm, fair point. I had thought about this in a too isolated way, about
expressions evaluated as part of nodeModifyTable.c, rather than stuff
downstream for it.

I was pondering changing EState->es_param_exec_vals while descending into
partition-specific code, just in nodeModifyTable.c, but that doesn't work
as-is, because all the ExprContexts have ->ecxt_param_exec_vals set to the
EState one at node initialization time.

I don't know why we have a copy of es_param_exec_vals in the ExprContext,
tbh. It probably is a tiny bit faster, due to avoiding one level of
indirection, but I have a hard time believing it matters compared to other
costs.


> If we're certain that we'll only need this for MULTIEXPR_SUBLINK and
> thus only in tlists, we could conceivably put the support into
> ExecProject and friends rather than directly in the ExprState
> infrastructure.  But that feels like a rather strange compromise,
> and it'd foreclose using the concept for other short-lifespan
> Param-like nodes.

Perhaps we should deal with this by generating a distinct type of expression
step, that looks up information about the param in a different place? Nothing
forces us to have the expression step look into

    prm = &(econtext->ecxt_param_exec_vals[op->d.param.paramid]);

If we e.g. emitted a distinct step type for MULTIEXPR_SUBLINK, we could have
it look up params in llast(exprstate->parent->state->es_multilink) that we
push/pop on in nodeModifyTable.c.  I think that should work, but it ain't
pretty.


A related idea is be to perform more of the necessary lookups during
expression compilation. If we figured out the execPlan node during expression
compilation, we'd not run into danger of looking up the wrong plan during
expression evaluation.


We already do look into estate during expression compilation for information
about the es_param_list_info, so this wouldn't be breaking new ground.



> Another idea I'm toying with is that the expression compiler could
> allocate some space when it sees a MULTIEXPR_SUBLINK, and then
> connect up the multiexec Params to that.

Where are you thinking of getting the information for connecting the params
from? I don't think we currently have a good way to figure that out during
evaluation time, right?

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> Perhaps we should deal with this by generating a distinct type of expression
> step, that looks up information about the param in a different place? Nothing
> forces us to have the expression step look into
>     prm = &(econtext->ecxt_param_exec_vals[op->d.param.paramid]);

Right, where I was going was to have a distinct EEOP type that finds
the ParamExecData in some other way.  The main question is where to keep
that not-so-global ParamExecData.

>> Another idea I'm toying with is that the expression compiler could
>> allocate some space when it sees a MULTIEXPR_SUBLINK, and then
>> connect up the multiexec Params to that.

> Where are you thinking of getting the information for connecting the params
> from? I don't think we currently have a good way to figure that out during
> evaluation time, right?

It would have to look something like the forward-jump fixup logic,
that is keep track of unfinished Param-referencing steps and go back
to fill them in when it finds the driving MULTIEXPR_SUBLINK and allocates
some space to hold the output of that.  A lot of details still to be
filled in there, but it doesn't seem very different from stuff we're
already doing.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-21 19:00:07 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Perhaps we should deal with this by generating a distinct type of expression
> > step, that looks up information about the param in a different place? Nothing
> > forces us to have the expression step look into
> >     prm = &(econtext->ecxt_param_exec_vals[op->d.param.paramid]);
> 
> Right, where I was going was to have a distinct EEOP type that finds
> the ParamExecData in some other way.  The main question is where to keep
> that not-so-global ParamExecData.

We currently overwrite prm->execPlan in ExecInitSubPlan(), when creating a
second reference to the subplan. Which is why we then end up using the wrong
SubPlanState in ExecSetParamPlan().

The problem of using the wrong SubPlanState doesn't look too hard to solve: We
could stash the "actual" execPlan in scratch.d.param.something, instead of
looking it up during ExecEvalParamExec().


I think that'd not be quite enough, because due to sharing the same
ParamExecData, we wouldn't know when to recompute the plan.


It also looks like something might not yet quite compute/adjust the types
completely enough in execPartition.c...

Greetings,

Andres Freund



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-21 17:17:05 -0800, Andres Freund wrote:
> We currently overwrite prm->execPlan in ExecInitSubPlan(), when creating a
> second reference to the subplan. Which is why we then end up using the wrong
> SubPlanState in ExecSetParamPlan().
> 
> The problem of using the wrong SubPlanState doesn't look too hard to solve: We
> could stash the "actual" execPlan in scratch.d.param.something, instead of
> looking it up during ExecEvalParamExec().

It's not quite that easy, because there can be references to subplans that
aren't subquery specific.


I did come up with a, very hacky at this point, prototype that does seems to
fix the issue.

The main problem here IMO is that
a) there can be many different ParamExecData->execPlans for a single param
b) we only want a single value for a PARAM_EXEC to be valid at a time

One solution to that is to move execPlans out of ParamExecData.

In the attached prototype I added a two dimensional list to EState. The first
level is indexed by the paramid, the second identifies partition specific
plans. For non-partition specific subplans it is 0, for partition specific
ones it is a new integer assigned sequentially within ExecInitPartitionInfo().

When generating a reference to a PARAM_EXEC parameter, the current
partition-specific offset is added to the expression step.

One significant complication is that we can't actually know at the point we
encounter the parameter whether the subplan is partition specific or not, or
at least it wasn't obvious to me how to do so. So ExecEvalParamExec() has to
try both.

Obviously the attached patch is in no way meant to be more than a proof of
concept.


I don't think we can move all of the responsibilities here to the planner - we
don't want to generate a plan that has pre-generated expressions, with
different param ids, for every potentially affected partition. So I do think
we need some runtime way of identifying the correct execPlan.


But I think the planner could make the executors job easier, by providing
conveniently accessible information about what a specific paramid is used
for. If we e.g. knew that a specific Param
a) will require execPlan processing
b) references an expression that is expected to be partition specific

we could figure out during expression compilation whether to make the
expression step reference the "query global" plan, or not. Instead of
deferring that to a runtime check in ExecEvalParamExec().


The attached really is just an exploration of the idea, not something anywhere
near a real fix.


It also doesn't yet address the issue with the wrong subplan chosen when the
planner expands partitions, as that doesn't go through
ExecInitPartitionInfo(). But I'm not sure that should be addressed by the same
mechanism - that seems a bit more the planner's responsibility. This means
we'll continue to fail to evaluate
  explain (verbose, analyze) UPDATE t SET (a, b) = (SELECT t.a, t.b || '+')
except that now we'll print
  WARNING:  01000: overwriting previous subplan exec plan: 0
  LOCATION:  ExecInitSubPlan, nodeSubplan.c:884
which probably ought to at least be an assertion failure if not a runtime
ERROR, once we hit that, things won't work reliably.


While hacking on this I found it helpful to have ruleutils/explain provide a
bit more detail:
- have references to subplans print the arguments
- deparse onConflictSet
- print subplans if evaluated in the context of a different node

Not sure if any of that is interesting enough to be worth doing outside of
hacking on code like this.

Greetings,

Andres Freund

Вложения
Andres Freund <andres@anarazel.de> writes:
> I did come up with a, very hacky at this point, prototype that does seems to
> fix the issue.

Hmm ... this doesn't look very much like what I was imagining.  Let
me draft a prototype and we can compare.

            regards, tom lane



I wrote:
> Hmm ... this doesn't look very much like what I was imagining.  Let
> me draft a prototype and we can compare.

Here's what I was thinking about.  I didn't bother adding regression
test cases yet, but it fixes both of the symptoms Alexander found.

This looks pretty workable to me, and in particular I think it'd be
safe to backpatch (with some fields moved to the end of their structs
to satisfy ABI worries).  We could then revert 3f7323cbb et al
in the back branches.

            regards, tom lane

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 812ead95bc..23f2dbfef3 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -133,6 +133,7 @@ ExecInitExpr(Expr *node, PlanState *parent)
     /* Initialize ExprState with empty step list */
     state = makeNode(ExprState);
     state->expr = node;
+    state->last_param = -1;
     state->parent = parent;
     state->ext_params = NULL;

@@ -170,6 +171,7 @@ ExecInitExprWithParams(Expr *node, ParamListInfo ext_params)
     /* Initialize ExprState with empty step list */
     state = makeNode(ExprState);
     state->expr = node;
+    state->last_param = -1;
     state->parent = NULL;
     state->ext_params = ext_params;

@@ -222,6 +224,7 @@ ExecInitQual(List *qual, PlanState *parent)

     state = makeNode(ExprState);
     state->expr = (Expr *) qual;
+    state->last_param = -1;
     state->parent = parent;
     state->ext_params = NULL;

@@ -367,6 +370,7 @@ ExecBuildProjectionInfo(List *targetList,
     projInfo->pi_state.type = T_ExprState;
     state = &projInfo->pi_state;
     state->expr = (Expr *) targetList;
+    state->last_param = -1;
     state->parent = parent;
     state->ext_params = NULL;

@@ -538,6 +542,7 @@ ExecBuildUpdateProjection(List *targetList,
         state->expr = (Expr *) targetList;
     else
         state->expr = NULL;        /* not used */
+    state->last_param = -1;
     state->parent = parent;
     state->ext_params = NULL;

@@ -874,18 +879,46 @@ ExecCheck(ExprState *state, ExprContext *econtext)
 /*
  * Prepare a compiled expression for execution.  This has to be called for
  * every ExprState before it can be executed.
- *
- * NB: While this currently only calls ExecReadyInterpretedExpr(),
- * this will likely get extended to further expression evaluation methods.
- * Therefore this should be used instead of directly calling
- * ExecReadyInterpretedExpr().
  */
 static void
 ExecReadyExpr(ExprState *state)
 {
+    /*
+     * We need one last fixup in the steps list: if there are any subplan
+     * nodes with private output arrays, their associated PARAM_EXEC params
+     * need to be given pointers to those arrays.
+     */
+    if (state->parent != NULL)
+    {
+        ListCell   *lc;
+
+        foreach(lc, state->parent->subPlan)
+        {
+            SubPlanState *sstate = lfirst_node(SubPlanState, lc);
+            List       *setParam;
+            int            idx;
+
+            if (sstate->private_exec_vals == NULL)
+                continue;        /* nothing to do here */
+
+            setParam = sstate->subplan->setParam;
+            idx = state->last_param;
+            while (idx >= 0)
+            {
+                ExprEvalStep *as = &state->steps[idx];
+
+                if (list_member_int(setParam, as->d.param.paramid))
+                    as->d.param.private_exec_vals = sstate->private_exec_vals;
+                idx = as->d.param.prev_param;
+            }
+        }
+    }
+
+    /* Now see if JIT wants to compile it */
     if (jit_compile_expr(state))
         return;

+    /* Nope, use the default interpreted-expression implementation */
     ExecReadyInterpretedExpr(state);
 }

@@ -993,7 +1026,12 @@ ExecInitExprRec(Expr *node, ExprState *state,
                         scratch.opcode = EEOP_PARAM_EXEC;
                         scratch.d.param.paramid = param->paramid;
                         scratch.d.param.paramtype = param->paramtype;
+                        /* For now, assume it's not a MULTIEXPR output Param */
+                        scratch.d.param.private_exec_vals = NULL;
+                        /* ...but link it into a list so we can fix it later */
+                        scratch.d.param.prev_param = state->last_param;
                         ExprEvalPushStep(state, &scratch);
+                        state->last_param = state->steps_len - 1;
                         break;
                     case PARAM_EXTERN:

@@ -1622,6 +1660,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
                  */
                 elemstate = makeNode(ExprState);
                 elemstate->expr = acoerce->elemexpr;
+                elemstate->last_param = -1;
                 elemstate->parent = state->parent;
                 elemstate->ext_params = state->ext_params;

@@ -3270,6 +3309,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
     LastAttnumInfo deform = {0, 0, 0};

     state->expr = (Expr *) aggstate;
+    state->last_param = -1;
     state->parent = parent;

     scratch.resvalue = &state->resvalue;
@@ -3739,6 +3779,7 @@ ExecBuildGroupingEqual(TupleDesc ldesc, TupleDesc rdesc,

     state->expr = NULL;
     state->flags = EEO_FLAG_IS_QUAL;
+    state->last_param = -1;
     state->parent = parent;

     scratch.resvalue = &state->resvalue;
@@ -3889,6 +3930,7 @@ ExecBuildParamSetEqual(TupleDesc desc,

     state->expr = NULL;
     state->flags = EEO_FLAG_IS_QUAL;
+    state->last_param = -1;
     state->parent = parent;

     scratch.resvalue = &state->resvalue;
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 19351fe34b..268c7fde70 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -2413,15 +2413,19 @@ ExecEvalFuncExprStrictFusage(ExprState *state, ExprEvalStep *op,
 /*
  * Evaluate a PARAM_EXEC parameter.
  *
- * PARAM_EXEC params (internal executor parameters) are stored in the
- * ecxt_param_exec_vals array, and can be accessed by array index.
+ * Most PARAM_EXEC params (internal executor parameters) are stored in the
+ * ecxt_param_exec_vals array, but ones representing MULTIEXPR subplan outputs
+ * may be in a private array.  In either case the paramid is the array index.
  */
 void
 ExecEvalParamExec(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
 {
+    ParamExecData *param_array = op->d.param.private_exec_vals;
     ParamExecData *prm;

-    prm = &(econtext->ecxt_param_exec_vals[op->d.param.paramid]);
+    if (param_array == NULL)
+        param_array = econtext->ecxt_param_exec_vals;
+    prm = &(param_array[op->d.param.paramid]);
     if (unlikely(prm->execPlan != NULL))
     {
         /* Parameter not evaluated yet, so go do it */
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index 64af36a187..79d8f1a33e 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -255,20 +255,30 @@ ExecScanSubPlan(SubPlanState *node,
      * same underlying subplan.  However, that fails to hold for MULTIEXPRs
      * because they can have non-empty args lists, and the "same" args might
      * have mutated into different forms in different parts of a plan tree.
-     * There is currently no problem because MULTIEXPR can appear only in an
-     * UPDATE's top-level target list, so it won't get duplicated anyplace.
-     * Postgres versions before v14 had to make concrete efforts to avoid
-     * sharing output parameters across different clones of a MULTIEXPR, and
-     * the problem could recur someday.
+     * Therefore, MULTIEXPR SubPlans that are not initplans (i.e., have
+     * non-empty args lists) are given private output arrays that are
+     * allocated when the containing targetlist expression is compiled.  This
+     * solution works because the referencing Params must be part of the same
+     * targetlist expression, so we can fix them up to find the private output
+     * array during expression compilation.
      */
     if (subLinkType == MULTIEXPR_SUBLINK)
     {
-        EState       *estate = node->parent->state;
+        ParamExecData *param_array;
+
+        /*
+         * If subplan has a private output array, use that; otherwise use
+         * es_param_exec_vals.
+         */
+        if (node->private_exec_vals)
+            param_array = node->private_exec_vals;
+        else
+            param_array = node->parent->state->es_param_exec_vals;

         foreach(l, subplan->setParam)
         {
             int            paramid = lfirst_int(l);
-            ParamExecData *prm = &(estate->es_param_exec_vals[paramid]);
+            ParamExecData *prm = &(param_array[paramid]);

             prm->execPlan = node;
         }
@@ -830,6 +840,7 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
     /*
      * initialize my state
      */
+    sstate->private_exec_vals = NULL;
     sstate->curTuple = NULL;
     sstate->curArray = PointerGetDatum(NULL);
     sstate->projLeft = NULL;
@@ -861,12 +872,36 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
      */
     if (subplan->setParam != NIL && subplan->subLinkType != CTE_SUBLINK)
     {
+        ParamExecData *param_array = estate->es_param_exec_vals;
         ListCell   *lst;

+        /*
+         * If this is a MULTIEXPR subplan (and not an initplan), it needs a
+         * private ParamExecData output array.  We index this array with the
+         * same Param IDs used for the query-global es_param_exec_vals array,
+         * so there will be some wasted space, but not enough to sweat over.
+         */
+        if (subplan->subLinkType == MULTIEXPR_SUBLINK &&
+            subplan->parParam != NIL)
+        {
+            int            max_paramid = 0;
+
+            foreach(lst, subplan->setParam)
+            {
+                int            paramid = lfirst_int(lst);
+
+                max_paramid = Max(max_paramid, paramid);
+            }
+
+            param_array = palloc0_array(ParamExecData, max_paramid + 1);
+            sstate->private_exec_vals = param_array;
+        }
+
+        /* Now set the execPlan links within the correct output array. */
         foreach(lst, subplan->setParam)
         {
             int            paramid = lfirst_int(lst);
-            ParamExecData *prm = &(estate->es_param_exec_vals[paramid]);
+            ParamExecData *prm = &(param_array[paramid]);

             prm->execPlan = sstate;
         }
@@ -1057,8 +1092,8 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
  * Note that this routine MUST clear the execPlan fields of the plan's
  * output parameters after evaluating them!
  *
- * The results of this function are stored in the EState associated with the
- * ExprContext (particularly, its ecxt_param_exec_vals); any pass-by-ref
+ * The results of this function are usually stored in the es_param_exec_vals
+ * array of the EState associated with the ExprContext; any pass-by-ref
  * result Datums are allocated in the EState's per-query memory.  The passed
  * econtext can be any ExprContext belonging to that EState; which one is
  * important only to the extent that the ExprContext's per-tuple memory
@@ -1067,6 +1102,11 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
  * that data isn't needed after we return.  In practice, because initplan
  * parameters are never more complex than Vars, Aggrefs, etc, evaluating them
  * currently never leaks any memory anyway.)
+ *
+ * However, for MULTIEXPR subplans we instead store the output in a private
+ * output array.  This avoids confusion when several "mutated" versions of
+ * the same MULTIEXPR subplan appear in a query, which is possible in an
+ * UPDATE on a partitioned table or inheritance tree.
  * ----------------------------------------------------------------
  */
 void
@@ -1077,6 +1117,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
     SubLinkType subLinkType = subplan->subLinkType;
     EState       *estate = planstate->state;
     ScanDirection dir = estate->es_direction;
+    ParamExecData *output_array;
     MemoryContext oldcontext;
     TupleTableSlot *slot;
     ListCell   *pvar;
@@ -1126,6 +1167,17 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
         planstate->chgParam = bms_add_member(planstate->chgParam, paramid);
     }

+    /*
+     * Decide which ParamExecData array receives the output.  Typically it's
+     * the ecxt_param_exec_vals array, but not if the subplan has a private
+     * output array.  (That's currently only possible for MULTIEXPR subplans,
+     * but it's simplest to support it for all cases.)
+     */
+    if (node->private_exec_vals)
+        output_array = node->private_exec_vals;
+    else
+        output_array = econtext->ecxt_param_exec_vals;
+
     /*
      * Run the plan.  (If it needs to be rescanned, the first ExecProcNode
      * call will take care of that.)
@@ -1141,7 +1193,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
         {
             /* There can be only one setParam... */
             int            paramid = linitial_int(subplan->setParam);
-            ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
+            ParamExecData *prm = &(output_array[paramid]);

             prm->execPlan = NULL;
             prm->value = BoolGetDatum(true);
@@ -1191,7 +1243,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
         foreach(l, subplan->setParam)
         {
             int            paramid = lfirst_int(l);
-            ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
+            ParamExecData *prm = &(output_array[paramid]);

             prm->execPlan = NULL;
             prm->value = heap_getattr(node->curTuple, i, tdesc,
@@ -1204,7 +1256,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
     {
         /* There can be only one setParam... */
         int            paramid = linitial_int(subplan->setParam);
-        ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
+        ParamExecData *prm = &(output_array[paramid]);

         /*
          * We build the result array in query context so it won't disappear;
@@ -1226,7 +1278,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
         {
             /* There can be only one setParam... */
             int            paramid = linitial_int(subplan->setParam);
-            ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
+            ParamExecData *prm = &(output_array[paramid]);

             prm->execPlan = NULL;
             prm->value = BoolGetDatum(false);
@@ -1238,7 +1290,7 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
             foreach(l, subplan->setParam)
             {
                 int            paramid = lfirst_int(l);
-                ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
+                ParamExecData *prm = &(output_array[paramid]);

                 prm->execPlan = NULL;
                 prm->value = (Datum) 0;
@@ -1302,6 +1354,8 @@ ExecReScanSetParamPlan(SubPlanState *node, PlanState *parent)
         elog(ERROR, "setParam list of initplan is empty");
     if (bms_is_empty(planstate->plan->extParam))
         elog(ERROR, "extParam set of initplan is empty");
+    if (node->private_exec_vals != NULL)
+        elog(ERROR, "initplan should not have private output area");

     /*
      * Don't actually re-scan: it'll happen inside ExecSetParamPlan if needed.
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 06c3adc0a1..c43bdcea2b 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -379,6 +379,11 @@ typedef struct ExprEvalStep
         {
             int            paramid;    /* numeric ID for parameter */
             Oid            paramtype;    /* OID of parameter's datatype */
+            /* These fields are used for EEOP_PARAM_EXEC only: */
+            /* Param's value is to be sought at private_exec_vals[paramid], */
+            /* or ecxt_param_exec_vals[paramid] if private_exec_vals is NULL */
+            ParamExecData *private_exec_vals;
+            int            prev_param; /* previous PARAM_EXEC's index, or -1 */
         }            param;

         /* for EEOP_PARAM_CALLBACK */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 20f4c8b35f..2ae9cc702e 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -119,8 +119,9 @@ typedef struct ExprState

     int            steps_len;        /* number of steps currently */
     int            steps_alloc;    /* allocated length of steps array */
+    int            last_param;        /* index of last EEOP_PARAM_EXEC step, or -1 */

-#define FIELDNO_EXPRSTATE_PARENT 11
+#define FIELDNO_EXPRSTATE_PARENT 12
     struct PlanState *parent;    /* parent PlanState node, if any */
     ParamListInfo ext_params;    /* for compiling PARAM_EXTERN nodes */

@@ -948,6 +949,7 @@ typedef struct SubPlanState
     struct PlanState *parent;    /* parent plan node's state tree */
     ExprState  *testexpr;        /* state of combining expression */
     List       *args;            /* states of argument expression(s) */
+    ParamExecData *private_exec_vals;    /* if not NULL, output values go here */
     HeapTuple    curTuple;        /* copy of most recent tuple from subplan */
     Datum        curArray;        /* most recent array from ARRAY() subplan */
     /* these are used when hashing the subselect's output: */
diff --git a/src/include/nodes/params.h b/src/include/nodes/params.h
index ad23113a61..90ddbaffed 100644
--- a/src/include/nodes/params.h
+++ b/src/include/nodes/params.h
@@ -132,14 +132,16 @@ typedef struct ParamListInfoData
  *      ParamExecData entries are used for executor internal parameters
  *      (that is, values being passed into or out of a sub-query).  The
  *      paramid of a PARAM_EXEC Param is a (zero-based) index into an
- *      array of ParamExecData records, which is referenced through
- *      es_param_exec_vals or ecxt_param_exec_vals.
+ *      array of ParamExecData records, which is typically found via
+ *      es_param_exec_vals or ecxt_param_exec_vals.  But Params that
+ *      reference the output of a MULTIEXPR subplan will reference a
+ *      private ParamExecData array for that specific subplan.
  *
  *      If execPlan is not NULL, it points to a SubPlanState node that needs
  *      to be executed to produce the value.  (This is done so that we can have
  *      lazy evaluation of InitPlans: they aren't executed until/unless a
- *      result value is needed.)    Otherwise the value is assumed to be valid
- *      when needed.
+ *      result value is needed.  MULTIEXPR subplans also use this mechanism.)
+ *      Otherwise the value is assumed to be valid when needed.
  * ----------------
  */


Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-23 13:53:34 -0500, Tom Lane wrote:
> I wrote:
> > Hmm ... this doesn't look very much like what I was imagining.  Let
> > me draft a prototype and we can compare.
> 
> Here's what I was thinking about.  I didn't bother adding regression
> test cases yet, but it fixes both of the symptoms Alexander found.

I'm not sure I really like the design of the params being local to a single
ExprState, or even local to individual steps in the expression. It seems to
buy further into making MULTIEXPR a special case. Particularly because we here
don't actually need multiple values to live at the same time, we just need
multiple execPlan fields.


Doing that amount of additional work in ExecReadyExpr() seems worrisome to me
- looks like it'd trigger in a lot of expressions that won't need any
adjustment. We could easily short-circuit based on last_param not being set
though.

But perhaps we don't actually need the work in ExecReadyExpr()? What if we
moved private_exec_vals + a bitmap when to use it into ExprState? Afaict we
don't have cases where single paramid could be used multiple times within a
single expression?

I think that might also provide a better basis for redesigning CaseTestExpr
etc, they could use that ExprState local param array as well?

Perhaps the planner could at some point provide metadata about the params in a
query, e.g. whether they ought to be used in a expression-local (eventually
perhaps also node-local?) way, or query-wide way. Then we could emit a
dedicated expression step for each of the cases, which we can't easily right
now.

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> I'm not sure I really like the design of the params being local to a single
> ExprState, or even local to individual steps in the expression. It seems to
> buy further into making MULTIEXPR a special case.

Well, it *is* a special case, because it's (ab)using a mechanism
originally meant only for initplans to do something else.  Maybe
we should throw out the whole implementation and start over, but
that line of thinking isn't going to lead to something back-patchable.
I'm not very sure what we would do fundamentally differently anyway.

In any case, I don't see what's the problem with Param values being
transmitted locally to an expression.  As I hand-waved earlier,
things like CaseTestValue might profitably be treated that way.

> Doing that amount of additional work in ExecReadyExpr() seems worrisome to me
> - looks like it'd trigger in a lot of expressions that won't need any
> adjustment. We could easily short-circuit based on last_param not being set
> though.

Yeah, there's room for some optimization there, but I was trying
to minimize the amount of change to the data structures.  One idea,
if we don't mind adding another pointer field to ExprState, is to
make a list of just the SubPlans that have private output arrays.
Then, in the vast majority of expressions, that list will be empty
and we needn't do anything much in ExecReadyExpr.  I also contemplated
building some intermediate data structure to ease matching of Params
and SubPlans --- however, it's not real clear that that wouldn't cost
more than it saves.  We aren't likely to have very many of these
SubPlans in any one query.  (Even the specialized-list idea could be
a net loss once you consider the palloc overhead of making a list.
Maybe we should chain the interesting EEOP_SUBPLAN steps together,
similarly to what I did with EEOP_PARAM_EXEC?  There's room for a
link field.)

> But perhaps we don't actually need the work in ExecReadyExpr()? What if we
> moved private_exec_vals + a bitmap when to use it into ExprState?

Maybe, but then you're adding runtime cost to EEOP_PARAM_EXEC execution
(to check the bitmap) to save compile cost.  Doesn't sound promising.

You do have one good point here, which is that we don't really need N
private_exec_vals if there are multiple MULTIEXPR subplans --- they
could share one.  But I'm not sure how much contortionism would be
involved in exploiting that observation.

> Afaict we
> don't have cases where single paramid could be used multiple times within a
> single expression?

Right, the paramids should be unique within the expression.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-23 17:06:03 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I'm not sure I really like the design of the params being local to a single
> > ExprState, or even local to individual steps in the expression. It seems to
> > buy further into making MULTIEXPR a special case.
>
> Well, it *is* a special case, because it's (ab)using a mechanism
> originally meant only for initplans to do something else.

I guess my discomfort about per-expression (or really, per subplan) param
originates in that feeling foreign to the whole point of params, which is to
share a state between expressions.  What you're proposing is a bespoke thing,
that only works within a single targetlist.  With those restrictions it feels
like it's misusing PARAM_EXEC.

And we're not even really the per-subplan param arrays for different values,
just for different ParamExecData->execPlan fields.


> Maybe we should throw out the whole implementation and start over, but that
> line of thinking isn't going to lead to something back-patchable.  I'm not
> very sure what we would do fundamentally differently anyway.

The way MULTIEXPR expressions work seems pretty weird to - partially inherited
from initPlans, admittedly.

My understanding is:

When we build the evaluation step for the Param, we don't yet know that we're
dealing with a MULTIEXPR (nor do we have a reference to the relevant
SubPlan)). At the end of the targetlist, we have a special SubPlan, which make
ExecInitSubPlan() set ParamExecData->execPlan to its SubPlanState for all the
output parameters, to let ExecEvalParamExec know that the first reference to
one of the output params needs to evalute the plan. But that means that we
need to reset execPlan between rows, which is handled by the no-output
ExecScanSubPlan() invocation at the end of the targetlist.  That just seems
baroque.


ISTM that a saner sequence of expression steps would be:
- steps to evalute the 1st argument of the MULTIEXPR, targetting SubPlanState->args[0]
- steps to evalute the 2nd argument of the MULTIEXPR, targetting SubPlanState->args[1]
...
- step to execute the subplan, computing output parameters
- PARAM_SUBPLAN step referencing one of the outputs
- steps for another output column
- PARAM_SUBPLAN step referencing one of the outputs
...

That'd completely obviate the need for any use of execPlan and thus remove the
problem with getting confused about which subplan we need to execute.


Unfortunately, we can't easily produce that today, because we don't have easy
access to the SubPlan[State] at the time we encounter the Params.


I am starting to wonder if a cleaner fix wouldn't be to add magic to
ExecBuildProjectionInfo(), to find the junklist targetlist with the subplan,
and then generate something like what I described above. Likely skipping the
optimized/inlined evaluation of the arguments, initially at least.


I didn't think of this until just now, but we actually already do a separate
traversal of the expressions: ExecInitExprSlots().  Obviously the name
wouldn't fit anymore, but it seems perfectly suited for collecting subplans
that we'd need to evaluate?


Let me try to hack that up.

Greetings,

Andres Freund



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-23 15:36:36 -0800, Andres Freund wrote:
> When we build the evaluation step for the Param, we don't yet know that we're
> dealing with a MULTIEXPR (nor do we have a reference to the relevant
> SubPlan)). At the end of the targetlist, we have a special SubPlan, which make
> ExecInitSubPlan() set ParamExecData->execPlan to its SubPlanState for all the
> output parameters, to let ExecEvalParamExec know that the first reference to
> one of the output params needs to evalute the plan. But that means that we
> need to reset execPlan between rows, which is handled by the no-output
> ExecScanSubPlan() invocation at the end of the targetlist.  That just seems
> baroque.

There's at least one case in the regression tests where a correlated MULTIEXPR
is in a non-resjunk TLE. I assume due to subquery pushdown.  Is there a
problem with that?  I don't immediately see any, but though it's worth
mentioning.


>
> ISTM that a saner sequence of expression steps would be:
> - steps to evalute the 1st argument of the MULTIEXPR, targetting SubPlanState->args[0]
> - steps to evalute the 2nd argument of the MULTIEXPR, targetting SubPlanState->args[1]
> ...
> - step to execute the subplan, computing output parameters
> - PARAM_SUBPLAN step referencing one of the outputs
> - steps for another output column
> - PARAM_SUBPLAN step referencing one of the outputs
> ...
>
> That'd completely obviate the need for any use of execPlan and thus remove the
> problem with getting confused about which subplan we need to execute.
>
>
> Unfortunately, we can't easily produce that today, because we don't have easy
> access to the SubPlan[State] at the time we encounter the Params.
>
>
> I am starting to wonder if a cleaner fix wouldn't be to add magic to
> ExecBuildProjectionInfo(), to find the junklist targetlist with the subplan,
> and then generate something like what I described above. Likely skipping the
> optimized/inlined evaluation of the arguments, initially at least.
>
>
> I didn't think of this until just now, but we actually already do a separate
> traversal of the expressions: ExecInitExprSlots().  Obviously the name
> wouldn't fit anymore, but it seems perfectly suited for collecting subplans
> that we'd need to evaluate?
>
>
> Let me try to hack that up.

Here's a rough prototype for that. Certainly would need a good bit more
polish, but I think the approach looks pretty promising?

I didn't do the part about evaluating the 'input' parameters as part of the
outer ExprState. Still think that's a good idea, but it's somewhat orthogonal
to the problems we're trying to fix.

Greetings,

Andres Freund

Вложения
Andres Freund <andres@anarazel.de> writes:
>> When we build the evaluation step for the Param, we don't yet know that we're
>> dealing with a MULTIEXPR (nor do we have a reference to the relevant
>> SubPlan)). At the end of the targetlist, we have a special SubPlan, which make
>> ExecInitSubPlan() set ParamExecData->execPlan to its SubPlanState for all the
>> output parameters, to let ExecEvalParamExec know that the first reference to
>> one of the output params needs to evalute the plan. But that means that we
>> need to reset execPlan between rows, which is handled by the no-output
>> ExecScanSubPlan() invocation at the end of the targetlist.  That just seems
>> baroque.

Yup, it absolutely is.  This idea of having the expression compiler just
reorder the tlist entries is definitely interesting.  I recall that I
wondered about whether we could do that when I first made the MULTIEXPR
patch, but doing it in the parse tree causes a lot of problems because
there are places that assume resjunk entries come after not-resjunk ones.
I don't see a reason why we couldn't reorder during compile though ---
and that will work in all the branches we still support.

The main concern I've got about this prototype is that it's not clear
to me whether we can back-patch addition of a new EEOP step type without
causing ABI issues.  However, why do we need a new step type?  Seems to
me that EEOP_SUBPLAN will serve fine, if we just undo the special
treatment of MULTIEXPR in ExecScanSubPlan and let it go ahead and
evaluate the subplan and assign param values.

> There's at least one case in the regression tests where a correlated MULTIEXPR
> is in a non-resjunk TLE. I assume due to subquery pushdown.  Is there a
> problem with that?  I don't immediately see any, but though it's worth
> mentioning.

My recollection is that the planner is pretty cavalier about whether
resjunk entries get marked as such in lower plan levels.  I wouldn't
worry about this (but by the same token, don't do anything that
relies on the resjunk marks being accurate).

> I didn't do the part about evaluating the 'input' parameters as part of the
> outer ExprState. Still think that's a good idea, but it's somewhat orthogonal
> to the problems we're trying to fix.

Agreed, that's nothing to be doing in a bug-fix patch.  I think we just
want to re-order the steps to put the EEOP_SUBPLAN at the front of the
tlist, and then get rid of the execPlan manipulations and the other
special-casing of MULTIEXPR.  Anything else would be HEAD-only.

Are you planning to push forward with this, or do you want me to?
It's really my bug, since the existing MULTIEXPR implementation
is my fault.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-24 12:26:06 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> >> When we build the evaluation step for the Param, we don't yet know that we're
> >> dealing with a MULTIEXPR (nor do we have a reference to the relevant
> >> SubPlan)). At the end of the targetlist, we have a special SubPlan, which make
> >> ExecInitSubPlan() set ParamExecData->execPlan to its SubPlanState for all the
> >> output parameters, to let ExecEvalParamExec know that the first reference to
> >> one of the output params needs to evalute the plan. But that means that we
> >> need to reset execPlan between rows, which is handled by the no-output
> >> ExecScanSubPlan() invocation at the end of the targetlist.  That just seems
> >> baroque.
> 
> Yup, it absolutely is.  This idea of having the expression compiler just
> reorder the tlist entries is definitely interesting.  I recall that I
> wondered about whether we could do that when I first made the MULTIEXPR
> patch, but doing it in the parse tree causes a lot of problems because
> there are places that assume resjunk entries come after not-resjunk ones.
> I don't see a reason why we couldn't reorder during compile though ---
> and that will work in all the branches we still support.

Yea, I had briefly looked at what it would take to reorder in the planner, and
quickly gave up.


> The main concern I've got about this prototype is that it's not clear
> to me whether we can back-patch addition of a new EEOP step type without
> causing ABI issues.  However, why do we need a new step type?  Seems to
> me that EEOP_SUBPLAN will serve fine, if we just undo the special
> treatment of MULTIEXPR in ExecScanSubPlan and let it go ahead and
> evaluate the subplan and assign param values.

I think we could introduce a new step type, but I also agree we can easily
work around needing that. The main reason I didn't use EEOP_SUBPLAN was that
it seemed cleaner to not assume that there's a return value / a place to put a
pseudo return value. But we could easily make that a variant of EEOP_SUBPLAN
in the back branches.

One argument for a separate step type / separate signature for evaluating a
MULTIEXPR is that that will make it easier to evaluate arguments as part of
the surrounding ExprState.


> > There's at least one case in the regression tests where a correlated MULTIEXPR
> > is in a non-resjunk TLE. I assume due to subquery pushdown.  Is there a
> > problem with that?  I don't immediately see any, but though it's worth
> > mentioning.
> 
> My recollection is that the planner is pretty cavalier about whether
> resjunk entries get marked as such in lower plan levels.  I wouldn't
> worry about this (but by the same token, don't do anything that
> relies on the resjunk marks being accurate).

Makes sense.

I noticed this because I'd initially put in an a defense assert ensuring that
we'd not see a MULTIEXPR in a non-resjunk tle, which triggered in the pushdown
case.


> > I didn't do the part about evaluating the 'input' parameters as part of the
> > outer ExprState. Still think that's a good idea, but it's somewhat orthogonal
> > to the problems we're trying to fix.
> 
> Agreed, that's nothing to be doing in a bug-fix patch.  I think we just
> want to re-order the steps to put the EEOP_SUBPLAN at the front of the
> tlist, and then get rid of the execPlan manipulations and the other
> special-casing of MULTIEXPR.  Anything else would be HEAD-only.
> 
> Are you planning to push forward with this, or do you want me to?
> It's really my bug, since the existing MULTIEXPR implementation
> is my fault.

I'd happy if you had a go at it.  I might take a stab at moving the the
argument evaluation inline, after this goes in.

The amount of "mini-expressions" is one of the main sources of overhead with
JIT. Which also got worse over time with more and more partitioning related
stuff...

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2023-02-24 12:26:06 -0500, Tom Lane wrote:
>> Are you planning to push forward with this, or do you want me to?
>> It's really my bug, since the existing MULTIEXPR implementation
>> is my fault.

> I'd happy if you had a go at it.  I might take a stab at moving the the
> argument evaluation inline, after this goes in.

Sounds like a plan.

            regards, tom lane



OK, so this worked out quite well ... it's only about 50 net new lines
of code, and there's no data structure changes outside the contents of
compiled expressions, so no reason to fear ABI problems.

I did the renaming you had comments suggesting, but perhaps you want
to bikeshed those names?

            regards, tom lane

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 812ead95bc..05d56fb5b2 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -52,12 +52,15 @@
 #include "utils/typcache.h"


-typedef struct LastAttnumInfo
+typedef struct ExprPreScanInfo
 {
+    /* Highest attribute numbers fetched from inner/outer/scan tuple slots: */
     AttrNumber    last_inner;
     AttrNumber    last_outer;
     AttrNumber    last_scan;
-} LastAttnumInfo;
+    /* MULTIEXPR SubPlan nodes appearing in the expression: */
+    List       *multiexpr_subplans;
+} ExprPreScanInfo;

 static void ExecReadyExpr(ExprState *state);
 static void ExecInitExprRec(Expr *node, ExprState *state,
@@ -65,9 +68,9 @@ static void ExecInitExprRec(Expr *node, ExprState *state,
 static void ExecInitFunc(ExprEvalStep *scratch, Expr *node, List *args,
                          Oid funcid, Oid inputcollid,
                          ExprState *state);
-static void ExecInitExprSlots(ExprState *state, Node *node);
-static void ExecPushExprSlots(ExprState *state, LastAttnumInfo *info);
-static bool get_last_attnums_walker(Node *node, LastAttnumInfo *info);
+static void ExecCreateSetupSteps(ExprState *state, Node *node);
+static void ExecPushSetupSteps(ExprState *state, ExprPreScanInfo *info);
+static bool expr_pre_scan_walker(Node *node, ExprPreScanInfo *info);
 static bool ExecComputeSlotInfo(ExprState *state, ExprEvalStep *op);
 static void ExecInitWholeRowVar(ExprEvalStep *scratch, Var *variable,
                                 ExprState *state);
@@ -136,8 +139,8 @@ ExecInitExpr(Expr *node, PlanState *parent)
     state->parent = parent;
     state->ext_params = NULL;

-    /* Insert EEOP_*_FETCHSOME steps as needed */
-    ExecInitExprSlots(state, (Node *) node);
+    /* Insert setup steps as needed */
+    ExecCreateSetupSteps(state, (Node *) node);

     /* Compile the expression proper */
     ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
@@ -173,8 +176,8 @@ ExecInitExprWithParams(Expr *node, ParamListInfo ext_params)
     state->parent = NULL;
     state->ext_params = ext_params;

-    /* Insert EEOP_*_FETCHSOME steps as needed */
-    ExecInitExprSlots(state, (Node *) node);
+    /* Insert setup steps as needed */
+    ExecCreateSetupSteps(state, (Node *) node);

     /* Compile the expression proper */
     ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
@@ -228,8 +231,8 @@ ExecInitQual(List *qual, PlanState *parent)
     /* mark expression as to be used with ExecQual() */
     state->flags = EEO_FLAG_IS_QUAL;

-    /* Insert EEOP_*_FETCHSOME steps as needed */
-    ExecInitExprSlots(state, (Node *) qual);
+    /* Insert setup steps as needed */
+    ExecCreateSetupSteps(state, (Node *) qual);

     /*
      * ExecQual() needs to return false for an expression returning NULL. That
@@ -372,8 +375,8 @@ ExecBuildProjectionInfo(List *targetList,

     state->resultslot = slot;

-    /* Insert EEOP_*_FETCHSOME steps as needed */
-    ExecInitExprSlots(state, (Node *) targetList);
+    /* Insert setup steps as needed */
+    ExecCreateSetupSteps(state, (Node *) targetList);

     /* Now compile each tlist column */
     foreach(lc, targetList)
@@ -524,7 +527,7 @@ ExecBuildUpdateProjection(List *targetList,
     int            nAssignableCols;
     bool        sawJunk;
     Bitmapset  *assignedCols;
-    LastAttnumInfo deform = {0, 0, 0};
+    ExprPreScanInfo deform = {0, 0, 0, NIL};
     ExprEvalStep scratch = {0};
     int            outerattnum;
     ListCell   *lc,
@@ -603,11 +606,11 @@ ExecBuildUpdateProjection(List *targetList,
      * number of columns of the "outer" tuple.
      */
     if (evalTargetList)
-        get_last_attnums_walker((Node *) targetList, &deform);
+        expr_pre_scan_walker((Node *) targetList, &deform);
     else
         deform.last_outer = nAssignableCols;

-    ExecPushExprSlots(state, &deform);
+    ExecPushSetupSteps(state, &deform);

     /*
      * Now generate code to evaluate the tlist's assignable expressions or
@@ -677,20 +680,8 @@ ExecBuildUpdateProjection(List *targetList,
     }

     /*
-     * If we're evaluating the tlist, must evaluate any resjunk columns too.
-     * (This matters for things like MULTIEXPR_SUBLINK SubPlans.)
+     * We don't bother evaluating any tlist entries that are marked resjunk.
      */
-    if (evalTargetList)
-    {
-        for_each_cell(lc, targetList, lc)
-        {
-            TargetEntry *tle = lfirst_node(TargetEntry, lc);
-
-            Assert(tle->resjunk);
-            ExecInitExprRec(tle->expr, state,
-                            &state->resvalue, &state->resnull);
-        }
-    }

     /*
      * Now generate code to copy over any old columns that were not assigned
@@ -1402,6 +1393,21 @@ ExecInitExprRec(Expr *node, ExprState *state,
                 SubPlan    *subplan = (SubPlan *) node;
                 SubPlanState *sstate;

+                /*
+                 * Real execution of a MULTIEXPR SubPlan has already been
+                 * done. What we have to do here is return a dummy NULL record
+                 * value in case this targetlist element is assigned
+                 * someplace.
+                 */
+                if (subplan->subLinkType == MULTIEXPR_SUBLINK)
+                {
+                    scratch.opcode = EEOP_CONST;
+                    scratch.d.constval.value = (Datum) 0;
+                    scratch.d.constval.isnull = true;
+                    ExprEvalPushStep(state, &scratch);
+                    break;
+                }
+
                 if (!state->parent)
                     elog(ERROR, "SubPlan found with no parent plan");

@@ -2542,36 +2548,38 @@ ExecInitFunc(ExprEvalStep *scratch, Expr *node, List *args, Oid funcid,
 }

 /*
- * Add expression steps deforming the ExprState's inner/outer/scan slots
- * as much as required by the expression.
+ * Add expression steps performing setup that's needed before any of the
+ * main execution of the expression.
  */
 static void
-ExecInitExprSlots(ExprState *state, Node *node)
+ExecCreateSetupSteps(ExprState *state, Node *node)
 {
-    LastAttnumInfo info = {0, 0, 0};
+    ExprPreScanInfo info = {0, 0, 0, NIL};

-    /*
-     * Figure out which attributes we're going to need.
-     */
-    get_last_attnums_walker(node, &info);
+    /* Prescan to find out what we need. */
+    expr_pre_scan_walker(node, &info);

-    ExecPushExprSlots(state, &info);
+    /* And generate those steps. */
+    ExecPushSetupSteps(state, &info);
 }

 /*
- * Add steps deforming the ExprState's inner/out/scan slots as much as
- * indicated by info. This is useful when building an ExprState covering more
- * than one expression.
+ * Add steps performing expression setup as indicated by "info".
+ * This is useful when building an ExprState covering more than one expression.
  */
 static void
-ExecPushExprSlots(ExprState *state, LastAttnumInfo *info)
+ExecPushSetupSteps(ExprState *state, ExprPreScanInfo *info)
 {
     ExprEvalStep scratch = {0};
+    ListCell   *lc;

     scratch.resvalue = NULL;
     scratch.resnull = NULL;

-    /* Emit steps as needed */
+    /*
+     * Add steps deforming the ExprState's inner/outer/scan slots as much as
+     * required by any Vars appearing in the expression.
+     */
     if (info->last_inner > 0)
     {
         scratch.opcode = EEOP_INNER_FETCHSOME;
@@ -2602,13 +2610,48 @@ ExecPushExprSlots(ExprState *state, LastAttnumInfo *info)
         if (ExecComputeSlotInfo(state, &scratch))
             ExprEvalPushStep(state, &scratch);
     }
+
+    /*
+     * Add steps to execute any MULTIEXPR SubPlans appearing in the
+     * expression.  We need to evaluate these before any of the Params
+     * referencing their outputs are used, but after we've prepared for any
+     * Var references they may contain.  (There cannot be cross-references
+     * between MULTIEXPR SubPlans, so we needn't worry about their order.)
+     */
+    foreach(lc, info->multiexpr_subplans)
+    {
+        SubPlan    *subplan = (SubPlan *) lfirst(lc);
+        SubPlanState *sstate;
+
+        Assert(subplan->subLinkType == MULTIEXPR_SUBLINK);
+
+        /* This should match what ExecInitExprRec does for other SubPlans: */
+
+        if (!state->parent)
+            elog(ERROR, "SubPlan found with no parent plan");
+
+        sstate = ExecInitSubPlan(subplan, state->parent);
+
+        /* add SubPlanState nodes to state->parent->subPlan */
+        state->parent->subPlan = lappend(state->parent->subPlan,
+                                         sstate);
+
+        scratch.opcode = EEOP_SUBPLAN;
+        scratch.d.subplan.sstate = sstate;
+
+        /* The result can be ignored, but we better put it somewhere */
+        scratch.resvalue = &state->resvalue;
+        scratch.resnull = &state->resnull;
+
+        ExprEvalPushStep(state, &scratch);
+    }
 }

 /*
- * get_last_attnums_walker: expression walker for ExecInitExprSlots
+ * expr_pre_scan_walker: expression walker for ExecCreateSetupSteps
  */
 static bool
-get_last_attnums_walker(Node *node, LastAttnumInfo *info)
+expr_pre_scan_walker(Node *node, ExprPreScanInfo *info)
 {
     if (node == NULL)
         return false;
@@ -2636,6 +2679,16 @@ get_last_attnums_walker(Node *node, LastAttnumInfo *info)
         return false;
     }

+    /* Collect all MULTIEXPR SubPlans, too */
+    if (IsA(node, SubPlan))
+    {
+        SubPlan    *subplan = (SubPlan *) node;
+
+        if (subplan->subLinkType == MULTIEXPR_SUBLINK)
+            info->multiexpr_subplans = lappend(info->multiexpr_subplans,
+                                               subplan);
+    }
+
     /*
      * Don't examine the arguments or filters of Aggrefs or WindowFuncs,
      * because those do not represent expressions to be evaluated within the
@@ -2648,7 +2701,7 @@ get_last_attnums_walker(Node *node, LastAttnumInfo *info)
         return false;
     if (IsA(node, GroupingFunc))
         return false;
-    return expression_tree_walker(node, get_last_attnums_walker,
+    return expression_tree_walker(node, expr_pre_scan_walker,
                                   (void *) info);
 }

@@ -3267,7 +3320,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
     PlanState  *parent = &aggstate->ss.ps;
     ExprEvalStep scratch = {0};
     bool        isCombine = DO_AGGSPLIT_COMBINE(aggstate->aggsplit);
-    LastAttnumInfo deform = {0, 0, 0};
+    ExprPreScanInfo deform = {0, 0, 0, NIL};

     state->expr = (Expr *) aggstate;
     state->parent = parent;
@@ -3283,18 +3336,18 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
     {
         AggStatePerTrans pertrans = &aggstate->pertrans[transno];

-        get_last_attnums_walker((Node *) pertrans->aggref->aggdirectargs,
-                                &deform);
-        get_last_attnums_walker((Node *) pertrans->aggref->args,
-                                &deform);
-        get_last_attnums_walker((Node *) pertrans->aggref->aggorder,
-                                &deform);
-        get_last_attnums_walker((Node *) pertrans->aggref->aggdistinct,
-                                &deform);
-        get_last_attnums_walker((Node *) pertrans->aggref->aggfilter,
-                                &deform);
+        expr_pre_scan_walker((Node *) pertrans->aggref->aggdirectargs,
+                             &deform);
+        expr_pre_scan_walker((Node *) pertrans->aggref->args,
+                             &deform);
+        expr_pre_scan_walker((Node *) pertrans->aggref->aggorder,
+                             &deform);
+        expr_pre_scan_walker((Node *) pertrans->aggref->aggdistinct,
+                             &deform);
+        expr_pre_scan_walker((Node *) pertrans->aggref->aggfilter,
+                             &deform);
     }
-    ExecPushExprSlots(state, &deform);
+    ExecPushSetupSteps(state, &deform);

     /*
      * Emit instructions for each transition value / grouping set combination.
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index 64af36a187..c136f75ac2 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -235,47 +235,6 @@ ExecScanSubPlan(SubPlanState *node,
     ListCell   *l;
     ArrayBuildStateAny *astate = NULL;

-    /*
-     * MULTIEXPR subplans, when "executed", just return NULL; but first we
-     * mark the subplan's output parameters as needing recalculation.  (This
-     * is a bit of a hack: it relies on the subplan appearing later in its
-     * targetlist than any of the referencing Params, so that all the Params
-     * have been evaluated before we re-mark them for the next evaluation
-     * cycle.  But in general resjunk tlist items appear after non-resjunk
-     * ones, so this should be safe.)  Unlike ExecReScanSetParamPlan, we do
-     * *not* set bits in the parent plan node's chgParam, because we don't
-     * want to cause a rescan of the parent.
-     *
-     * Note: we are also relying on MULTIEXPR SubPlans not sharing any output
-     * parameters with other SubPlans, because if one does then it is unclear
-     * which SubPlanState node the parameter's execPlan field will be pointing
-     * to when we come to evaluate the parameter.  We can allow plain initplan
-     * SubPlans to share output parameters, because it doesn't actually matter
-     * which initplan SubPlan we reference as long as they all point to the
-     * same underlying subplan.  However, that fails to hold for MULTIEXPRs
-     * because they can have non-empty args lists, and the "same" args might
-     * have mutated into different forms in different parts of a plan tree.
-     * There is currently no problem because MULTIEXPR can appear only in an
-     * UPDATE's top-level target list, so it won't get duplicated anyplace.
-     * Postgres versions before v14 had to make concrete efforts to avoid
-     * sharing output parameters across different clones of a MULTIEXPR, and
-     * the problem could recur someday.
-     */
-    if (subLinkType == MULTIEXPR_SUBLINK)
-    {
-        EState       *estate = node->parent->state;
-
-        foreach(l, subplan->setParam)
-        {
-            int            paramid = lfirst_int(l);
-            ParamExecData *prm = &(estate->es_param_exec_vals[paramid]);
-
-            prm->execPlan = node;
-        }
-        *isNull = true;
-        return (Datum) 0;
-    }
-
     /* Initialize ArrayBuildStateAny in caller's context, if needed */
     if (subLinkType == ARRAY_SUBLINK)
         astate = initArrayResultAny(subplan->firstColType,
@@ -327,6 +286,11 @@ ExecScanSubPlan(SubPlanState *node,
      * NULL.  Assuming we get a tuple, we just use its first column (there can
      * be only one non-junk column in this case).
      *
+     * For MULTIEXPR_SUBLINK, we push the per-column subplan outputs out to
+     * the setParams and then return a dummy false value.  There must not be
+     * multiple tuples returned from the subplan; if zero tuples are produced,
+     * set the setParams to NULL.
+     *
      * For ARRAY_SUBLINK we allow the subplan to produce any number of tuples,
      * and form an array of the first column's values.  Note in particular
      * that we produce a zero-element array if no tuples are produced (this is
@@ -378,6 +342,47 @@ ExecScanSubPlan(SubPlanState *node,
             continue;
         }

+        if (subLinkType == MULTIEXPR_SUBLINK)
+        {
+            /* cannot allow multiple input tuples for MULTIEXPR sublink */
+            if (found)
+                ereport(ERROR,
+                        (errcode(ERRCODE_CARDINALITY_VIOLATION),
+                         errmsg("more than one row returned by a subquery used as an expression")));
+            found = true;
+
+            /*
+             * We need to copy the subplan's tuple in case any result is of
+             * pass-by-ref type --- our output values will point into this
+             * copied tuple!  Can't use the subplan's instance of the tuple
+             * since it won't still be valid after next ExecProcNode() call.
+             * node->curTuple keeps track of the copied tuple for eventual
+             * freeing.
+             */
+            if (node->curTuple)
+                heap_freetuple(node->curTuple);
+            node->curTuple = ExecCopySlotHeapTuple(slot);
+
+            /*
+             * Now set all the setParam params from the columns of the tuple
+             */
+            col = 1;
+            foreach(plst, subplan->setParam)
+            {
+                int            paramid = lfirst_int(plst);
+                ParamExecData *prmdata;
+
+                prmdata = &(econtext->ecxt_param_exec_vals[paramid]);
+                Assert(prmdata->execPlan == NULL);
+                prmdata->value = heap_getattr(node->curTuple, col, tdesc,
+                                              &(prmdata->isnull));
+                col++;
+            }
+
+            /* keep scanning subplan to make sure there's only one tuple */
+            continue;
+        }
+
         if (subLinkType == ARRAY_SUBLINK)
         {
             Datum        dvalue;
@@ -473,6 +478,20 @@ ExecScanSubPlan(SubPlanState *node,
             result = (Datum) 0;
             *isNull = true;
         }
+        else if (subLinkType == MULTIEXPR_SUBLINK)
+        {
+            /* We don't care about function result, but set the setParams */
+            foreach(l, subplan->setParam)
+            {
+                int            paramid = lfirst_int(l);
+                ParamExecData *prmdata;
+
+                prmdata = &(econtext->ecxt_param_exec_vals[paramid]);
+                Assert(prmdata->execPlan == NULL);
+                prmdata->value = (Datum) 0;
+                prmdata->isnull = true;
+            }
+        }
     }

     return result;
@@ -848,10 +867,9 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
     sstate->cur_eq_funcs = NULL;

     /*
-     * If this is an initplan or MULTIEXPR subplan, it has output parameters
-     * that the parent plan will use, so mark those parameters as needing
-     * evaluation.  We don't actually run the subplan until we first need one
-     * of its outputs.
+     * If this is an initplan, it has output parameters that the parent plan
+     * will use, so mark those parameters as needing evaluation.  We don't
+     * actually run the subplan until we first need one of its outputs.
      *
      * A CTE subplan's output parameter is never to be evaluated in the normal
      * way, so skip this in that case.
@@ -859,7 +877,8 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
      * Note that we don't set parent->chgParam here: the parent plan hasn't
      * been run yet, so no need to force it to re-run.
      */
-    if (subplan->setParam != NIL && subplan->subLinkType != CTE_SUBLINK)
+    if (subplan->setParam != NIL && subplan->parParam == NIL &&
+        subplan->subLinkType != CTE_SUBLINK)
     {
         ListCell   *lst;

@@ -1079,7 +1098,6 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
     ScanDirection dir = estate->es_direction;
     MemoryContext oldcontext;
     TupleTableSlot *slot;
-    ListCell   *pvar;
     ListCell   *l;
     bool        found = false;
     ArrayBuildStateAny *astate = NULL;
@@ -1089,6 +1107,8 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
         elog(ERROR, "ANY/ALL subselect unsupported as initplan");
     if (subLinkType == CTE_SUBLINK)
         elog(ERROR, "CTE subplans should not be executed via ExecSetParamPlan");
+    if (subplan->parParam || node->args)
+        elog(ERROR, "correlated subplans should not be executed via ExecSetParamPlan");

     /*
      * Enforce forward scan direction regardless of caller. It's hard but not
@@ -1106,26 +1126,6 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
      */
     oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);

-    /*
-     * Set Params of this plan from parent plan correlation values. (Any
-     * calculation we have to do is done in the parent econtext, since the
-     * Param values don't need to have per-query lifetime.)  Currently, we
-     * expect only MULTIEXPR_SUBLINK plans to have any correlation values.
-     */
-    Assert(subplan->parParam == NIL || subLinkType == MULTIEXPR_SUBLINK);
-    Assert(list_length(subplan->parParam) == list_length(node->args));
-
-    forboth(l, subplan->parParam, pvar, node->args)
-    {
-        int            paramid = lfirst_int(l);
-        ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
-
-        prm->value = ExecEvalExprSwitchContext((ExprState *) lfirst(pvar),
-                                               econtext,
-                                               &(prm->isnull));
-        planstate->chgParam = bms_add_member(planstate->chgParam, paramid);
-    }
-
     /*
      * Run the plan.  (If it needs to be rescanned, the first ExecProcNode
      * call will take care of that.)
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1be1642d92..b4292253cc 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -972,9 +972,9 @@ typedef struct SubLink
  * The values are assigned to the global PARAM_EXEC params indexed by parParam
  * (the parParam and args lists must have the same ordering).  setParam is a
  * list of the PARAM_EXEC params that are computed by the sub-select, if it
- * is an initplan; they are listed in order by sub-select output column
- * position.  (parParam and setParam are integer Lists, not Bitmapsets,
- * because their ordering is significant.)
+ * is an initplan or MULTIEXPR plan; they are listed in order by sub-select
+ * output column position.  (parParam and setParam are integer Lists, not
+ * Bitmapsets, because their ordering is significant.)
  *
  * Also, the planner computes startup and per-call costs for use of the
  * SubPlan.  Note that these include the cost of the subquery proper,
@@ -1009,8 +1009,8 @@ typedef struct SubPlan
     /* Note: parallel_safe does not consider contents of testexpr or args */
     /* Information for passing params into and out of the subselect: */
     /* setParam and parParam are lists of integers (param IDs) */
-    List       *setParam;        /* initplan subqueries have to set these
-                                 * Params for parent plan */
+    List       *setParam;        /* initplan and MULTIEXPR subqueries have to
+                                 * set these Params for parent plan */
     List       *parParam;        /* indices of input Params from parent plan */
     List       *args;            /* exprs to pass as parParam values */
     /* Estimated execution costs: */
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 2d49e765de..e2a0dc80b2 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1713,6 +1713,116 @@ reset enable_seqscan;
 reset enable_indexscan;
 reset enable_bitmapscan;
 --
+-- Check handling of MULTIEXPR SubPlans in inherited updates
+--
+create table inhpar(f1 int, f2 name);
+create table inhcld(f2 name, f1 int);
+alter table inhcld inherit inhpar;
+insert into inhpar select x, x::text from generate_series(1,5) x;
+insert into inhcld select x::text, x from generate_series(6,10) x;
+explain (verbose, costs off)
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+                               QUERY PLAN
+-------------------------------------------------------------------------
+ Update on public.inhpar i
+   Update on public.inhpar i_1
+   Update on public.inhcld i_2
+   ->  Result
+         Output: $2, $3, (SubPlan 1 (returns $2,$3)), i.tableoid, i.ctid
+         ->  Append
+               ->  Seq Scan on public.inhpar i_1
+                     Output: i_1.f1, i_1.f2, i_1.tableoid, i_1.ctid
+               ->  Seq Scan on public.inhcld i_2
+                     Output: i_2.f1, i_2.f2, i_2.tableoid, i_2.ctid
+         SubPlan 1 (returns $2,$3)
+           ->  Limit
+                 Output: (i.f1), (((i.f2)::text || '-'::text))
+                 ->  Seq Scan on public.int4_tbl
+                       Output: i.f1, ((i.f2)::text || '-'::text)
+(15 rows)
+
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+select * from inhpar;
+ f1 | f2
+----+-----
+  1 | 1-
+  2 | 2-
+  3 | 3-
+  4 | 4-
+  5 | 5-
+  6 | 6-
+  7 | 7-
+  8 | 8-
+  9 | 9-
+ 10 | 10-
+(10 rows)
+
+drop table inhpar cascade;
+NOTICE:  drop cascades to table inhcld
+--
+-- And the same for partitioned cases
+--
+create table inhpar(f1 int primary key, f2 name) partition by range (f1);
+create table inhcld1(f2 name, f1 int primary key);
+create table inhcld2(f1 int primary key, f2 name);
+alter table inhpar attach partition inhcld1 for values from (1) to (5);
+alter table inhpar attach partition inhcld2 for values from (5) to (100);
+insert into inhpar select x, x::text from generate_series(1,10) x;
+explain (verbose, costs off)
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+                                    QUERY PLAN
+-----------------------------------------------------------------------------------
+ Update on public.inhpar i
+   Update on public.inhcld1 i_1
+   Update on public.inhcld2 i_2
+   ->  Append
+         ->  Seq Scan on public.inhcld1 i_1
+               Output: $2, $3, (SubPlan 1 (returns $2,$3)), i_1.tableoid, i_1.ctid
+               SubPlan 1 (returns $2,$3)
+                 ->  Limit
+                       Output: (i_1.f1), (((i_1.f2)::text || '-'::text))
+                       ->  Seq Scan on public.int4_tbl
+                             Output: i_1.f1, ((i_1.f2)::text || '-'::text)
+         ->  Seq Scan on public.inhcld2 i_2
+               Output: $2, $3, (SubPlan 1 (returns $2,$3)), i_2.tableoid, i_2.ctid
+(13 rows)
+
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+select * from inhpar;
+ f1 | f2
+----+-----
+  1 | 1-
+  2 | 2-
+  3 | 3-
+  4 | 4-
+  5 | 5-
+  6 | 6-
+  7 | 7-
+  8 | 8-
+  9 | 9-
+ 10 | 10-
+(10 rows)
+
+-- Also check ON CONFLICT
+insert into inhpar as i values (3), (7) on conflict (f1)
+  do update set (f1, f2) = (select i.f1, i.f2 || '+');
+select * from inhpar order by f1;  -- tuple order might be unstable here
+ f1 | f2
+----+-----
+  1 | 1-
+  2 | 2-
+  3 | 3-+
+  4 | 4-
+  5 | 5-
+  6 | 6-
+  7 | 7-+
+  8 | 8-
+  9 | 9-
+ 10 | 10-
+(10 rows)
+
+drop table inhpar cascade;
+--
 -- Check handling of a constant-null CHECK constraint
 --
 create table cnullparent (f1 int);
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 195aedb5ff..5db6dbc191 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -629,6 +629,44 @@ reset enable_seqscan;
 reset enable_indexscan;
 reset enable_bitmapscan;

+--
+-- Check handling of MULTIEXPR SubPlans in inherited updates
+--
+create table inhpar(f1 int, f2 name);
+create table inhcld(f2 name, f1 int);
+alter table inhcld inherit inhpar;
+insert into inhpar select x, x::text from generate_series(1,5) x;
+insert into inhcld select x::text, x from generate_series(6,10) x;
+
+explain (verbose, costs off)
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+select * from inhpar;
+
+drop table inhpar cascade;
+
+--
+-- And the same for partitioned cases
+--
+create table inhpar(f1 int primary key, f2 name) partition by range (f1);
+create table inhcld1(f2 name, f1 int primary key);
+create table inhcld2(f1 int primary key, f2 name);
+alter table inhpar attach partition inhcld1 for values from (1) to (5);
+alter table inhpar attach partition inhcld2 for values from (5) to (100);
+insert into inhpar select x, x::text from generate_series(1,10) x;
+
+explain (verbose, costs off)
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
+select * from inhpar;
+
+-- Also check ON CONFLICT
+insert into inhpar as i values (3), (7) on conflict (f1)
+  do update set (f1, f2) = (select i.f1, i.f2 || '+');
+select * from inhpar order by f1;  -- tuple order might be unstable here
+
+drop table inhpar cascade;
+
 --
 -- Check handling of a constant-null CHECK constraint
 --

Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-24 17:12:43 -0500, Tom Lane wrote:
> OK, so this worked out quite well ... it's only about 50 net new lines
> of code, and there's no data structure changes outside the contents of
> compiled expressions, so no reason to fear ABI problems.

Nice.


> I did the renaming you had comments suggesting, but perhaps you want
> to bikeshed those names?

Not really. I guess it'd be mildly nicer to have Expr somewhere in the name,
but whatever.


> @@ -677,20 +680,8 @@ ExecBuildUpdateProjection(List *targetList,
>      }
>  
>      /*
> -     * If we're evaluating the tlist, must evaluate any resjunk columns too.
> -     * (This matters for things like MULTIEXPR_SUBLINK SubPlans.)
> +     * We don't bother evaluating any tlist entries that are marked resjunk.
>       */
> -    if (evalTargetList)
> -    {
> -        for_each_cell(lc, targetList, lc)
> -        {
> -            TargetEntry *tle = lfirst_node(TargetEntry, lc);
> -
> -            Assert(tle->resjunk);
> -            ExecInitExprRec(tle->expr, state,
> -                            &state->resvalue, &state->resnull);
> -        }
> -    }
>  
>      /*
>       * Now generate code to copy over any old columns that were not assigned

The "don't bother" comment looks a bit lonely now :)

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2023-02-24 17:12:43 -0500, Tom Lane wrote:
>> I did the renaming you had comments suggesting, but perhaps you want
>> to bikeshed those names?

> Not really. I guess it'd be mildly nicer to have Expr somewhere in the name,
> but whatever.

Maybe Exec{Create|Push}ExprSetupSteps?

> The "don't bother" comment looks a bit lonely now :)

I figured it was important to leave a note that that was intentional.
Doesn't have to look exactly like that, though.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-24 18:49:10 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-02-24 17:12:43 -0500, Tom Lane wrote:
> >> I did the renaming you had comments suggesting, but perhaps you want
> >> to bikeshed those names?
> 
> > Not really. I guess it'd be mildly nicer to have Expr somewhere in the name,
> > but whatever.
> 
> Maybe Exec{Create|Push}ExprSetupSteps?

WFM.

> > The "don't bother" comment looks a bit lonely now :)
> 
> I figured it was important to leave a note that that was intentional.
> Doesn't have to look exactly like that, though.

I'd just move it to the the loop evaluating the non resjunk columns.

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2023-02-24 18:49:10 -0500, Tom Lane wrote:
>> Maybe Exec{Create|Push}ExprSetupSteps?

> WFM.

>> I figured it was important to leave a note that that was intentional.
>> Doesn't have to look exactly like that, though.

> I'd just move it to the the loop evaluating the non resjunk columns.

Done like that.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-25 14:45:33 -0500, Tom Lane wrote:
> Done like that.

Thanks!


Your commit message referenced commit 3f7323cbb, which contains:

    That technique is borrowed from the far older code that supports
    initplans, and it works okay in that case because the cloned SubPlan nodes
    are essentially identical.  So it doesn't matter which one of the clones
    the shared ParamExecData.execPlan field might point to.

Out of curiosity: Are there cases where we actually overwrite execPlan for
initplans? I couldn't find any with a quick assertion. ISTM that that largely
should be prevented by initplans being initialized once, in ExecInitNode(). Do
we have cases where the same initplan (with the same paramids, obviously), is
used by multiple nodes?

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> Your commit message referenced commit 3f7323cbb, which contains:

>     That technique is borrowed from the far older code that supports
>     initplans, and it works okay in that case because the cloned SubPlan nodes
>     are essentially identical.  So it doesn't matter which one of the clones
>     the shared ParamExecData.execPlan field might point to.

Yeah, I now think that was a bit of a misstatement; not about the bug,
but about what happens with initplans.

> Out of curiosity: Are there cases where we actually overwrite execPlan for
> initplans? I couldn't find any with a quick assertion. ISTM that that largely
> should be prevented by initplans being initialized once, in ExecInitNode().

True.  Even if an initplan is referenced in multiple places, it will be
attached to just one plan node's initPlan list, so there should be only
one time that execPlan gets set (per execution cycle of that node), and
I think only one SubPlanState that it could point to.  The references
aren't SubPlan nodes, just Params.

            regards, tom lane



Re: BUG #17800: ON CONFLICT DO UPDATE fails to detect incompatible fields that leads to a server crash

От
Andres Freund
Дата:
Hi,

On 2023-02-24 09:44:12 -0800, Andres Freund wrote:
> I'd happy if you had a go at it.  I might take a stab at moving the the
> argument evaluation inline, after this goes in.

Turned out to be pretty simple (at least for now, we'll see what I missed):
https://www.postgresql.org/message-id/20230225214401.346ancgjqc3zmvek%40awork3.anarazel.de

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2023-02-24 09:44:12 -0800, Andres Freund wrote:
>> I'd happy if you had a go at it.  I might take a stab at moving the the
>> argument evaluation inline, after this goes in.

> Turned out to be pretty simple (at least for now, we'll see what I missed):
> https://www.postgresql.org/message-id/20230225214401.346ancgjqc3zmvek%40awork3.anarazel.de

Looks plausible in a very quick once-over.  If you want to add it to
the upcoming CF, I'll do a more careful review later (or somebody
else can).

            regards, tom lane