Обсуждение: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized

Поиск
Список
Период
Сортировка

v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized

От
Justin Pryzby
Дата:
Reduced from sqlsmith, this query fails under debug_parallel_query=1.

The elog was added at: 55416b26a98fcf354af88cdd27fc2e045453b68a
But (I'm not sure) the faulty commit may be 8edd0e7946 (Suppress Append
and MergeAppend plan nodes that have a single child).

postgres=# SET force_parallel_mode =1; CREATE TABLE x (i int) PARTITION BY RANGE (i); CREATE TABLE x1 PARTITION OF x
DEFAULT;
 
 select * from pg_class,
  lateral (select pg_catalog.bit_and(1)
      from pg_class as sample_1
      where case when EXISTS (
            select 1 from x
              where EXISTS (
                select 1 from pg_catalog.pg_depend
                  where (sample_1.reltuples is NULL)
                  )) then 1 end
           is NULL)x
where false;

-- 
Justin



Re: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized

От
Tom Lane
Дата:
Justin Pryzby <pryzby@telsasoft.com> writes:
> postgres=# SET force_parallel_mode =1; CREATE TABLE x (i int) PARTITION BY RANGE (i); CREATE TABLE x1 PARTITION OF x
DEFAULT; 
>  select * from pg_class,
>   lateral (select pg_catalog.bit_and(1)
>       from pg_class as sample_1
>       where case when EXISTS (
>             select 1 from x
>               where EXISTS (
>                 select 1 from pg_catalog.pg_depend
>                   where (sample_1.reltuples is NULL)
>                   )) then 1 end
>            is NULL)x
> where false;

Interesting.  The proximate cause is that we end up with a subplan
that is marked parallel_safe, but it has an initplan that is not
parallel_safe.  The parallel worker receives, and tries to initialize,
the parallel_safe subplan, and falls over because of its reference
to the unsafe subplan -- which was not transmitted to the worker.

Actually, because of the policy installed by commit ab77a5a45, the
mere fact of having an initplan should be enough to disqualify
the first subplan from being marked parallel-safe.

I dug around and found the culprit: setrefs.c's
clean_up_removed_plan_level() moves initplans down from a parent
to a child plan node, but it forgot the possibility that the
child plan node had been marked parallel_safe before that and
must not be anymore.

The v1 patch attached is enough to fix the immediate issue,
but there's another thing not to like, which is that we're also
discarding the costs associated with the initplans.  That's
strictly cosmetic given that all the planning decisions are
already made, but it still seems potentially annoying if you're
trying to understand EXPLAIN output.  So I'm inclined to instead
do something like v2 attached, which deals with that as well.
(On the other hand, we aren't bothering to fix up costs when
we move initplans around in materialize_finished_plan or
standard_planner ... so maybe that should be left for a patch
that fixes those things too.)

Another thing worth wondering about is whether we can't loosen
commit ab77a5a45's policy that having an initplan is enough
to make you parallel-unsafe.  In the wake of later fixes,
notably 5e6d8d2bb, it seems like maybe we could allow that
as long as the initplans themselves are parallel-safe.  That
wouldn't be material for back-patching though, so I'll worry
about it later.

Not sure what if anything to do about a test case.  I'm not
excited about memorializing the specific case found by sqlsmith,
because it seems only very accidental that it exposes this
problem.  I found that there are existing regression tests
that exercise the situation where clean_up_removed_plan_level
generates an incorrectly-marked plan, but there is accidentally
no bad effect.  (The planner itself isn't going to be making
any further decisions with the bogus info; it's only
ExecSerializePlan that pays attention to the flag, and we'd only
notice in this specific cross-reference situation.)  Also, any
change we make along the lines speculated about in the previous
para would be highly likely to break a test case, in the sense
that it'd no longer exercise the previously-failing scenario.
So on the whole I'm inclined not to bother with a new test case.

            regards, tom lane

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b56666398e..042036b11b 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1546,6 +1546,10 @@ clean_up_removed_plan_level(Plan *parent, Plan *child)
     child->initPlan = list_concat(parent->initPlan,
                                   child->initPlan);

+    /* If we moved down any initplans, child is no longer parallel-safe */
+    if (child->initPlan)
+        child->parallel_safe = false;
+
     /*
      * We also have to transfer the parent's column labeling info into the
      * child, else columns sent to client will be improperly labeled if this
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b56666398e..9a7c8ea07b 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1542,7 +1542,31 @@ trivial_subqueryscan(SubqueryScan *plan)
 static Plan *
 clean_up_removed_plan_level(Plan *parent, Plan *child)
 {
-    /* We have to be sure we don't lose any initplans */
+    ListCell   *lc;
+
+    /*
+     * We have to be sure we don't lose any initplans, so move any that were
+     * attached to the parent plan to the child.  If we do move any, the child
+     * is no longer parallel-safe.  As a cosmetic matter, also add the
+     * initplans' run costs to the child's costs (compare
+     * SS_charge_for_initplans).
+     */
+    foreach(lc, parent->initPlan)
+    {
+        SubPlan    *initsubplan = (SubPlan *) lfirst(lc);
+        Cost        initplan_cost;
+
+        child->parallel_safe = false;
+        initplan_cost = initsubplan->startup_cost + initsubplan->per_call_cost;
+        child->startup_cost += initplan_cost;
+        child->total_cost += initplan_cost;
+    }
+
+    /*
+     * Attach plans this way so that parent's initplans are processed before
+     * any pre-existing initplans of the child.  Probably doesn't matter, but
+     * let's preserve the ordering just in case.
+     */
     child->initPlan = list_concat(parent->initPlan,
                                   child->initPlan);


Re: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized

От
Richard Guo
Дата:

On Wed, Apr 12, 2023 at 3:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The v1 patch attached is enough to fix the immediate issue,
but there's another thing not to like, which is that we're also
discarding the costs associated with the initplans.  That's
strictly cosmetic given that all the planning decisions are
already made, but it still seems potentially annoying if you're
trying to understand EXPLAIN output.  So I'm inclined to instead
do something like v2 attached, which deals with that as well.
(On the other hand, we aren't bothering to fix up costs when
we move initplans around in materialize_finished_plan or
standard_planner ... so maybe that should be left for a patch
that fixes those things too.)

+1 to the v2 patch.

* Should we likewise set the parallel_safe flag for topmost plan in
SS_attach_initplans?

* In standard_planner around line 443, we move any initPlans from
top_plan to the new added Gather node.  But since we know that the
top_plan is parallel_safe here, shouldn't it have no initPlans?

Thanks
Richard 

Re: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Apr 12, 2023 at 3:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The v1 patch attached is enough to fix the immediate issue,
>> but there's another thing not to like, which is that we're also
>> discarding the costs associated with the initplans.  That's
>> strictly cosmetic given that all the planning decisions are
>> already made, but it still seems potentially annoying if you're
>> trying to understand EXPLAIN output.  So I'm inclined to instead
>> do something like v2 attached, which deals with that as well.
>> (On the other hand, we aren't bothering to fix up costs when
>> we move initplans around in materialize_finished_plan or
>> standard_planner ... so maybe that should be left for a patch
>> that fixes those things too.)

> +1 to the v2 patch.

Thanks for looking at this.  After sleeping on it, I'm inclined
to use the v1 patch in the back branches and do the cost fixups
only in HEAD.

> * Should we likewise set the parallel_safe flag for topmost plan in
> SS_attach_initplans?

SS_attach_initplans is assuming that costs and parallel safety
already got dealt with, either by SS_charge_for_initplans or by
equivalent processing during create_plan.  I did have an Assert
there about parallel_safe already being off in a draft version
of this patch, but dropped it after realizing that it'd have to
go away anyway when we fix things to allow parallel-safe initplans.
(I have a draft patch for that that I'll post separately.)

We could improve the comment for SS_attach_initplans to explicitly
mention parallel safety, though.  Also, I'd better go look at the
create_plan code paths to make sure they are indeed accounting
for this.

> * In standard_planner around line 443, we move any initPlans from
> top_plan to the new added Gather node.  But since we know that the
> top_plan is parallel_safe here, shouldn't it have no initPlans?

Right.  Again, I did have such an Assert there in a draft version,
then decided it wasn't useful to change temporarily.  However, the
follow-on patch removes that stanza altogether, and I suppose it
might as well remove an Assert as what's there now.  I'll make it so.

            regards, tom lane



Re: v12: ERROR: subplan "InitPlan 2 (returns $4)" was not initialized

От
David Rowley
Дата:
On Thu, 13 Apr 2023 at 00:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thanks for looking at this.  After sleeping on it, I'm inclined
> to use the v1 patch in the back branches and do the cost fixups
> only in HEAD.

I'm also fine with v1 for the back branches.

David