Bogus lateral-reference-propagation logic in create_lateral_join_info

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Bogus lateral-reference-propagation logic in create_lateral_join_info
Дата
Msg-id 3951.1549403812@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Bogus lateral-reference-propagation logic in create_lateral_join_info  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
While poking at bug #15613 (in which FDWs are failing to mark created
Paths with correct outer-reference sets), I thought it'd be a good idea
to add some asserts to Path creation saying that every Path should be
parameterized by at least whatever the relation's required LATERAL
references are.  I did that as per the added assertions in relnode.c
below.  I didn't expect any failures in the existing regression tests,
since we know those don't exercise bug #15613, but darn if the addition
to get_appendrel_parampathinfo didn't blow up in the core tests.

A bit of excavation later, it turns out that this is a bug since day 1
in create_lateral_join_info.  It needs to propagate lateral_relids
and related fields from appendrel parents to children, but it was not,
in the original code, accounting for the possibility of grandchildren.
Those need to get the lateral fields propagated from their topmost
ancestor too, but they weren't.  This leads to having an intermediate
appendrel that is marked with some lateral_relids when its children
are not, causing add_paths_to_append_rel to believe that unparameterized
paths can be built, triggering the new assertion.  I'm not sure if there
are any worse consequences; the regression test case that's triggering
the Assert seems to work otherwise, so we might be accidentally failing
to fail.  But it's not supposed to be like that.

Commit 0a480502b hacked this code up to deal with grandchildren for
the case of partitioned tables, but the regression test that's falling
over involves nested UNION ALL subqueries, which are not that.  Rather
than add another RTE-kind exception, though, I think we ought to rewrite
it completely and get rid of the nested loops in favor of one traversal
of the append_rel_list.  This does require assuming that the
append_rel_list has ancestor entries before descendant entries, but
that's okay because of the way the list is built.  (I note that 0a480502b
is effectively assuming that ancestors appear before children in the RTE
list, which is no safer an assumption.)

Also, I'd really like to know why I had to put in the exception seen
in the loop for AppendRelInfos that do not point to a valid parent.
It seems to me that that is almost certainly working around a bug in
the partitioning logic.  (Without it, the partition_prune regression test
crashes.)  Or would somebody like to own up to having created that state
of affairs intentionally?  If so why?

Anyway, I propose to commit and back-patch the initsplan.c part of the
attached.  The added asserts in relnode.c should probably not go in
until we have a bug #15613 fix that will prevent postgres_fdw from
triggering them, so I'll do that later.

            regards, tom lane

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index d6ffa78..22a6da3 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** void
*** 419,424 ****
--- 419,425 ----
  create_lateral_join_info(PlannerInfo *root)
  {
      bool        found_laterals = false;
+     Relids        prev_parents PG_USED_FOR_ASSERTS_ONLY = NULL;
      Index        rti;
      ListCell   *lc;

*************** create_lateral_join_info(PlannerInfo *ro
*** 627,679 ****
       * every child anyway, and there's no value in forcing extra
       * reparameterize_path() calls.  Similarly, a lateral reference to the
       * parent prevents use of otherwise-movable join rels for each child.
       */
!     for (rti = 1; rti < root->simple_rel_array_size; rti++)
      {
!         RelOptInfo *brel = root->simple_rel_array[rti];
!         RangeTblEntry *brte = root->simple_rte_array[rti];
!
!         /*
!          * Skip empty slots. Also skip non-simple relations i.e. dead
!          * relations.
!          */
!         if (brel == NULL || !IS_SIMPLE_REL(brel))
!             continue;

          /*
!          * In the case of table inheritance, the parent RTE is directly linked
!          * to every child table via an AppendRelInfo.  In the case of table
!          * partitioning, the inheritance hierarchy is expanded one level at a
!          * time rather than flattened.  Therefore, an other member rel that is
!          * a partitioned table may have children of its own, and must
!          * therefore be marked with the appropriate lateral info so that those
!          * children eventually get marked also.
           */
!         Assert(brte);
!         if (brel->reloptkind == RELOPT_OTHER_MEMBER_REL &&
!             (brte->rtekind != RTE_RELATION ||
!              brte->relkind != RELKIND_PARTITIONED_TABLE))
              continue;

!         if (brte->inh)
!         {
!             foreach(lc, root->append_rel_list)
!             {
!                 AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
!                 RelOptInfo *childrel;

!                 if (appinfo->parent_relid != rti)
!                     continue;
!                 childrel = root->simple_rel_array[appinfo->child_relid];
!                 Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
!                 Assert(childrel->direct_lateral_relids == NULL);
!                 childrel->direct_lateral_relids = brel->direct_lateral_relids;
!                 Assert(childrel->lateral_relids == NULL);
!                 childrel->lateral_relids = brel->lateral_relids;
!                 Assert(childrel->lateral_referencers == NULL);
!                 childrel->lateral_referencers = brel->lateral_referencers;
!             }
!         }
      }
  }

--- 628,667 ----
       * every child anyway, and there's no value in forcing extra
       * reparameterize_path() calls.  Similarly, a lateral reference to the
       * parent prevents use of otherwise-movable join rels for each child.
+      *
+      * It's possible for child rels to have their own children, in which case
+      * the topmost parent's lateral info must be propagated all the way down.
+      * This code handles that case correctly so long as append_rel_list has
+      * entries for child relationships before grandchild relationships, which
+      * is an okay assumption right now, but we'll need to be careful to
+      * preserve it.  The assertions below check for incorrect ordering.
       */
!     foreach(lc, root->append_rel_list)
      {
!         AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
!         RelOptInfo *parentrel = root->simple_rel_array[appinfo->parent_relid];
!         RelOptInfo *childrel = root->simple_rel_array[appinfo->child_relid];

          /*
!          * Apparently append_rel_list can contain bogus parent rels nowadays
           */
!         if (parentrel == NULL)
              continue;

!         /* Verify that children are processed before grandchildren */
! #ifdef USE_ASSERT_CHECKING
!         prev_parents = bms_add_member(prev_parents, appinfo->parent_relid);
!         Assert(!bms_is_member(appinfo->child_relid, prev_parents));
! #endif

!         /* OK, propagate info down */
!         Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
!         Assert(childrel->direct_lateral_relids == NULL);
!         childrel->direct_lateral_relids = parentrel->direct_lateral_relids;
!         Assert(childrel->lateral_relids == NULL);
!         childrel->lateral_relids = parentrel->lateral_relids;
!         Assert(childrel->lateral_referencers == NULL);
!         childrel->lateral_referencers = parentrel->lateral_referencers;
      }
  }

diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index f04c6b7..4130514 100644
*** a/src/backend/optimizer/util/relnode.c
--- b/src/backend/optimizer/util/relnode.c
*************** get_baserel_parampathinfo(PlannerInfo *r
*** 1225,1230 ****
--- 1225,1233 ----
      double        rows;
      ListCell   *lc;

+     /* If rel has LATERAL refs, every path for it should account for them */
+     Assert(bms_is_subset(baserel->lateral_relids, required_outer));
+
      /* Unparameterized paths have no ParamPathInfo */
      if (bms_is_empty(required_outer))
          return NULL;
*************** get_joinrel_parampathinfo(PlannerInfo *r
*** 1320,1325 ****
--- 1323,1331 ----
      double        rows;
      ListCell   *lc;

+     /* If rel has LATERAL refs, every path for it should account for them */
+     Assert(bms_is_subset(joinrel->lateral_relids, required_outer));
+
      /* Unparameterized paths have no ParamPathInfo or extra join clauses */
      if (bms_is_empty(required_outer))
          return NULL;
*************** get_appendrel_parampathinfo(RelOptInfo *
*** 1511,1516 ****
--- 1517,1525 ----
  {
      ParamPathInfo *ppi;

+     /* If rel has LATERAL refs, every path for it should account for them */
+     Assert(bms_is_subset(appendrel->lateral_relids, required_outer));
+
      /* Unparameterized paths have no ParamPathInfo */
      if (bms_is_empty(required_outer))
          return NULL;

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: Fixing findDependentObjects()'s dependency on scan order(regressions in DROP diagnostic messages)
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Protect syscache from bloating with negative cache entries