Re: Fix bogus Asserts in calc_non_nestloop_required_outer

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Fix bogus Asserts in calc_non_nestloop_required_outer
Дата
Msg-id 2060574.1704753544@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Fix bogus Asserts in calc_non_nestloop_required_outer  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Fix bogus Asserts in calc_non_nestloop_required_outer  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jan 6, 2024 at 4:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The argument for the patch as proposed is that we should make the
>> mergejoin and hashjoin code paths do what the nestloop path is doing.
>> However, as I replied further down in that other thread, I'm not
>> exactly convinced that the nestloop path is doing the right thing.
>> I've not had time to look closer at that, though.

> I don't really understand what you were saying in your response there,
> or what you're saying here. It seems to me that if the path is
> parameterized by top relids, and the assertion is verifying that a
> certain set of non-toprelids i.e. otherrels isn't present, then
> obviously the assertion is going to pass, but it's not checking what
> it purports to be checking.

I think we're talking at cross-purposes.  What I was wondering about
(at least further down in the thread) was whether we shouldn't be
checking *both* the "real" and the "parent" relids to make sure they
don't overlap the parameterization sets.  But it's probably overkill.
After reading the code again I agree that the parent relids are more
useful to check.

However, I still don't like Richard's patch too much as-is, because
the Asserts are difficult to read/understand and even more difficult
to compare to the other code path.  I think we want to keep the
nestloop and not-nestloop paths as visually similar as possible,
so I propose we do it more like the attached (where I also borrowed
some of your wording for the comments).

A variant of this could be to ifdef out all the added code in
non-Assert builds:

    Relids        outer_paramrels = PATH_REQ_OUTER(outer_path);
    Relids        inner_paramrels = PATH_REQ_OUTER(inner_path);
    Relids        required_outer;
#ifdef USE_ASSERT_CHECKING
    Relids        innerrelids;
    Relids        outerrelids;

    /*
     * Any parameterization of the input paths refers to topmost parents of
     * the relevant relations, because reparameterize_path_by_child() hasn't
     * been called yet.  So we must consider topmost parents of the relations
     * being joined, too, while checking for disallowed parameterization
     * cases.
     */
    if (inner_path->parent->top_parent_relids)
        innerrelids = inner_path->parent->top_parent_relids;
    else
        innerrelids = inner_path->parent->relids;

    if (outer_path->parent->top_parent_relids)
        outerrelids = outer_path->parent->top_parent_relids;
    else
        outerrelids = outer_path->parent->relids;

    /* neither path can require rels from the other */
    Assert(!bms_overlap(outer_paramrels, innerrelids));
    Assert(!bms_overlap(inner_paramrels, outerrelids));
#endif
    /* form the union ... */

but I'm not sure that's better.  Probably any reasonable compiler
will throw away the whole calculation upon noticing the outputs
are unused.

            regards, tom lane

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 3fd5a24fad..e9def9d540 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -730,8 +730,11 @@ try_nestloop_path(PlannerInfo *root,
         return;

     /*
-     * Paths are parameterized by top-level parents, so run parameterization
-     * tests on the parent relids.
+     * Any parameterization of the input paths refers to topmost parents of
+     * the relevant relations, because reparameterize_path_by_child() hasn't
+     * been called yet.  So we must consider topmost parents of the relations
+     * being joined, too, while determining parameterization of the result and
+     * checking for disallowed parameterization cases.
      */
     if (innerrel->top_parent_relids)
         innerrelids = innerrel->top_parent_relids;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index f7ac71087a..8dbf790e89 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2360,6 +2360,9 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
  * calc_nestloop_required_outer
  *      Compute the required_outer set for a nestloop join path
  *
+ * Note: when considering a child join, the inputs nonetheless use top-level
+ * parent relids
+ *
  * Note: result must not share storage with either input
  */
 Relids
@@ -2394,11 +2397,30 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
 {
     Relids        outer_paramrels = PATH_REQ_OUTER(outer_path);
     Relids        inner_paramrels = PATH_REQ_OUTER(inner_path);
+    Relids        innerrelids PG_USED_FOR_ASSERTS_ONLY;
+    Relids        outerrelids PG_USED_FOR_ASSERTS_ONLY;
     Relids        required_outer;

+    /*
+     * Any parameterization of the input paths refers to topmost parents of
+     * the relevant relations, because reparameterize_path_by_child() hasn't
+     * been called yet.  So we must consider topmost parents of the relations
+     * being joined, too, while checking for disallowed parameterization
+     * cases.
+     */
+    if (inner_path->parent->top_parent_relids)
+        innerrelids = inner_path->parent->top_parent_relids;
+    else
+        innerrelids = inner_path->parent->relids;
+
+    if (outer_path->parent->top_parent_relids)
+        outerrelids = outer_path->parent->top_parent_relids;
+    else
+        outerrelids = outer_path->parent->relids;
+
     /* neither path can require rels from the other */
-    Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
-    Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));
+    Assert(!bms_overlap(outer_paramrels, innerrelids));
+    Assert(!bms_overlap(inner_paramrels, outerrelids));
     /* form the union ... */
     required_outer = bms_union(outer_paramrels, inner_paramrels);
     /* we do not need an explicit test for empty; bms_union gets it right */

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: Removing unneeded self joins
Следующее
От: Jelte Fennema-Nio
Дата:
Сообщение: Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs