Re: BUG #18429: Inconsistent results on similar queries with join lateral

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #18429: Inconsistent results on similar queries with join lateral
Дата
Msg-id 265068.1713049119@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BUG #18429: Inconsistent results on similar queries with join lateral  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: BUG #18429: Inconsistent results on similar queries with join lateral  (Richard Guo <guofenglinux@gmail.com>)
Список pgsql-bugs
I wrote:
> It looks like the problem is that the old join_clause_is_movable
> logic is incorrectly deciding that the WHERE condition can be
> pushed down to the sub-select relation.

Nope, that's not accurate: I soon found that the
join_clause_is_movable functions aren't being invoked at all for the
troublesome clause.  It turns out that that clause is generated by
generate_join_implied_equalities (after having been, rather uselessly,
decomposed into an EquivalenceClass), and the reason it gets into the
scan-level condition is that get_baserel_parampathinfo puts it there
without any movability checking.  That happens because
get_baserel_parampathinfo believes this:

     * Add in joinclauses generated by EquivalenceClasses, too.  (These
     * necessarily satisfy join_clause_is_movable_into.)

But this clause *doesn't* satisfy that function; in the back branches
it'll fail the test against nullable_relids.  So I said to myself,
great, we can fix this by filtering the output clauses with
join_clause_is_movable_into.  That takes care of the submitted bug
all right, but it turns out that it also causes some existing
regression test cases to give wrong answers.  And that is because
of a different problem: some clauses generated by
generate_join_implied_equalities don't satisfy the "Clause must
physically reference at least one target rel" heuristic in
join_clause_is_movable_into.  Specifically that fails if we have a
clause generated for a child appendrel member (ie, one arm of a
flattened UNION ALL construct) whose EC member expression is Var-free.

So this seems like a bit of a mess.  We can fix the submitted bug with
the kluge of only testing the nullable_relids condition, as I've done
in the attached patch for v15.  (As join_clause_is_movable_into says,
this condition is conservative and might sometimes reject a clause
that could be evaluated safely, but that's fine for
get_baserel_parampathinfo's purposes.)  But I'm worried about the
physically-reference business, because this function isn't the only
one that assumes that all output from generate_join_implied_equalities
will satisfy join_clause_is_movable_into: there are other functions
that actually assert that.  How come we've not seen assertion
failures, or reports of wrong query results similar to this one?
I can believe that the constant-EC-member situation is impossible
when considering a join as inner_rel, so get_joinrel_parampathinfo's
first usage is probably safe.  But it sure looks like the
generate_join_implied_equalities_for_ecs usage is not.  I wonder
if we can replace the test on clause_relids with something a bit
more reliable, or skip it when considering EC-derived clauses.

Anyway, if we go no further than this then we need the first
attached patch in <= v15, while in v16/HEAD we should at least
modify the comment to reflect reality, as I've done in the
second patch.

Thoughts?  Can anybody devise a test case that triggers the
Asserts I mentioned?

            regards, tom lane

diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 3c75fd56f2..0b130c771f 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -1333,14 +1333,25 @@ get_baserel_parampathinfo(PlannerInfo *root, RelOptInfo *baserel,
     }

     /*
-     * Add in joinclauses generated by EquivalenceClasses, too.  (These
-     * necessarily satisfy join_clause_is_movable_into.)
+     * Add in joinclauses generated by EquivalenceClasses, too.  We do not
+     * check these against join_clause_is_movable_into, because its heuristic
+     * about physically referencing the target rel can fail in certain cases
+     * involving child rels whose EC member is a constant.  However, we must
+     * check its condition about the target rel not being nullable below the
+     * clause.  (The other conditions it checks should be true by construction
+     * for an EC-derived clause.)
      */
-    pclauses = list_concat(pclauses,
-                           generate_join_implied_equalities(root,
-                                                            joinrelids,
-                                                            required_outer,
-                                                            baserel));
+    foreach(lc, generate_join_implied_equalities(root,
+                                                 joinrelids,
+                                                 required_outer,
+                                                 baserel))
+    {
+        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+
+        /* Target rel must not be nullable below the clause */
+        if (!bms_overlap(baserel->relids, rinfo->nullable_relids))
+            pclauses = lappend(pclauses, rinfo);
+    }

     /* Estimate the number of rows returned by the parameterized scan */
     rows = get_parameterized_baserel_size(root, baserel, pclauses);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 867c6d20cc..b356153900 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -5949,6 +5949,37 @@ select * from
  3 | 3
 (6 rows)

+-- check for generation of join EC conditions at wrong level (bug #18429)
+explain (costs off)
+select * from (
+  select arrayd.ad, coalesce(c.hundred, 0) as h
+  from unnest(array[1]) as arrayd(ad)
+  left join lateral (
+    select hundred from tenk1 where unique2 = arrayd.ad
+  ) c on true
+) c2
+where c2.h * c2.ad = c2.h * (c2.ad + 1);
+                                              QUERY PLAN
+-------------------------------------------------------------------------------------------------------
+ Nested Loop Left Join
+   Filter: ((COALESCE(tenk1.hundred, 0) * arrayd.ad) = (COALESCE(tenk1.hundred, 0) * (arrayd.ad + 1)))
+   ->  Function Scan on unnest arrayd
+   ->  Index Scan using tenk1_unique2 on tenk1
+         Index Cond: (unique2 = arrayd.ad)
+(5 rows)
+
+select * from (
+  select arrayd.ad, coalesce(c.hundred, 0) as h
+  from unnest(array[1]) as arrayd(ad)
+  left join lateral (
+    select hundred from tenk1 where unique2 = arrayd.ad
+  ) c on true
+) c2
+where c2.h * c2.ad = c2.h * (c2.ad + 1);
+ ad | h
+----+---
+(0 rows)
+
 -- check the number of columns specified
 SELECT * FROM (int8_tbl i cross join int4_tbl j) ss(a,b,c,d);
 ERROR:  join expression "ss" has 3 columns available but 4 columns specified
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 1113e98445..11a857041a 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -2029,6 +2029,25 @@ select * from
    (select q1.v)
   ) as q2;

+-- check for generation of join EC conditions at wrong level (bug #18429)
+explain (costs off)
+select * from (
+  select arrayd.ad, coalesce(c.hundred, 0) as h
+  from unnest(array[1]) as arrayd(ad)
+  left join lateral (
+    select hundred from tenk1 where unique2 = arrayd.ad
+  ) c on true
+) c2
+where c2.h * c2.ad = c2.h * (c2.ad + 1);
+select * from (
+  select arrayd.ad, coalesce(c.hundred, 0) as h
+  from unnest(array[1]) as arrayd(ad)
+  left join lateral (
+    select hundred from tenk1 where unique2 = arrayd.ad
+  ) c on true
+) c2
+where c2.h * c2.ad = c2.h * (c2.ad + 1);
+
 -- check the number of columns specified
 SELECT * FROM (int8_tbl i cross join int4_tbl j) ss(a,b,c,d);

diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 0c125e42e8..7da2f913ce 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -1594,8 +1594,11 @@ get_baserel_parampathinfo(PlannerInfo *root, RelOptInfo *baserel,
     }

     /*
-     * Add in joinclauses generated by EquivalenceClasses, too.  (These
-     * necessarily satisfy join_clause_is_movable_into.)
+     * Add in joinclauses generated by EquivalenceClasses, too.  We do not
+     * check these against join_clause_is_movable_into, because its heuristic
+     * about physically referencing the target rel can fail in certain cases
+     * involving child rels whose EC member is a constant.  We must trust the
+     * EC mechanism to not produce joinclauses that don't belong here.
      */
     pclauses = list_concat(pclauses,
                            generate_join_implied_equalities(root,

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: FSM Corruption (was: Could not read block at end of the relation)
Следующее
От: PG Bug reporting form
Дата:
Сообщение: BUG #18432: Polymorphic, table-returning PL/pgSQL function fails with an erroneous "schema mismatch" error