Обсуждение: Buglets in equivclass.c

Поиск
Список
Период
Сортировка

Buglets in equivclass.c

От
Tom Lane
Дата:
While hacking on a patch that touches equivclass.c, I came across
a couple of things that seemed wrong, and are fixed by the attached
proposed patch.

First, get_eclass_for_sort_expr() computes expr_relids and nullable_relids
too soon.  This is a waste of a not-really-trivial number of cycles in
the common cases where it finds an existing eclass or is told not to
make a new one.  More subtly, the bitmapsets are computed in the caller's
context.  If we do use them, they will be attached to an EquivalenceClass
that lives in the potentially-longer-lived root->planner_cxt, allowing
the EC's pointers to them to become dangling.  This would be a live bug
if get_eclass_for_sort_expr() could be called with create_it = true during
GEQO planning.  So far as I can find, it is not; but both its API spec and
its internal comments certainly give the impression that that's allowed.

Second, generate_join_implied_equalities() uses inner_rel->relids to
look up relevant eclasses, but given the surrounding code it seems like
it ought to be using nominal_inner_relids.  The code accidentally works
because a child RelOptInfo will always have exactly the same
eclass_indexes as its top parent; but if it did not, we'd risk either
missing some relevant eclasses or hitting the assertion that claims
that all the eclasses we find overlap nominal_join_relids.  (I noticed
this while speculating that maybe we needn't bother maintaining
eclass_indexes for child RelOptInfos.  This code is one place that
would fail if we didn't.)

I'm unsure whether to back-patch either of these.  They both seem to be
just latent bugs so far as the core code is concerned, but the first one
in particular seems like something that could bite extension code.
Thoughts?

            regards, tom lane

diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b68a5a0ec7..d3d826b790 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -634,12 +634,6 @@ get_eclass_for_sort_expr(PlannerInfo *root,
      */
     expr = canonicalize_ec_expression(expr, opcintype, collation);

-    /*
-     * Get the precise set of nullable relids appearing in the expression.
-     */
-    expr_relids = pull_varnos((Node *) expr);
-    nullable_relids = bms_intersect(nullable_relids, expr_relids);
-
     /*
      * Scan through the existing EquivalenceClasses for a match
      */
@@ -716,6 +710,12 @@ get_eclass_for_sort_expr(PlannerInfo *root,
     if (newec->ec_has_volatile && sortref == 0) /* should not happen */
         elog(ERROR, "volatile EquivalenceClass has no sortref");

+    /*
+     * Get the precise set of nullable relids appearing in the expression.
+     */
+    expr_relids = pull_varnos((Node *) expr);
+    nullable_relids = bms_intersect(nullable_relids, expr_relids);
+
     newem = add_eq_member(newec, copyObject(expr), expr_relids,
                           nullable_relids, false, opcintype);

@@ -1171,9 +1171,9 @@ generate_join_implied_equalities(PlannerInfo *root,
     }

     /*
-     * Get all eclasses in common between inner_rel's relids and outer_relids
+     * Get all eclasses that mention both inner and outer sides of the join
      */
-    matching_ecs = get_common_eclass_indexes(root, inner_rel->relids,
+    matching_ecs = get_common_eclass_indexes(root, nominal_inner_relids,
                                              outer_relids);

     i = -1;

Re: Buglets in equivclass.c

От
David Rowley
Дата:
On Mon, 5 Oct 2020 at 06:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> While hacking on a patch that touches equivclass.c, I came across
> a couple of things that seemed wrong, and are fixed by the attached
> proposed patch.
>
> First, get_eclass_for_sort_expr() computes expr_relids and nullable_relids
> too soon.  This is a waste of a not-really-trivial number of cycles in
> the common cases where it finds an existing eclass or is told not to
> make a new one.  More subtly, the bitmapsets are computed in the caller's
> context.  If we do use them, they will be attached to an EquivalenceClass
> that lives in the potentially-longer-lived root->planner_cxt, allowing
> the EC's pointers to them to become dangling.  This would be a live bug
> if get_eclass_for_sort_expr() could be called with create_it = true during
> GEQO planning.  So far as I can find, it is not; but both its API spec and
> its internal comments certainly give the impression that that's allowed.

hmm, yeah, it seems fairly low-risk to be moving that down to after we
switch the memory context. Seems like the sort of thing that could
easily become an actual bug one day.

> Second, generate_join_implied_equalities() uses inner_rel->relids to
> look up relevant eclasses, but given the surrounding code it seems like
> it ought to be using nominal_inner_relids.  The code accidentally works
> because a child RelOptInfo will always have exactly the same
> eclass_indexes as its top parent; but if it did not, we'd risk either
> missing some relevant eclasses or hitting the assertion that claims
> that all the eclasses we find overlap nominal_join_relids.  (I noticed
> this while speculating that maybe we needn't bother maintaining
> eclass_indexes for child RelOptInfos.  This code is one place that
> would fail if we didn't.)

Oops. That certainly should be using nominal_inner_relids rather than
innerrel->relids. I agree it's not a live bug since child ECs are just
a copy of their parents, currently.

> I'm unsure whether to back-patch either of these.  They both seem to be
> just latent bugs so far as the core code is concerned, but the first one
> in particular seems like something that could bite extension code.
> Thoughts?

That's a good question. I'm leaning towards backpatching both of them.
The 2nd is new as of v13, so it does not seem unreasonable that
someone has just not yet stumbled on it with some extension that adds
extra child ECs. I can imagine a use case for that, by getting rid of
needless equality quals that duplicate the partition constraint.  The
fix for the first just seems neater/faster/correct, so I can't really
see any reason not to backpatch it.

David



Re: Buglets in equivclass.c

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> On Mon, 5 Oct 2020 at 06:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm unsure whether to back-patch either of these.  They both seem to be
>> just latent bugs so far as the core code is concerned, but the first one
>> in particular seems like something that could bite extension code.
>> Thoughts?

> That's a good question. I'm leaning towards backpatching both of them.
> The 2nd is new as of v13, so it does not seem unreasonable that
> someone has just not yet stumbled on it with some extension that adds
> extra child ECs. I can imagine a use case for that, by getting rid of
> needless equality quals that duplicate the partition constraint.  The
> fix for the first just seems neater/faster/correct, so I can't really
> see any reason not to backpatch it.

Yeah, that's my conclusion too after sleeping on it.  Pushed,
thanks for reviewing.

            regards, tom lane