Re: Use opresulttype instead of calling SearchSysCache1() in match_orclause_to_indexcol()

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Use opresulttype instead of calling SearchSysCache1() in match_orclause_to_indexcol()
Дата
Msg-id 368803.1763314038@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Use opresulttype instead of calling SearchSysCache1() in match_orclause_to_indexcol()  (Tender Wang <tndrwang@gmail.com>)
Список pgsql-hackers
Tender Wang <tndrwang@gmail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> 于2025年11月16日周日 04:45写道:
>> Yeah.  In fact, I think it's outright wrong to do that here.
>> It'd result in building a SAOP expression that lacks the RelabelType,
>> which seems incorrect since the operator is one that expects the
>> relabeled type.
>>
>> The RelabelType-stripping logic for the constExpr seems unnecessary as
>> well, if not outright wrong.  It won't matter for an actual Const,
>> because eval_const_expressions would have flattened the relabeled type
>> into the Const node.  However, if we have some non-Const right-hand
>> sides, the effect of stripping RelabelTypes could easily be to fail the
>> transformation unnecessarily.  That'd happen if the parser had coerced
>> all the RHS values to be the same type for application of the operator,
>> but then we stripped some RelabelTypes and mistakenly decided that
>> the resulting RHSes didn't match in type.

> Thank you for pointing that out. I hadn’t been aware of these problems
> earlier.

I made a test script (attached) that demonstrates that these problems
are real.  In HEAD, if you look at the logged plan tree for the first
query, you'll see that we have a SAOP with operator texteq whose first
input is a bare varchar-type Var, unlike what you get with a plain
indexqual such as "vc1 = '23'".  Now texteq() doesn't care, but there
are polymorphic functions that do care because they look at the
exposed types of their input arguments.  Also, HEAD fails to optimize
the second test case into a SAOP because it's fooled itself by
stripping the RelabelType from the outer-side Var.

>> I'm not very convinced that the type_is_rowtype checks are correct
>> either.  I can see that we'd better forbid RECORD, because we've got
>> no way to be sure that all the RHSes are actually the same record
>> type.  But I don't see why it's necessary or appropriate to forbid
>> named composite types.  I didn't change that here; maybe we should
>> look into the discussion leading up to d4378c000.

> Agree.

I dug into the history a little and could not find anything except
[1], which says

    I have made some changes (attachment).
    * if the operator expression left or right side type category is
    {array | domain | composite}, then don't do the transformation.
    (i am not 10% sure with composite)

with no further justification than that.  There were even messages
later in the thread questioning the need for it, but nobody did
anything about it.  The transformation does work perfectly well
if enabled, as shown by the second part of the attached test script.

So I end with v3, now with a full-dress commit message.

            regards, tom lane

[1] https://www.postgresql.org/message-id/CACJufxFrZS07oBHMk1_c8P3A84VZ3ysXiZV8NeU6gAnvu%2BHsVA%40mail.gmail.com

From bacf73d6abd09748e02cb2985229ff799857b7fd Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 16 Nov 2025 12:16:26 -0500
Subject: [PATCH v3] Clean up match_orclause_to_indexcol().

Remove bogus stripping of RelabelTypes: that can result in building
an output SAOP tree with incorrect exposed exprType for the operands,
which might confuse polymorphic operators.  Moreover it demonstrably
prevents folding some OR-trees to SAOPs when the RHS expressions
have different base types that were coerced to the same type by
RelabelTypes.

Reduce prohibition on type_is_rowtype to just disallow type RECORD.
We need that because otherwise we would happily fold multiple RECORD
Consts into a RECORDARRAY Const even if they aren't the same record
type.  (We could allow that perhaps, if we checked that they all have
the same typmod, but the case doesn't seem worth that much effort.)
However, there is no reason at all to disallow the transformation
for named composite types, nor domains over them: as long as we can
find a suitable array type we're good.

Remove some assertions that seem rather out of place (it's not
this code's duty to verify that the RestrictInfo structure is
sane).  Rewrite some comments.

The issues with RelabelType stripping seem severe enough to
back-patch this into v18 where the code was introduced.

Author: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHewXN=aH7GQBk4fXU-WaEeVmQWUmBAeNyBfJ3VKzPphyPKUkQ@mail.gmail.com
Backpatch-through: 18
---
 src/backend/optimizer/path/indxpath.c | 115 ++++++++++----------------
 1 file changed, 45 insertions(+), 70 deletions(-)

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index edc6d2ac1d3..c62e3f87724 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -3290,8 +3290,8 @@ match_rowcompare_to_indexcol(PlannerInfo *root,
  *
  * In this routine, we attempt to transform a list of OR-clause args into a
  * single SAOP expression matching the target index column.  On success,
- * return an IndexClause, containing the transformed expression or NULL,
- * if failed.
+ * return an IndexClause containing the transformed expression.
+ * Return NULL if the transformation fails.
  */
 static IndexClause *
 match_orclause_to_indexcol(PlannerInfo *root,
@@ -3299,24 +3299,21 @@ match_orclause_to_indexcol(PlannerInfo *root,
                            int indexcol,
                            IndexOptInfo *index)
 {
-    ListCell   *lc;
     BoolExpr   *orclause = (BoolExpr *) rinfo->orclause;
-    Node       *indexExpr = NULL;
     List       *consts = NIL;
-    ScalarArrayOpExpr *saopexpr = NULL;
+    Node       *indexExpr = NULL;
     Oid            matchOpno = InvalidOid;
-    IndexClause *iclause;
     Oid            consttype = InvalidOid;
     Oid            arraytype = InvalidOid;
     Oid            inputcollid = InvalidOid;
     bool        firstTime = true;
     bool        haveNonConst = false;
     Index        indexRelid = index->rel->relid;
+    ScalarArrayOpExpr *saopexpr;
+    IndexClause *iclause;
+    ListCell   *lc;

-    Assert(IsA(orclause, BoolExpr));
-    Assert(orclause->boolop == OR_EXPR);
-
-    /* Ignore index if it doesn't support SAOP clauses */
+    /* Forget it if index doesn't support SAOP clauses */
     if (!index->amsearcharray)
         return NULL;

@@ -3329,55 +3326,32 @@ match_orclause_to_indexcol(PlannerInfo *root,
      */
     foreach(lc, orclause->args)
     {
-        RestrictInfo *subRinfo;
+        RestrictInfo *subRinfo = (RestrictInfo *) lfirst(lc);
         OpExpr       *subClause;
         Oid            opno;
         Node       *leftop,
                    *rightop;
         Node       *constExpr;

-        if (!IsA(lfirst(lc), RestrictInfo))
+        /* If it's not a RestrictInfo (i.e. it's a sub-AND), we can't use it */
+        if (!IsA(subRinfo, RestrictInfo))
             break;

-        subRinfo = (RestrictInfo *) lfirst(lc);
-
-        /* Only operator clauses can match  */
+        /* Only operator clauses can match */
         if (!IsA(subRinfo->clause, OpExpr))
             break;

         subClause = (OpExpr *) subRinfo->clause;
         opno = subClause->opno;

-        /* Only binary operators can match  */
+        /* Only binary operators can match */
         if (list_length(subClause->args) != 2)
             break;

-        /*
-         * The parameters below must match between sub-rinfo and its parent as
-         * make_restrictinfo() fills them with the same values, and further
-         * modifications are also the same for the whole subtree.  However,
-         * still make a sanity check.
-         */
-        Assert(subRinfo->is_pushed_down == rinfo->is_pushed_down);
-        Assert(subRinfo->is_clone == rinfo->is_clone);
-        Assert(subRinfo->security_level == rinfo->security_level);
-        Assert(bms_equal(subRinfo->incompatible_relids, rinfo->incompatible_relids));
-        Assert(bms_equal(subRinfo->outer_relids, rinfo->outer_relids));
-
-        /*
-         * Also, check that required_relids in sub-rinfo is subset of parent's
-         * required_relids.
-         */
-        Assert(bms_is_subset(subRinfo->required_relids, rinfo->required_relids));
-
-        /* Only the operator returning a boolean suit the transformation. */
-        if (get_op_rettype(opno) != BOOLOID)
-            break;
-
         /*
          * Check for clauses of the form: (indexkey operator constant) or
-         * (constant operator indexkey).  See match_clause_to_indexcol's notes
-         * about const-ness.
+         * (constant operator indexkey).  These tests should agree with
+         * match_opclause_to_indexcol.
          */
         leftop = (Node *) linitial(subClause->args);
         rightop = (Node *) lsecond(subClause->args);
@@ -3406,22 +3380,6 @@ match_orclause_to_indexcol(PlannerInfo *root,
             break;
         }

-        /*
-         * Ignore any RelabelType node above the operands.  This is needed to
-         * be able to apply indexscanning in binary-compatible-operator cases.
-         * Note: we can assume there is at most one RelabelType node;
-         * eval_const_expressions() will have simplified if more than one.
-         */
-        if (IsA(constExpr, RelabelType))
-            constExpr = (Node *) ((RelabelType *) constExpr)->arg;
-        if (IsA(indexExpr, RelabelType))
-            indexExpr = (Node *) ((RelabelType *) indexExpr)->arg;
-
-        /* Forbid transformation for composite types, records. */
-        if (type_is_rowtype(exprType(constExpr)) ||
-            type_is_rowtype(exprType(indexExpr)))
-            break;
-
         /*
          * Save information about the operator, type, and collation for the
          * first matching qual.  Then, check that subsequent quals match the
@@ -3439,54 +3397,71 @@ match_orclause_to_indexcol(PlannerInfo *root,
              * the expression collation matches the index collation.  Also,
              * there must be an array type to construct an array later.
              */
-            if (!IndexCollMatchesExprColl(index->indexcollations[indexcol], inputcollid) ||
+            if (!IndexCollMatchesExprColl(index->indexcollations[indexcol],
+                                          inputcollid) ||
                 !op_in_opfamily(matchOpno, index->opfamily[indexcol]) ||
                 !OidIsValid(arraytype))
                 break;
+
+            /*
+             * Disallow if either type is RECORD, mainly because we can't be
+             * positive that all the RHS expressions are the same record type.
+             */
+            if (consttype == RECORDOID || exprType(indexExpr) == RECORDOID)
+                break;
+
             firstTime = false;
         }
         else
         {
-            if (opno != matchOpno ||
+            if (matchOpno != opno ||
                 inputcollid != subClause->inputcollid ||
                 consttype != exprType(constExpr))
                 break;
         }

         /*
-         * Check if our list of constants in match_clause_to_indexcol's
-         * understanding of const-ness have something other than Const.
+         * The righthand inputs don't necessarily have to be plain Consts, but
+         * make_SAOP_expr needs to know if any are not.
          */
         if (!IsA(constExpr, Const))
             haveNonConst = true;
+
         consts = lappend(consts, constExpr);
     }

     /*
      * Handle failed conversion from breaking out of the loop because of an
-     * unsupported qual.  Free the consts list and return NULL to indicate the
-     * conversion failed.
+     * unsupported qual.  Also check that we have an indexExpr, just in case
+     * the OR list was somehow empty (it shouldn't be).  Return NULL to
+     * indicate the conversion failed.
      */
-    if (lc != NULL)
+    if (lc != NULL || indexExpr == NULL)
     {
-        list_free(consts);
+        list_free(consts);        /* might as well */
         return NULL;
     }

+    /*
+     * Build the new SAOP node.  We use the indexExpr from the last OR arm;
+     * since all the arms passed match_index_to_operand, it shouldn't matter
+     * which one we use.  But using "inputcollid" twice is a bit of a cheat:
+     * we might end up with an array Const node that is labeled with a
+     * collation despite its elements being of a noncollatable type.  But
+     * nothing is likely to complain about that, so we don't bother being more
+     * accurate.
+     */
     saopexpr = make_SAOP_expr(matchOpno, indexExpr, consttype, inputcollid,
                               inputcollid, consts, haveNonConst);
+    Assert(saopexpr != NULL);

     /*
-     * Finally, build an IndexClause based on the SAOP node.  Use
-     * make_simple_restrictinfo() to get RestrictInfo with clean selectivity
-     * estimations, because they may differ from the estimation made for an OR
-     * clause.  Although it is not a lossy expression, keep the original rinfo
-     * in iclause->rinfo as prescribed.
+     * Finally, build an IndexClause based on the SAOP node.  It's not lossy.
      */
     iclause = makeNode(IndexClause);
     iclause->rinfo = rinfo;
     iclause->indexquals = list_make1(make_simple_restrictinfo(root,
-                                                              &saopexpr->xpr));
+                                                              (Expr *) saopexpr));
     iclause->lossy = false;
     iclause->indexcol = indexcol;
     iclause->indexcols = NIL;
--
2.43.7

drop table if exists t1, t2, tcomp;

create table t1 (vc1 varchar(10) primary key);

insert into t1 select generate_series(1,100000)::text;

create table t2 (vc1 varchar(10) primary key, x varchar(10));

insert into t2 select x::text, (x+1)::text from generate_series(1,100000) x;

vacuum analyze t1, t2;

explain
select * from t1 where vc1 = '23' or vc1 = '24' or vc1 = '42';

set debug_print_parse = 1;
set debug_print_plan = 1;

select * from t1 where vc1 = '23' or vc1 = '24' or vc1 = '42';

explain
select t1.* from t1, t2
where t2.vc1 = '66' and (t1.vc1 = t2.x or t1.vc1 = '99');

create type complex as (r float8, i float8);

create table tcomp (f1 complex primary key);

insert into tcomp select (x,x+1)::complex from generate_series(1,100000) x;

vacuum analyze tcomp;

explain
select * from tcomp where f1 = (1,2)::complex;

select * from tcomp where f1 = (1,2)::complex;

explain
select * from tcomp where f1 = (1,2)::complex or f1 = (3,4)::complex;

select * from tcomp where f1 = (1,2)::complex or f1 = (3,4)::complex;

В списке pgsql-hackers по дате отправления: