Re: Teach predtest about IS [NOT] proofs

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Teach predtest about IS [NOT] proofs
Дата
Msg-id 3797929.1705946261@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Teach predtest about IS [NOT] proofs  (James Coleman <jtc331@gmail.com>)
Ответы Re: Teach predtest about IS [NOT] proofs  (James Coleman <jtc331@gmail.com>)
Список pgsql-hackers
James Coleman <jtc331@gmail.com> writes:
> 0001 does the initial pure refactor. 0003 makes a lot of modifications
> to what we can prove about implication and refutation. Finally, 0003
> isn't intended to be committed, but attempts to validate more
> holistically that none of the changes creates any invalid proofs
> beyond the mostly happy-path tests added in 0004.

> I ended up not tackling changing how test_predtest tests run for now.
> That's plausibly still useful, and I'd be happy to add that if you
> generally agree with the direction of the patch and with that
> abstraction being useful.

> I added some additional verifications to the test_predtest module to
> prevent additional obvious flaws.

I looked through 0001 and made some additional cosmetic changes,
mostly to get comments closer to the associated code; I also
ran pgindent on it (see v5-0001 attached).  That seems pretty
committable to me at this point.  I also like your 0002 additions to
test_predtest.c (although why the mixture of ERROR and WARNING?
ISTM they should all be WARNING, so we can press on with the test).

One other thought is that maybe separating out
predicate_implied_not_null_by_clause should be part of 0001?

I'm less excited about the rest of v4-0002.

@@ -740,6 +747,16 @@ predicate_refuted_by_recurse(Node *clause, Node *predicate,
                                              !weak))
                 return true;

+            /*
+             * Because weak refutation expands the allowed outcomes for B
+             * from "false" to "false or null", we can additionally prove
+             * weak refutation in the case that strong refutation is proven.
+             */
+            if (weak && not_arg &&
+                predicate_implied_by_recurse(predicate, not_arg,
+                                             true))
+                return true;
+
             switch (pclass)
             {
                 case CLASS_AND:

I don't buy this bit at all.  If the prior recursive call fails to
prove weak refutation in a case where strong refutation holds, how is
that not a bug lower down?  Moreover, in order to mask such a bug,
you're doubling the time taken by failed proofs, which is an
unfortunate thing --- we don't like spending a lot of time on
something that fails to improve the plan.

@@ -1138,32 +1155,114 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
                     Assert(list_length(op->args) == 2);
                     rightop = lsecond(op->args);

-                    /*
-                     * We might never see a null Const here, but better check
-                     * anyway
-                     */
-                    if (rightop && IsA(rightop, Const) &&
-                        !((Const *) rightop)->constisnull)
+                    if (rightop && IsA(rightop, Const))
                     {
+                        Const    *constexpr = (Const *) rightop;
                         Node       *leftop = linitial(op->args);

-                        if (DatumGetBool(((Const *) rightop)->constvalue))
-                        {
-                            /* X = true implies X */
-                            if (equal(predicate, leftop))
-                                return true;
-                        }
+                        if (constexpr->constisnull)
+                            return false;
+
+                        if (DatumGetBool(constexpr->constvalue))
+                            return equal(predicate, leftop);
                         else
-                        {
-                            /* X = false implies NOT X */
-                            if (is_notclause(predicate) &&
-                                equal(get_notclausearg(predicate), leftop))
-                                return true;
-                        }
+                            return is_notclause(predicate) &&
+                                equal(get_notclausearg(predicate), leftop);
                     }
                 }
             }
             break;

I don't understand what this bit is doing ... and the fact that
the patch removes all the existing comments and adds none isn't
helping that.  What it seems to mostly be doing is adding early
"return false"s, which I'm not sure is a good thing, because
it seems possible that operator_predicate_proof could apply here.

+                    case IS_UNKNOWN:
+                        /*
+                         * When the clause is in the form "foo IS UNKNOWN" then
+                         * we can prove weak implication of a predicate that
+                         * is strict for "foo" and negated. This doesn't work
+                         * for strong implication since if "foo" is "null" then
+                         * the predicate will evaluate to "null" rather than
+                         * "true".
+                         */

The phrasing of this comment seems randomly inconsistent with others
making similar arguments.

+                    case IS_TRUE:
                         /*
-                         * If the predicate is of the form "foo IS NOT NULL",
-                         * and we are considering strong implication, we can
-                         * conclude that the predicate is implied if the
-                         * clause is strict for "foo", i.e., it must yield
-                         * false or NULL when "foo" is NULL.  In that case
-                         * truth of the clause ensures that "foo" isn't NULL.
-                         * (Again, this is a safe conclusion because "foo"
-                         * must be immutable.)  This doesn't work for weak
-                         * implication, though.  Also, "row IS NOT NULL" does
-                         * not act in the simple way we have in mind.
+                         * X implies X is true
+                         *
+                         * We can only prove strong implication here since
+                         * `null is true` is false rather than null.
                          */

This hardly seems like an improvement on the comment.  (Also, here and
elsewhere, could we avoid using two different types of quotes?)

+                                /* X is unknown weakly implies X is not true */
+                                if (weak && clausebtest->booltesttype == IS_UNKNOWN &&
+                                    equal(clausebtest->arg, predbtest->arg))
+                                    return true;

Maybe I'm confused, but why is it only weak?

+                            /*
+                             * When we know what the predicate is in the form
+                             * "foo IS UNKNOWN" then we can prove strong and
+                             * weak refutation together. This is because the
+                             * limits imposed by weak refutation (allowing
+                             * "false" instead of just "null") is equivalently
+                             * helpful since "foo" being "false" also refutes
+                             * the predicate. Hence we pass weak=false here
+                             * always.
+                             */

This comment doesn't make sense to me either.

+    /* TODO: refactor this into switch statements also? */

Let's drop the TODO comments.

+    /*
+     * We can recurse into "not foo" without any additional processing because
+     * "not (null)" evaluates to null. That doesn't work for allow_false,
+     * however, since "not (false)" is true rather than null.
+     */
+    if (is_notclause(clause) &&
+        clause_is_strict_for((Node *) get_notclausearg(clause), subexpr, false))
+        return true;

Not exactly convinced by this.  The way the comment is written, I'd
expect to not call clause_is_strict_for at all if allow_false.  If
it's okay to call it anyway and pass allow_false = false, you need
to defend that, which this comment isn't doing.

            regards, tom lane

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index c37b416e24..3a352ca37f 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -1087,38 +1087,12 @@ arrayexpr_cleanup_fn(PredIterInfo info)
 }


-/*----------
+/*
  * predicate_implied_by_simple_clause
  *      Does the predicate implication test for a "simple clause" predicate
  *      and a "simple clause" restriction.
  *
  * We return true if able to prove the implication, false if not.
- *
- * We have several strategies for determining whether one simple clause
- * implies another:
- *
- * A simple and general way is to see if they are equal(); this works for any
- * kind of expression, and for either implication definition.  (Actually,
- * there is an implied assumption that the functions in the expression are
- * immutable --- but this was checked for the predicate by the caller.)
- *
- * Another way that always works is that for boolean x, "x = TRUE" is
- * equivalent to "x", likewise "x = FALSE" is equivalent to "NOT x".
- * These can be worth checking because, while we preferentially simplify
- * boolean comparisons down to "x" and "NOT x", the other form has to be
- * dealt with anyway in the context of index conditions.
- *
- * If the predicate is of the form "foo IS NOT NULL", and we are considering
- * strong implication, we can conclude that the predicate is implied if the
- * clause is strict for "foo", i.e., it must yield false or NULL when "foo"
- * is NULL.  In that case truth of the clause ensures that "foo" isn't NULL.
- * (Again, this is a safe conclusion because "foo" must be immutable.)
- * This doesn't work for weak implication, though.
- *
- * Finally, if both clauses are binary operator expressions, we may be able
- * to prove something using the system's knowledge about operators; those
- * proof rules are encapsulated in operator_predicate_proof().
- *----------
  */
 static bool
 predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
@@ -1127,97 +1101,126 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
     /* Allow interrupting long proof attempts */
     CHECK_FOR_INTERRUPTS();

-    /* First try the equal() test */
+    /*
+     * A simple and general rule is that a clause implies itself, hence we
+     * check if they are equal(); this works for any kind of expression, and
+     * for either implication definition.  (Actually, there is an implied
+     * assumption that the functions in the expression are immutable --- but
+     * this was checked for the predicate by the caller.)
+     */
     if (equal((Node *) predicate, clause))
         return true;

-    /* Next see if clause is boolean equality to a constant */
-    if (is_opclause(clause) &&
-        ((OpExpr *) clause)->opno == BooleanEqualOperator)
+    /* Next we have some clause-type-specific strategies */
+    switch (nodeTag(clause))
     {
-        OpExpr       *op = (OpExpr *) clause;
-        Node       *rightop;
-
-        Assert(list_length(op->args) == 2);
-        rightop = lsecond(op->args);
-        /* We might never see a null Const here, but better check anyway */
-        if (rightop && IsA(rightop, Const) &&
-            !((Const *) rightop)->constisnull)
-        {
-            Node       *leftop = linitial(op->args);
-
-            if (DatumGetBool(((Const *) rightop)->constvalue))
+        case T_OpExpr:
             {
-                /* X = true implies X */
-                if (equal(predicate, leftop))
-                    return true;
+                OpExpr       *op = (OpExpr *) clause;
+
+                /*----------
+                 * For boolean x, "x = TRUE" is equivalent to "x", likewise
+                 * "x = FALSE" is equivalent to "NOT x".  These can be worth
+                 * checking because, while we preferentially simplify boolean
+                 * comparisons down to "x" and "NOT x", the other form has to
+                 * be dealt with anyway in the context of index conditions.
+                 *
+                 * We could likewise check whether the predicate is boolean
+                 * equality to a constant; but there are no known use-cases
+                 * for that at the moment, assuming that the predicate has
+                 * been through constant-folding.
+                 *----------
+                 */
+                if (op->opno == BooleanEqualOperator)
+                {
+                    Node       *rightop;
+
+                    Assert(list_length(op->args) == 2);
+                    rightop = lsecond(op->args);
+
+                    /* We might never see null Consts here, but better check */
+                    if (rightop && IsA(rightop, Const) &&
+                        !((Const *) rightop)->constisnull)
+                    {
+                        Node       *leftop = linitial(op->args);
+
+                        if (DatumGetBool(((Const *) rightop)->constvalue))
+                        {
+                            /* X = true implies X */
+                            if (equal(predicate, leftop))
+                                return true;
+                        }
+                        else
+                        {
+                            /* X = false implies NOT X */
+                            if (is_notclause(predicate) &&
+                                equal(get_notclausearg(predicate), leftop))
+                                return true;
+                        }
+                    }
+                }
             }
-            else
+            break;
+        default:
+            break;
+    }
+
+    /* ... and some predicate-type-specific ones */
+    switch (nodeTag(predicate))
+    {
+        case T_NullTest:
             {
-                /* X = false implies NOT X */
-                if (is_notclause(predicate) &&
-                    equal(get_notclausearg(predicate), leftop))
-                    return true;
+                NullTest   *predntest = (NullTest *) predicate;
+
+                switch (predntest->nulltesttype)
+                {
+                    case IS_NOT_NULL:
+
+                        /*
+                         * If the predicate is of the form "foo IS NOT NULL",
+                         * and we are considering strong implication, we can
+                         * conclude that the predicate is implied if the
+                         * clause is strict for "foo", i.e., it must yield
+                         * false or NULL when "foo" is NULL.  In that case
+                         * truth of the clause ensures that "foo" isn't NULL.
+                         * (Again, this is a safe conclusion because "foo"
+                         * must be immutable.)  This doesn't work for weak
+                         * implication, though.  Also, "row IS NOT NULL" does
+                         * not act in the simple way we have in mind.
+                         */
+                        if (!weak &&
+                            !predntest->argisrow &&
+                            clause_is_strict_for(clause,
+                                                 (Node *) predntest->arg,
+                                                 true))
+                            return true;
+                        break;
+                    case IS_NULL:
+                        break;
+                }
             }
-        }
+            break;
+        default:
+            break;
     }

     /*
-     * We could likewise check whether the predicate is boolean equality to a
-     * constant; but there are no known use-cases for that at the moment,
-     * assuming that the predicate has been through constant-folding.
+     * Finally, if both clauses are binary operator expressions, we may be
+     * able to prove something using the system's knowledge about operators;
+     * those proof rules are encapsulated in operator_predicate_proof().
      */
-
-    /* Next try the IS NOT NULL case */
-    if (!weak &&
-        predicate && IsA(predicate, NullTest))
-    {
-        NullTest   *ntest = (NullTest *) predicate;
-
-        /* row IS NOT NULL does not act in the simple way we have in mind */
-        if (ntest->nulltesttype == IS_NOT_NULL &&
-            !ntest->argisrow)
-        {
-            /* strictness of clause for foo implies foo IS NOT NULL */
-            if (clause_is_strict_for(clause, (Node *) ntest->arg, true))
-                return true;
-        }
-        return false;            /* we can't succeed below... */
-    }
-
-    /* Else try operator-related knowledge */
     return operator_predicate_proof(predicate, clause, false, weak);
 }

-/*----------
+/*
  * predicate_refuted_by_simple_clause
  *      Does the predicate refutation test for a "simple clause" predicate
  *      and a "simple clause" restriction.
  *
  * We return true if able to prove the refutation, false if not.
  *
- * Unlike the implication case, checking for equal() clauses isn't helpful.
- * But relation_excluded_by_constraints() checks for self-contradictions in a
- * list of clauses, so that we may get here with predicate and clause being
- * actually pointer-equal, and that is worth eliminating quickly.
- *
- * When the predicate is of the form "foo IS NULL", we can conclude that
- * the predicate is refuted if the clause is strict for "foo" (see notes for
- * implication case), or is "foo IS NOT NULL".  That works for either strong
- * or weak refutation.
- *
- * A clause "foo IS NULL" refutes a predicate "foo IS NOT NULL" in all cases.
- * If we are considering weak refutation, it also refutes a predicate that
- * is strict for "foo", since then the predicate must yield false or NULL
- * (and since "foo" appears in the predicate, it's known immutable).
- *
- * (The main motivation for covering these IS [NOT] NULL cases is to support
- * using IS NULL/IS NOT NULL as partition-defining constraints.)
- *
- * Finally, if both clauses are binary operator expressions, we may be able
- * to prove something using the system's knowledge about operators; those
- * proof rules are encapsulated in operator_predicate_proof().
- *----------
+ * The main motivation for covering IS [NOT] NULL cases to support using
+ * IS NULL/IS NOT NULL as partition-defining constraints.
  */
 static bool
 predicate_refuted_by_simple_clause(Expr *predicate, Node *clause,
@@ -1226,61 +1229,150 @@ predicate_refuted_by_simple_clause(Expr *predicate, Node *clause,
     /* Allow interrupting long proof attempts */
     CHECK_FOR_INTERRUPTS();

-    /* A simple clause can't refute itself */
-    /* Worth checking because of relation_excluded_by_constraints() */
+    /*
+     * A simple clause can't refute itself, so unlike the implication case,
+     * checking for equal() clauses isn't helpful.
+     *
+     * But relation_excluded_by_constraints() checks for self-contradictions
+     * in a list of clauses, so that we may get here with predicate and clause
+     * being actually pointer-equal, and that is worth eliminating quickly.
+     */
     if ((Node *) predicate == clause)
         return false;

-    /* Try the predicate-IS-NULL case */
-    if (predicate && IsA(predicate, NullTest) &&
-        ((NullTest *) predicate)->nulltesttype == IS_NULL)
+    /* Next we have some clause-type-specific strategies */
+    switch (nodeTag(clause))
     {
-        Expr       *isnullarg = ((NullTest *) predicate)->arg;
-
-        /* row IS NULL does not act in the simple way we have in mind */
-        if (((NullTest *) predicate)->argisrow)
-            return false;
+        case T_NullTest:
+            {
+                NullTest   *clausentest = (NullTest *) clause;

-        /* strictness of clause for foo refutes foo IS NULL */
-        if (clause_is_strict_for(clause, (Node *) isnullarg, true))
-            return true;
+                /* row IS NULL does not act in the simple way we have in mind */
+                if (clausentest->argisrow)
+                    return false;

-        /* foo IS NOT NULL refutes foo IS NULL */
-        if (clause && IsA(clause, NullTest) &&
-            ((NullTest *) clause)->nulltesttype == IS_NOT_NULL &&
-            !((NullTest *) clause)->argisrow &&
-            equal(((NullTest *) clause)->arg, isnullarg))
-            return true;
+                switch (clausentest->nulltesttype)
+                {
+                    case IS_NULL:
+                        {
+                            switch (nodeTag(predicate))
+                            {
+                                case T_NullTest:
+                                    {
+                                        NullTest   *predntest = (NullTest *) predicate;
+
+                                        /*
+                                         * row IS NULL does not act in the
+                                         * simple way we have in mind
+                                         */
+                                        if (predntest->argisrow)
+                                            return false;
+
+                                        /*
+                                         * foo IS NULL refutes foo IS NOT
+                                         * NULL, at least in the non-row case,
+                                         * for both strong and weak refutation
+                                         */
+                                        if (predntest->nulltesttype == IS_NOT_NULL &&
+                                            equal(predntest->arg, clausentest->arg))
+                                            return true;
+                                    }
+                                default:
+                                    break;
+                            }

-        return false;            /* we can't succeed below... */
+                            /*
+                             * foo IS NULL weakly refutes any predicate that
+                             * is strict for foo, since then the predicate
+                             * must yield false or NULL (and since foo appears
+                             * in the predicate, it's known immutable).
+                             */
+                            if (weak &&
+                                clause_is_strict_for((Node *) predicate,
+                                                     (Node *) clausentest->arg,
+                                                     true))
+                                return true;
+
+                            return false;    /* we can't succeed below... */
+                        }
+                        break;
+                    case IS_NOT_NULL:
+                        break;
+                }
+            }
+        default:
+            break;
     }

-    /* Try the clause-IS-NULL case */
-    if (clause && IsA(clause, NullTest) &&
-        ((NullTest *) clause)->nulltesttype == IS_NULL)
+    /* ... and some predicate-type-specific ones */
+    switch (nodeTag(predicate))
     {
-        Expr       *isnullarg = ((NullTest *) clause)->arg;
+        case T_NullTest:
+            {
+                NullTest   *predntest = (NullTest *) predicate;

-        /* row IS NULL does not act in the simple way we have in mind */
-        if (((NullTest *) clause)->argisrow)
-            return false;
+                /* row IS NULL does not act in the simple way we have in mind */
+                if (predntest->argisrow)
+                    return false;

-        /* foo IS NULL refutes foo IS NOT NULL */
-        if (predicate && IsA(predicate, NullTest) &&
-            ((NullTest *) predicate)->nulltesttype == IS_NOT_NULL &&
-            !((NullTest *) predicate)->argisrow &&
-            equal(((NullTest *) predicate)->arg, isnullarg))
-            return true;
+                switch (predntest->nulltesttype)
+                {
+                    case IS_NULL:
+                        {
+                            switch (nodeTag(clause))
+                            {
+                                case T_NullTest:
+                                    {
+                                        NullTest   *clausentest = (NullTest *) clause;
+
+                                        /*
+                                         * row IS NULL does not act in the
+                                         * simple way we have in mind
+                                         */
+                                        if (clausentest->argisrow)
+                                            return false;
+
+                                        /*
+                                         * foo IS NOT NULL refutes foo IS NULL
+                                         * for both strong and weak refutation
+                                         */
+                                        if (clausentest->nulltesttype == IS_NOT_NULL &&
+                                            equal(clausentest->arg, predntest->arg))
+                                            return true;
+                                    }
+                                    break;
+                                default:
+                                    break;
+                            }

-        /* foo IS NULL weakly refutes any predicate that is strict for foo */
-        if (weak &&
-            clause_is_strict_for((Node *) predicate, (Node *) isnullarg, true))
-            return true;
+                            /*
+                             * When the predicate is of the form "foo IS
+                             * NULL", we can conclude that the predicate is
+                             * refuted if the clause is strict for "foo" (see
+                             * notes for implication case).  That works for
+                             * either strong or weak refutation.
+                             */
+                            if (clause_is_strict_for(clause,
+                                                     (Node *) predntest->arg,
+                                                     true))
+                                return true;
+                        }
+                        break;
+                    case IS_NOT_NULL:
+                        break;
+                }

-        return false;            /* we can't succeed below... */
+                return false;    /* we can't succeed below... */
+            }
+            break;
+        default:
+            break;
     }

-    /* Else try operator-related knowledge */
+    /*
+     * Finally, if both clauses are binary operator expressions, we may be
+     * able to prove something using the system's knowledge about operators.
+     */
     return operator_predicate_proof(predicate, clause, true, weak);
 }


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: Network failure may prevent promotion
Следующее
От: Tom Lane
Дата:
Сообщение: Re: psql: Allow editing query results with \gedit