Обсуждение: constraint exclusion and nulls in IN (..) clause
Hi. When addressing a review comment on the fast partition pruning thread [1], I noticed that specifying null in the IN-list will cause constraint exclusion to wrongly fail to refute a table's check predicate. create table foo (a int check (a = 1)); insert into foo values (null), (1); -- ExecEvalScalarArrayOp() won't return true for any record in foo select * from foo where a in (null, 2); a --- (0 rows) -- The null in the IN-list prevents constraint exclusion to exclude foo -- from being scanned in the first place explain (costs off) select * from foo where a in (null, 2); QUERY PLAN --------------------------------------------- Seq Scan on foo Filter: (a = ANY ('{NULL,2}'::integer[])) (2 rows) I propose a patch that teaches predtest.c to disregard any null values in a SAOP (i.e., the IN (..) expression) when performing constraint exclusion using that SAOP, because they cause predicate_refuted_by_recurse()'s logic to fail to conclude the refutation. There is a rule that all items of an OR clause (SAOP is treated as one) must refute the predicate. But the OpExpr constructed with null as its constant argument won't refute anything and thus will cause the whole OR clause to fail to refute the predicate. -- With the patch explain (costs off) select * from foo where a in (null, 2); QUERY PLAN -------------------------- Result One-Time Filter: false (2 rows) explain (costs off) select * from foo where a in (null, 2, 1); QUERY PLAN ----------------------------------------------- Seq Scan on foo Filter: (a = ANY ('{NULL,2,1}'::integer[])) (2 rows) Thoughts? Thanks, Amit [1] https://www.postgresql.org/message-id/64241fee-09af-fe4b-a0ab-7cd739965041%40lab.ntt.co.jp
Вложения
On Thu, Feb 1, 2018 at 12:26 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Hi. > > When addressing a review comment on the fast partition pruning thread [1], > I noticed that specifying null in the IN-list will cause constraint > exclusion to wrongly fail to refute a table's check predicate. > > create table foo (a int check (a = 1)); > insert into foo values (null), (1); > > -- ExecEvalScalarArrayOp() won't return true for any record in foo > select * from foo where a in (null, 2); > a > --- > (0 rows) AFAIU, that's true only when = operator is strict. For a non-strict operator which treats two NULL values as equivalent it would return row with NULL value. > > -- The null in the IN-list prevents constraint exclusion to exclude foo > -- from being scanned in the first place > explain (costs off) select * from foo where a in (null, 2); > QUERY PLAN > --------------------------------------------- > Seq Scan on foo > Filter: (a = ANY ('{NULL,2}'::integer[])) > (2 rows) > > I propose a patch that teaches predtest.c to disregard any null values in > a SAOP (i.e., the IN (..) expression) when performing constraint exclusion > using that SAOP, because they cause predicate_refuted_by_recurse()'s logic > to fail to conclude the refutation. There is a rule that all items of an > OR clause (SAOP is treated as one) must refute the predicate. But the > OpExpr constructed with null as its constant argument won't refute > anything and thus will cause the whole OR clause to fail to refute the > predicate. A cursory look through constraint exclusion code esp. operator_predicate_proof() doesn't show any special handling for strict/non-strict operators. Probably that's why that function refuses to refute/imply anything when it encounters NULLs. 1593 * We have two identical subexpressions, and two other subexpressions that 1594 * are not identical but are both Consts; and we have commuted the 1595 * operators if necessary so that the Consts are on the right. We'll need 1596 * to compare the Consts' values. If either is NULL, fail. 1597 */ 1598 if (pred_const->constisnull) 1599 return false; 1600 if (clause_const->constisnull) 1601 return false; > > -- With the patch > explain (costs off) select * from foo where a in (null, 2); > QUERY PLAN > -------------------------- > Result > One-Time Filter: false > (2 rows) > > explain (costs off) select * from foo where a in (null, 2, 1); > QUERY PLAN > ----------------------------------------------- > Seq Scan on foo > Filter: (a = ANY ('{NULL,2,1}'::integer[])) > (2 rows) > > Thoughts? AFAIU, this doesn't look to be the right way to fix the problem; it assumes that the operators are strict. Sorry, if I have misunderstood the patch and your thoughts behind it. May be constraint exclusion code should be taught to treat strict/non-strict operators separately. I am not sure. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Thanks for the comments. On 2018/02/01 16:40, Ashutosh Bapat wrote: > On Thu, Feb 1, 2018 at 12:26 PM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Hi. >> >> When addressing a review comment on the fast partition pruning thread [1], >> I noticed that specifying null in the IN-list will cause constraint >> exclusion to wrongly fail to refute a table's check predicate. >> >> create table foo (a int check (a = 1)); >> insert into foo values (null), (1); >> >> -- ExecEvalScalarArrayOp() won't return true for any record in foo >> select * from foo where a in (null, 2); >> a >> --- >> (0 rows) > > AFAIU, that's true only when = operator is strict. For a non-strict > operator which treats two NULL values as equivalent it would return > row with NULL value. That's true. >> -- The null in the IN-list prevents constraint exclusion to exclude foo >> -- from being scanned in the first place >> explain (costs off) select * from foo where a in (null, 2); >> QUERY PLAN >> --------------------------------------------- >> Seq Scan on foo >> Filter: (a = ANY ('{NULL,2}'::integer[])) >> (2 rows) >> >> I propose a patch that teaches predtest.c to disregard any null values in >> a SAOP (i.e., the IN (..) expression) when performing constraint exclusion >> using that SAOP, because they cause predicate_refuted_by_recurse()'s logic >> to fail to conclude the refutation. There is a rule that all items of an >> OR clause (SAOP is treated as one) must refute the predicate. But the >> OpExpr constructed with null as its constant argument won't refute >> anything and thus will cause the whole OR clause to fail to refute the >> predicate. > > A cursory look through constraint exclusion code esp. > operator_predicate_proof() doesn't show any special handling for > strict/non-strict operators. Probably that's why that function refuses > to refute/imply anything when it encounters NULLs. > 1593 * We have two identical subexpressions, and two other > subexpressions that > 1594 * are not identical but are both Consts; and we have commuted the > 1595 * operators if necessary so that the Consts are on the > right. We'll need > 1596 * to compare the Consts' values. If either is NULL, fail. > 1597 */ > 1598 if (pred_const->constisnull) > 1599 return false; > 1600 if (clause_const->constisnull) > 1601 return false; There does actually exist strictness check in predicate_refuted_by_simple_clause(), but comes into play only if the predicate is a IS NULL clause. In our case, both predicate and clause expressions would be operator clauses for which the strictness doesn't seem to be considered. >> -- With the patch >> explain (costs off) select * from foo where a in (null, 2); >> QUERY PLAN >> -------------------------- >> Result >> One-Time Filter: false >> (2 rows) >> >> explain (costs off) select * from foo where a in (null, 2, 1); >> QUERY PLAN >> ----------------------------------------------- >> Seq Scan on foo >> Filter: (a = ANY ('{NULL,2,1}'::integer[])) >> (2 rows) >> >> Thoughts? > > AFAIU, this doesn't look to be the right way to fix the problem; it > assumes that the operators are strict. Sorry, if I have misunderstood > the patch and your thoughts behind it. May be constraint exclusion > code should be taught to treat strict/non-strict operators separately. > I am not sure. Yeah, the patch in its current form is wrong, because it will give wrong answers if the operator being used in a SAOP is non-strict. I modified the patch to consider operator strictness before doing anything with nulls. Thanks, Amit
Вложения
On Thu, Feb 1, 2018 at 2:23 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > > Yeah, the patch in its current form is wrong, because it will give wrong > answers if the operator being used in a SAOP is non-strict. I modified > the patch to consider operator strictness before doing anything with nulls. That's fine, but I am not sure whether this fits constraint exclusion's charter. Please add this patch to the next commitfest so that we can have a better opinion. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On 2018/02/05 13:20, Ashutosh Bapat wrote: > On Thu, Feb 1, 2018 at 2:23 PM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> >> Yeah, the patch in its current form is wrong, because it will give wrong >> answers if the operator being used in a SAOP is non-strict. I modified >> the patch to consider operator strictness before doing anything with nulls. > > That's fine, but I am not sure whether this fits constraint > exclusion's charter. Please add this patch to the next commitfest so > that we can have a better opinion. OK, done. Thanks, Amit
On Sun, Feb 4, 2018 at 11:20 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > On Thu, Feb 1, 2018 at 2:23 PM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Yeah, the patch in its current form is wrong, because it will give wrong >> answers if the operator being used in a SAOP is non-strict. I modified >> the patch to consider operator strictness before doing anything with nulls. > > That's fine, but I am not sure whether this fits constraint > exclusion's charter. Please add this patch to the next commitfest so > that we can have a better opinion. It seems like a perfectly reasonable extension of the existing functionality to me. (I have, at present, no opinion on the patch itself.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> Yeah, the patch in its current form is wrong, because it will give wrong > answers if the operator being used in a SAOP is non-strict. I modified > the patch to consider operator strictness before doing anything with nulls. I tried to review this patch without any familiarity to the code. arrayconst_next_fn(): > + /* skip nulls if ok to do so */ > + if (state->opisstrict) > + { > + while (state->elem_nulls[state->next_elem]) > + state->next_elem++; > + } Shouldn't we check if we consumed all elements (state->next_elem >= state->num_elems) inside the while loop? arrayexpr_next_fn(): > + /* skip nulls if ok to do so */ > + if (state->opisstrict) > + { > + Node *node = (Node *) lfirst(state->next); > + > + while (IsA(node, Const) && ((Const *) node)->constisnull) > + state->next = lnext(state->next); > + } I cannot find a way to test this change. Can you imagine a query to exercise it on the regression tests?
Hi. On 2018/03/04 22:12, Emre Hasegeli wrote: >> Yeah, the patch in its current form is wrong, because it will give wrong >> answers if the operator being used in a SAOP is non-strict. I modified >> the patch to consider operator strictness before doing anything with nulls. > > I tried to review this patch without any familiarity to the code. Thanks for the review. > arrayconst_next_fn(): > >> + /* skip nulls if ok to do so */ >> + if (state->opisstrict) >> + { >> + while (state->elem_nulls[state->next_elem]) >> + state->next_elem++; >> + } > > Shouldn't we check if we consumed all elements (state->next_elem >= > state->num_elems) inside the while loop? You're right. Fixed. > arrayexpr_next_fn(): > >> + /* skip nulls if ok to do so */ >> + if (state->opisstrict) >> + { >> + Node *node = (Node *) lfirst(state->next); >> + >> + while (IsA(node, Const) && ((Const *) node)->constisnull) >> + state->next = lnext(state->next); >> + } > > I cannot find a way to test this change. Can you imagine a query to > exercise it on the regression tests? So far, I hadn't either. I figured one out and added it to inherit.sql. Basically, I needed to write the query such that an IN () expression doesn't get const-simplified to a Const containing array, but rather remains an ArrayExpr. So, arrayexpr_*() functions in predtest.c are now exercised. Attached updated patch. Thanks, Amit
Вложения
>> Shouldn't we check if we consumed all elements (state->next_elem >= >> state->num_elems) inside the while loop? > > You're right. Fixed. I don't think the fix is correct. arrayconst_next_fn() can still execute state->next_elem++ without checking if we consumed all elements. I couldn't manage to crash it though. I am also not sure if it is proper to skip some items inside the "next_fn", but I don't know the code to suggest a better alternative. > + /* skip nulls if ok to do so */ > + if (state->opisstrict && state->next) Do we need && state->next in here? It is checked for (state->next == NULL) 2 lines above. > + { > + Node *node = (Node *) lfirst(state->next); > + > + while (node && IsA(node, Const) && ((Const *) node)->constisnull) Should we spell the first check like node != NULL as the rest of the code does. > + { > + state->next = lnext(state->next); > + if (state->next) > + node = (Node *) lfirst(state->next); > + } > + } I think this function is also not correct as it can call lnext(state->next) without checking. > So far, I hadn't either. I figured one out and added it to inherit.sql. > Basically, I needed to write the query such that an IN () expression > doesn't get const-simplified to a Const containing array, but rather > remains an ArrayExpr. So, arrayexpr_*() functions in predtest.c are now > exercised. Nice. I noticed that this part of the code was not covered at all by the tests before.
Hi. Thanks for reviewing again. On 2018/03/05 23:04, Emre Hasegeli wrote: >>> Shouldn't we check if we consumed all elements (state->next_elem >= >>> state->num_elems) inside the while loop? >> >> You're right. Fixed. > > I don't think the fix is correct. arrayconst_next_fn() can still > execute state->next_elem++ without checking if we consumed all > elements. I couldn't manage to crash it though. I couldn't get it to crash either, but it's wrong anyway. What happens is the calling code would perform constraint exclusion with a garbage clause (that is one containing a garbage value picked from elem_values when next_elem has exceeded num_elems) and that may alter the result of partition pruning. Verified that with the following example. create table p (a int) partition by list (a); create table p1 partition of p for values in (1); create table p2 partition of p for values in (2); -- before fixing explain (costs off) select * from p where a in (2, null); QUERY PLAN --------------------------------------------------- Append -> Seq Scan on p1 Filter: (a = ANY ('{2,NULL}'::integer[])) -> Seq Scan on p2 Filter: (a = ANY ('{2,NULL}'::integer[])) (5 rows) So, it looks as if the proposed patch has no effect at all. -- after fixing explain (costs off) select * from p where a in (2, null); QUERY PLAN ---------------------------------------------------------- Append -> Seq Scan on p2 Filter: (a = ANY ('{2,NULL}'::integer[])) Was a bit surprised that the calling code didn't crash while handling a garbage clause. > I am also not sure if it is proper to skip some items inside the > "next_fn", but I don't know the code to suggest a better alternative. Patch teaches it to ignore nulls when it's known that the operator being used is strict. It is harmless and has the benefit that constraint exclusion gives an answer that is consistent with what actually running such a qual against a table's rows would do. Consider this example. create table foo (a int check (a = 1)); insert into foo values (1), (null); select * from foo where a in (2, null); a --- (0 rows) set constraint_exclusion to on; -- you'd think constraint exclusion would avoid the scan explain select * from foo where a in (2, null); QUERY PLAN ----------------------------------------------------- Seq Scan on foo (cost=0.00..41.88 rows=13 width=4) Filter: (a = ANY ('{2,NULL}'::integer[])) (2 rows) But it didn't, because the null in that list caused constraint exclusion to mistakenly decide that the clause as a whole doesn't refute the constraint (check (a = 1)). -- with the patch explain select * from foo where a in (2, null); QUERY PLAN ------------------------------------------ Result (cost=0.00..0.00 rows=0 width=4) One-Time Filter: false (2 rows) After ignoring null, only (a = 2) is left to consider and that does refute (a = 1), so pruning works as desired. BTW, if you compose the qual using an OR directly, it works as desired even with HEAD: explain select * from foo where a = 2 or a = null; QUERY PLAN ------------------------------------------ Result (cost=0.00..0.00 rows=0 width=4) One-Time Filter: false (2 rows) That's because a = null is const-simplified in an earlier planning phase to false, so (a = 2 or false) reduces to just a = 2, which does refute foo's constraint (a = 1). I think the same thing should happen when constraint exclusion internally turns an IN (..) clause into a set of OR clauses. Skipping nulls in these next_fn functions is, in a way, same as const-simplifying them away. >> + /* skip nulls if ok to do so */ >> + if (state->opisstrict && state->next) > > Do we need && state->next in here? It is checked for (state->next == > NULL) 2 lines above. Yeah, it wasn't needed. >> + { >> + Node *node = (Node *) lfirst(state->next); >> + >> + while (node && IsA(node, Const) && ((Const *) node)->constisnull) > > Should we spell the first check like node != NULL as the rest of the code does. OK. >> + { >> + state->next = lnext(state->next); >> + if (state->next) >> + node = (Node *) lfirst(state->next); >> + } >> + } > > I think this function is also not correct as it can call > lnext(state->next) without checking. Yeah. Rearranged the code to fix that. Attached updated patch. Thanks again. Regards, Amit
Вложения
> Patch teaches it to ignore nulls when it's known that the operator being > used is strict. It is harmless and has the benefit that constraint > exclusion gives an answer that is consistent with what actually running > such a qual against a table's rows would do. Yes, I understood that. I just meant that I don't know if it is the best way to skip NULLs inside "next_fn". Maybe the caller of the "next_fn"s should skip them. Anyway, the committer can judge this better. > Yeah. Rearranged the code to fix that. This version looks correct to me. > + state->next = (state->next != NULL) ? lnext(state->next) : NULL; > + node = (state->next != NULL) ? lfirst(state->next) : NULL; I think it is unnecessary to check for (state->next != NULL) two times. We can put those in a single if.
On 2018/03/06 18:46, Emre Hasegeli wrote: >> Patch teaches it to ignore nulls when it's known that the operator being >> used is strict. It is harmless and has the benefit that constraint >> exclusion gives an answer that is consistent with what actually running >> such a qual against a table's rows would do. > > Yes, I understood that. I just meant that I don't know if it is the > best way to skip NULLs inside "next_fn". Maybe the caller of the > "next_fn"s should skip them. Anyway, the committer can judge this > better. I think putting that inside those next_fn functions tends to centralize the logic and would run only only for the intended cases. >> Yeah. Rearranged the code to fix that. > > This version looks correct to me. > >> + state->next = (state->next != NULL) ? lnext(state->next) : NULL; >> + node = (state->next != NULL) ? lfirst(state->next) : NULL; > > I think it is unnecessary to check for (state->next != NULL) two > times. We can put those in a single if. Hmm, state->next refers to two different pointer values on line 1 and line 2. It may end up being set to NULL on line 1. Am I missing something? Thanks, Amit
> Hmm, state->next refers to two different pointer values on line 1 and line > 2. It may end up being set to NULL on line 1. Am I missing something? True, lnext(state->next) can set it to NULL. I confused by the below code on the same function doing the steps in reverse order. With this cleared, I have nothing else to say, so I am setting this to ready for committer.
On 2018/03/06 19:16, Emre Hasegeli wrote: >> Hmm, state->next refers to two different pointer values on line 1 and line >> 2. It may end up being set to NULL on line 1. Am I missing something? > > True, lnext(state->next) can set it to NULL. I confused by the below > code on the same function doing the steps in reverse order. With this > cleared, I have nothing else to say, so I am setting this to ready for > committer. Thanks. :) Regards, Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > [ v4-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patch ] This patch seems pretty wrong to me. The proposed proof rule is wrong for !useOr expressions (that is, "scalar op ALL (array)"); you can't just ignore null items in that case. It's also wrong when we need strong refutation. I was about to point to the comment for predicate_refuted_by, * An important fine point is that truth of the clauses must imply that * the predicate returns FALSE, not that it does not return TRUE. This * is normally used to try to refute CHECK constraints, and the only * thing we can assume about a CHECK constraint is that it didn't return * FALSE --- a NULL result isn't a violation per the SQL spec. (Someday and reject the patch as unfixably wrong, when I noticed that in fact somebody had added support for weak refutation to this code. Now I'm just desperately unhappy about the lack of comment-updating work in that commit (b08df9cab, looks like). It's not acceptable to falsify comments and not update them. I have a nasty feeling that there are outright bugs in the logic, too. I'm inclined to think that you've attacked the problem at the wrong place anyway. The notion that one arm of an OR reducing to NULL might not prevent proving what we want is by no means specific to ScalarArrayOp. I think it'd make more sense to see about incorporating that idea in predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause. It might be necessary to change the boolean results in the recursive logic to three-way, not sure. I'm setting this patch back to Waiting On Author pending fixes for the above issues. In the meantime I see a separate work item here to do code review for b08df9cab. regards, tom lane
I wrote: > I think it'd make more sense to see about incorporating that idea in > predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause. After further thought, it seems like the place to deal with this is really operator_predicate_proof(), as in the attached draft patch against HEAD. This passes the smell test for me, in the sense that it's an arguably correct and general extension of the proof rules, but it could use more testing. TBH, the change in the existing regression test case in inherit.sql makes me itch. We've got create table list_parted ( a varchar ) partition by list (a); ... create table part_null_xy partition of list_parted for values in (null, 'xy'); ... explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'); Now, the fact that "null" is included in this query's IN clause is a complete no-op, because the IN is using a strict equality operator. So it's nice that the planner can see that and realize that there's no point in scanning part_null_xy ... but this means that the syntax that's been chosen for list partitioning is seriously misleading. "in (null, 'xy')" in the CREATE TABLE command has nothing to do with the semantics of that identical clause in any other context, and indeed it seems chosen in a way to confuse even (or especially?) seasoned experts. I suppose it's too late to do anything about that now, but it sure seems like NULL should've been handled some other way. regards, tom lane diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c index fb963d0..446207d 100644 *** a/src/backend/optimizer/util/predtest.c --- b/src/backend/optimizer/util/predtest.c *************** static Node *extract_not_arg(Node *claus *** 100,106 **** static Node *extract_strong_not_arg(Node *clause); static bool clause_is_strict_for(Node *clause, Node *subexpr); static bool operator_predicate_proof(Expr *predicate, Node *clause, ! bool refute_it); static bool operator_same_subexprs_proof(Oid pred_op, Oid clause_op, bool refute_it); static bool operator_same_subexprs_lookup(Oid pred_op, Oid clause_op, --- 100,106 ---- static Node *extract_strong_not_arg(Node *clause); static bool clause_is_strict_for(Node *clause, Node *subexpr); static bool operator_predicate_proof(Expr *predicate, Node *clause, ! bool refute_it, bool weak); static bool operator_same_subexprs_proof(Oid pred_op, Oid clause_op, bool refute_it); static bool operator_same_subexprs_lookup(Oid pred_op, Oid clause_op, *************** predicate_implied_by_simple_clause(Expr *** 1137,1143 **** } /* Else try operator-related knowledge */ ! return operator_predicate_proof(predicate, clause, false); } /*---------- --- 1137,1143 ---- } /* Else try operator-related knowledge */ ! return operator_predicate_proof(predicate, clause, false, weak); } /*---------- *************** predicate_refuted_by_simple_clause(Expr *** 1232,1238 **** } /* Else try operator-related knowledge */ ! return operator_predicate_proof(predicate, clause, true); } --- 1232,1238 ---- } /* Else try operator-related knowledge */ ! return operator_predicate_proof(predicate, clause, true, weak); } *************** static const StrategyNumber BT_refute_ta *** 1498,1506 **** * values, since then the operators aren't being given identical inputs. But * we only support that for btree operators, for which we can assume that all * non-null inputs result in non-null outputs, so that it doesn't matter which ! * two non-null constants we consider. Currently the code below just reports ! * "proof failed" if either constant is NULL, but in some cases we could be ! * smarter (and that likely would require checking strong vs. weak proofs). * * We can make proofs involving several expression forms (here "foo" and "bar" * represent subexpressions that are identical according to equal()): --- 1498,1505 ---- * values, since then the operators aren't being given identical inputs. But * we only support that for btree operators, for which we can assume that all * non-null inputs result in non-null outputs, so that it doesn't matter which ! * two non-null constants we consider. If either constant is NULL, we have ! * to think harder, but sometimes the proof still works, as explained below. * * We can make proofs involving several expression forms (here "foo" and "bar" * represent subexpressions that are identical according to equal()): *************** static const StrategyNumber BT_refute_ta *** 1528,1534 **** * and we dare not make deductions with those. */ static bool ! operator_predicate_proof(Expr *predicate, Node *clause, bool refute_it) { OpExpr *pred_opexpr, *clause_opexpr; --- 1527,1534 ---- * and we dare not make deductions with those. */ static bool ! operator_predicate_proof(Expr *predicate, Node *clause, ! bool refute_it, bool weak) { OpExpr *pred_opexpr, *clause_opexpr; *************** operator_predicate_proof(Expr *predicate *** 1675,1691 **** * We have two identical subexpressions, and two other subexpressions that * are not identical but are both Consts; and we have commuted the * operators if necessary so that the Consts are on the right. We'll need ! * to compare the Consts' values. If either is NULL, fail. ! * ! * Future work: in some cases the desired proof might hold even with NULL ! * constants. But beware that we've not yet identified the operators as ! * btree ops, so for instance it'd be quite unsafe to assume they are ! * strict without checking. */ - if (pred_const->constisnull) - return false; if (clause_const->constisnull) return false; /* * Lookup the constant-comparison operator using the system catalogs and --- 1675,1720 ---- * We have two identical subexpressions, and two other subexpressions that * are not identical but are both Consts; and we have commuted the * operators if necessary so that the Consts are on the right. We'll need ! * to compare the Consts' values. If either is NULL, we can't do that, so ! * usually the proof fails ... but in some cases we can claim success. */ if (clause_const->constisnull) + { + /* If clause_op isn't strict, we can't prove anything */ + if (!op_strict(clause_op)) + return false; + + /* + * At this point we know that the clause returns NULL. For proof + * types that assume truth of the clause, this means the proof is + * vacuously true (a/k/a "false implies anything"). That's all proof + * types except weak implication. + */ + if (!(weak && !refute_it)) + return true; + + /* + * For weak implication, it's still possible for the proof to succeed, + * if the predicate can also be proven NULL. In that case we've got + * NULL => NULL which is valid for this proof type. + */ + if (pred_const->constisnull && op_strict(pred_op)) + return true; + /* Else the proof fails */ + return false; + } + if (pred_const->constisnull) + { + /* + * If the pred_op is strict, we know the predicate yields NULL, which + * means the proof succeeds for either weak implication or weak + * refutation. + */ + if (weak && op_strict(pred_op)) + return true; + /* Else the proof fails */ return false; + } /* * Lookup the constant-comparison operator using the system catalogs and diff --git a/src/test/modules/test_predtest/expected/test_predtest.out b/src/test/modules/test_predtest/expected/test_predtest.out index 9dd8aec..5574e03 100644 *** a/src/test/modules/test_predtest/expected/test_predtest.out --- b/src/test/modules/test_predtest/expected/test_predtest.out *************** w_i_holds | f *** 781,793 **** s_r_holds | f w_r_holds | f - -- XXX ideally, we could prove this case too, for strong implication select * from test_predtest($$ select x <= 5, x in (1,3,5,null) from integers $$); -[ RECORD 1 ]-----+-- ! strong_implied_by | f weak_implied_by | f strong_refuted_by | f weak_refuted_by | f --- 781,792 ---- s_r_holds | f w_r_holds | f select * from test_predtest($$ select x <= 5, x in (1,3,5,null) from integers $$); -[ RECORD 1 ]-----+-- ! strong_implied_by | t weak_implied_by | f strong_refuted_by | f weak_refuted_by | f diff --git a/src/test/modules/test_predtest/sql/test_predtest.sql b/src/test/modules/test_predtest/sql/test_predtest.sql index 298b8bf..2734735 100644 *** a/src/test/modules/test_predtest/sql/test_predtest.sql --- b/src/test/modules/test_predtest/sql/test_predtest.sql *************** select x <= 5, x in (1,3,5,7) *** 306,312 **** from integers $$); - -- XXX ideally, we could prove this case too, for strong implication select * from test_predtest($$ select x <= 5, x in (1,3,5,null) from integers --- 306,311 ---- diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out index a79f891..d56e5d2 100644 *** a/src/test/regress/expected/inherit.out --- b/src/test/regress/expected/inherit.out *************** explain (costs off) select * from list_p *** 1715,1725 **** Append -> Seq Scan on part_ab_cd Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[]))) ! -> Seq Scan on part_ef_gh ! Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[]))) ! -> Seq Scan on part_null_xy ! Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[]))) ! (7 rows) explain (costs off) select * from list_parted where a = 'ab'; QUERY PLAN --- 1715,1721 ---- Append -> Seq Scan on part_ab_cd Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[]))) ! (3 rows) explain (costs off) select * from list_parted where a = 'ab'; QUERY PLAN
On 2018/03/10 13:40, Tom Lane wrote: > I wrote: >> I think it'd make more sense to see about incorporating that idea in >> predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause. > > After further thought, it seems like the place to deal with this is > really operator_predicate_proof(), as in the attached draft patch > against HEAD. This passes the smell test for me, in the sense that > it's an arguably correct and general extension of the proof rules, > but it could use more testing. Thanks for the patch. I agree it handles the case I presented my patch for in a more principled manner. So, I've marked the CF entry for my patch as Rejected. Looking at the patch, I guess it'd be right to think that the case for which the following block of code will be executed only arises for OpExpr's generated by predtest.c (that is, when iterating an SAOP): if (clause_const->constisnull) + { + /* If clause_op isn't strict, we can't prove anything */ + if (!op_strict(clause_op)) + return false; + <snip> That's because any other expr = null would've been reduced to constant-false at some earlier point. Then, in the same block I see that there is: + /* + * For weak implication, it's still possible for the proof to succeed, + * if the predicate can also be proven NULL. In that case we've got + * NULL => NULL which is valid for this proof type. + */ + if (pred_const->constisnull && op_strict(pred_op)) + return true; which, afaics, will never be able to return true, because pred_const->constisnull will never be true here. That's because the following code that will run before the above code will determine the result of operator_predicate_proof() in that case: if (equal(pred_leftop, clause_leftop)) { if (equal(pred_rightop, clause_rightop)) { /* We have x op1 y and x op2 y */ return operator_same_subexprs_proof(pred_op, clause_op, refute_it); I wonder if the result would be valid in that case, because the proof cache should only be looked up for non-null sub-expressions, no? > TBH, the change in the existing regression test case in inherit.sql > makes me itch. We've got > > create table list_parted ( > a varchar > ) partition by list (a); > ... > create table part_null_xy partition of list_parted for values in (null, 'xy'); > ... > explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'); > > Now, the fact that "null" is included in this query's IN clause is a > complete no-op, because the IN is using a strict equality operator. > So it's nice that the planner can see that and realize that there's > no point in scanning part_null_xy ... but this means that the syntax > that's been chosen for list partitioning is seriously misleading. > "in (null, 'xy')" in the CREATE TABLE command has nothing to do with > the semantics of that identical clause in any other context, and indeed > it seems chosen in a way to confuse even (or especially?) seasoned > experts. Hmm, yeah, it is confusing. Actually, what it really internally becomes is the following expression: (a IS NULL) OR (a = 'xy') or if the CREATE TABLE had "in (null, 'xy', 'yz'), the expression will become: (a IS NULL) OR (a = ANY (ARRAY['xy'::text, 'yz'::text])) which, as you say, is not the same thing as how the original expression is interpreted elsewhere. > I suppose it's too late to do anything about that now, but it sure > seems like NULL should've been handled some other way. Maybe, write something in the documentation about that? Thanks, Amit
On 2018/03/14 17:16, Amit Langote wrote: > On 2018/03/10 13:40, Tom Lane wrote: >> I wrote: >>> I think it'd make more sense to see about incorporating that idea in >>> predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause. >> >> After further thought, it seems like the place to deal with this is >> really operator_predicate_proof(), as in the attached draft patch >> against HEAD. This passes the smell test for me, in the sense that >> it's an arguably correct and general extension of the proof rules, >> but it could use more testing. > > Thanks for the patch. I agree it handles the case I presented my patch > for in a more principled manner. So, I've marked the CF entry for my > patch as Rejected. Oops, sorry I hadn't actually seen the CF entry before hitting send on this email. Seeing that Tom intends to attach his patch with this CF entry, I will leave the entry alone for now. Thanks, Amit
I wrote: > After further thought, it seems like the place to deal with this is > really operator_predicate_proof(), as in the attached draft patch > against HEAD. This passes the smell test for me, in the sense that > it's an arguably correct and general extension of the proof rules, > but it could use more testing. Was anyone planning to do more work or testing on this? Or should I just push it so we can close the CF entry? regards, tom lane
> After further thought, it seems like the place to deal with this is > really operator_predicate_proof(), as in the attached draft patch > against HEAD. This passes the smell test for me, in the sense that > it's an arguably correct and general extension of the proof rules, > but it could use more testing. I am not sure if we are covering the case when clause_const and pred_const are both NULL. In this case, we should be able to return true only by checking op_strict(pred_op) or maybe even without checking that. Am I mistaken?
Emre Hasegeli <emre@hasegeli.com> writes: > I am not sure if we are covering the case when clause_const and > pred_const are both NULL. In this case, we should be able to return > true only by checking op_strict(pred_op) or maybe even without > checking that. Am I mistaken? Yeah, that's there. We need both operators to be strict, I think; otherwise we can't really assume anything about what they'd return for NULL inputs. But if they are, we have NULL => NULL which is valid for all proof cases. regards, tom lane
Yeah, that's there. We need both operators to be strict, I think;
otherwise we can't really assume anything about what they'd return
for NULL inputs. But if they are, we have NULL => NULL which is
valid for all proof cases.
I understand. I don’t see any problems in this case.
On 2018/03/21 23:00, Tom Lane wrote: > Emre Hasegeli <emre@hasegeli.com> writes: >> I am not sure if we are covering the case when clause_const and >> pred_const are both NULL. In this case, we should be able to return >> true only by checking op_strict(pred_op) or maybe even without >> checking that. Am I mistaken? > > Yeah, that's there. We need both operators to be strict, I think; > otherwise we can't really assume anything about what they'd return > for NULL inputs. But if they are, we have NULL => NULL which is > valid for all proof cases. Thank you for closing the CF entry. I too wasn't sure if the patch's modifications to operator_predicate_proof() led to correct handling for the case where both clause_const and pred_const are both NULL consts. ISTM that the result in that case becomes what operator_same_subexprs_proof() would return for the pred_op (possibly commuted) and clause_op pair. But maybe we don't end up in that case much. Regards, Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > I too wasn't sure if the patch's modifications to > operator_predicate_proof() led to correct handling for the case where both > clause_const and pred_const are both NULL consts. ISTM that the result in > that case becomes what operator_same_subexprs_proof() would return for the > pred_op (possibly commuted) and clause_op pair. But maybe we don't end up > in that case much. Probably not. It's hard to get to this type of case at all, because in most cases, if we have a null constant as a subexpression, previous const-simplification would have replaced the whole expression with null. I think in practice it's only interesting when the upper levels of predtest.c have constructed an operator expression out of parts of the given clauses, which is exactly what happens for cases like IN-lists. regards, tom lane