Обсуждение: default range partition and constraint exclusion
Hi. While working on the patch for partition pruning for declarative partitioned tables, I noticed that default range partition will fail to be included in a plan in certain cases due to pruning by constraint exclusion. Consider a multi-column range-partitioned table: create table mc2p (a int, b int) partition by range (a, b); create table mc2p_default partition of mc2p default; create table mc2p0 partition of mc2p for values from (minvalue, minvalue) to (1, 1); create table mc2p2 partition of mc2p for values from (1, 1) to (maxvalue, maxvalue); -- add a row with null b and check that it enters the default partition insert into mc2p values (2); INSERT 0 1 select tableoid::regclass, * from mc2p; tableoid | a | b --------------+---+--- mc2p_default | 2 | (1 row) -- but selecting like this doesn't work select tableoid::regclass, * from mc2p where a = 2; tableoid | a | b ----------+---+--- (0 rows) because: explain (costs off) select tableoid::regclass, * from mc2p where a = 2; QUERY PLAN -------------------------------------- Result -> Append -> Seq Scan on mc2p2 Filter: (a = 2) (4 rows) If you look at the default partition's constraint, which is as follows: NOT ( ((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1))) ) you'll notice that it doesn't explicitly say that the default partition allows rows where a is null or b is null or both are null. Given that, constraint exclusion will end up concluding that the default partition's constraint is refuted by a = 2. The attached will make the constraint to look like: NOT ( a IS NOT NULL OR b IS NOT NULL ((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1))) ) Now since b IS NULL (which, btw, is NOT (b IS NOT NULL)) fails to be refuted, as a whole, the whole constraint is not refuted. So, we get the correct result: select tableoid::regclass, * from mc2p where a = 2; tableoid | a | b --------------+---+--- mc2p_default | 2 | (1 row) explain (costs off) select tableoid::regclass, * from mc2p where a = 2; QUERY PLAN -------------------------------------- Result -> Append -> Seq Scan on mc2p2 Filter: (a = 2) -> Seq Scan on mc2p_default Filter: (a = 2) (6 rows) Attached patches. Thoughts? Thanks, Amit
Вложения
On Fri, Nov 17, 2017 at 12:57 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > While working on the patch for partition pruning for declarative > partitioned tables, I noticed that default range partition will fail to be > included in a plan in certain cases due to pruning by constraint exclusion. > you'll notice that it doesn't explicitly say that the default partition > allows rows where a is null or b is null or both are null. Given that, > constraint exclusion will end up concluding that the default partition's > constraint is refuted by a = 2. > > The attached will make the constraint to look like: Uh, if the constraint exclusion logic we're using is drawing false conclusions, we need to fix it so it doesn't, not change the constraint so that the wrong logic gives the right answer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/11/18 8:28, Robert Haas wrote: > On Fri, Nov 17, 2017 at 12:57 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> While working on the patch for partition pruning for declarative >> partitioned tables, I noticed that default range partition will fail to be >> included in a plan in certain cases due to pruning by constraint exclusion. >> you'll notice that it doesn't explicitly say that the default partition >> allows rows where a is null or b is null or both are null. Given that, >> constraint exclusion will end up concluding that the default partition's >> constraint is refuted by a = 2. >> >> The attached will make the constraint to look like: > > Uh, if the constraint exclusion logic we're using is drawing false > conclusions, we need to fix it so it doesn't, not change the > constraint so that the wrong logic gives the right answer. I did actually consider it, but ended up concluding that constraint exclusion is doing all it can using the provided list partition constraint clauses. If all predicate_refuted_by() receives is the expression tree (AND/OR) with individual nodes being strict clauses involving partition keys (and nothing about the nullness of the keys), the downstream code is just playing by the rules as explained in the header comment of predicate_refuted_by_recurse() in concluding that query's restriction clause a = 2 refutes it. You may notice in the example I gave that all rows with non-null partition keys have a non-default range partitions to choose from. So, the only rows that exist in the default partition are those that contain null in either or both partition keys. Constraint that's currently generated for the default partition only describes rows that contain non-null partition keys. The default partition in the example contains no such rows, so constraint exclusion is right in excluding that it is excluded by a clause like a = 2. By the way, I made a mistake when listing the constraint that the proposed patch will produce; it's actually: NOT ( a IS NOT NULL AND b IS NOT NULL AND ( ((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1))) ) ) Am I missing something? Thanks, Amit
On Tue, Nov 21, 2017 at 4:36 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>> The attached will make the constraint to look like: >> >> Uh, if the constraint exclusion logic we're using is drawing false >> conclusions, we need to fix it so it doesn't, not change the >> constraint so that the wrong logic gives the right answer. > > I did actually consider it, but ended up concluding that constraint > exclusion is doing all it can using the provided list partition constraint > clauses. > > If all predicate_refuted_by() receives is the expression tree (AND/OR) > with individual nodes being strict clauses involving partition keys (and > nothing about the nullness of the keys), the downstream code is just > playing by the rules as explained in the header comment of > predicate_refuted_by_recurse() in concluding that query's restriction > clause a = 2 refutes it. Oh, wait a minute. Actually, I think predicate_refuted_by() is doing the right thing here. Isn't the problem that mc2p2 shouldn't be accepting a (2, null) tuple at all? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/11/22 6:31, Robert Haas wrote: > On Tue, Nov 21, 2017 at 4:36 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>>> The attached will make the constraint to look like: >>> >>> Uh, if the constraint exclusion logic we're using is drawing false >>> conclusions, we need to fix it so it doesn't, not change the >>> constraint so that the wrong logic gives the right answer. >> >> I did actually consider it, but ended up concluding that constraint >> exclusion is doing all it can using the provided list partition constraint >> clauses. >> >> If all predicate_refuted_by() receives is the expression tree (AND/OR) >> with individual nodes being strict clauses involving partition keys (and >> nothing about the nullness of the keys), the downstream code is just >> playing by the rules as explained in the header comment of >> predicate_refuted_by_recurse() in concluding that query's restriction >> clause a = 2 refutes it. > > Oh, wait a minute. Actually, I think predicate_refuted_by() is doing > the right thing here. Isn't the problem that mc2p2 shouldn't be > accepting a (2, null) tuple at all? It doesn't. But, for a query, it does contain (2, <unknown>) tuples, where <unknown> would always be non-null. So, it should be scanned in the plan for the query that has only a = 2 as restriction and no restriction on b. That seems to work. \d+ mc2p2 ... Partition of: mc2p FOR VALUES FROM (1, 1) TO (MAXVALUE, MAXVALUE) Partition constraint: ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 1) OR ((a = 1) AND (b >= 1)))) insert into mc2p2 values (2, null); ERROR: new row for relation "mc2p2" violates partition constraint DETAIL: Failing row contains (2, null). explain select * from mc2p where a = 2; QUERY PLAN -------------------------------------------------------------Append (cost=0.00..38.25 rows=11 width=8) -> Seq Scan onmc2p2 (cost=0.00..38.25 rows=11 width=8) Filter: (a = 2) (3 rows) My complaint is about m2p_default not being included in the plan for a query with only a = 2 restriction. The aforementioned <unknown> includes null and it's only m2p_default that has such tuples, so it should be in the plan for queries that don't explicitly prevent null values for b by including b is not null in the query. With the patch: explain (costs off) select * from mc2p where a = 2; QUERY PLAN -------------------------------------------------------------------Append -> Seq Scan on mc2p_default Filter: ((bIS NULL) AND (a = 2)) (3 rows) explain (costs off) select * from mc2p where a = 2 and b is null; QUERY PLAN -------------------------------------------------------------------Append -> Seq Scan on mc2p_default Filter: ((bIS NULL) AND (a = 2)) (3 rows) explain (costs off) select * from mc2p where a = 2 and b is not null; QUERY PLAN -------------------------------------------------------------Append -> Seq Scan on mc2p2 Filter: ((b IS NOT NULL)AND (a = 2)) (3 rows) Thanks, Amit
On Wed, Nov 22, 2017 at 4:21 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>> If all predicate_refuted_by() receives is the expression tree (AND/OR) >>> with individual nodes being strict clauses involving partition keys (and >>> nothing about the nullness of the keys), the downstream code is just >>> playing by the rules as explained in the header comment of >>> predicate_refuted_by_recurse() in concluding that query's restriction >>> clause a = 2 refutes it. >> >> Oh, wait a minute. Actually, I think predicate_refuted_by() is doing >> the right thing here. Isn't the problem that mc2p2 shouldn't be >> accepting a (2, null) tuple at all? > > It doesn't. But, for a query, it does contain (2, <unknown>) tuples, > where <unknown> would always be non-null. So, it should be scanned in the > plan for the query that has only a = 2 as restriction and no restriction > on b. That seems to work. OK, so I am still confused about whether the constraint is wrong or the constraint exclusion logic is wrong. One of them, at least, has to be wrong, and we have to fix whichever one is wrong. Fixing broken constraint exclusion logic by hacking up the constraint, or conversely fixing a broken constraint by hacking up the constraint exclusion logic, wouldn't be right. I think my last email was confused: I thought that the (2, null) tuple was ending up in mc2p2, but it's really ending up in mc2p_default, whose constraint currently looks like this: NOT ( ((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1))) ) Now where exactly is constraint exclusion going wrong here? a = 2 refutes a < 1 and a = 1, which means that (a < 1) OR ((a = 1) AND (b < 1)) must be false and that (a = 1) AND (b >= 1) must also be false. But (a > 1) could be either true or null, which means (a > 1) OR ((a = 1) AND (b >= 1)) can be true or null, which means the whole thing can be false or null, which means that it is not refuted by a = 2. It should be possible to dig down in there step by step and figure out where the wheels are coming off -- have you tried to do that? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
At Fri, 24 Nov 2017 10:49:07 -0500, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoarK4aCcSjYheH7QDchb7uJRpiKkGpPo7O9kMBNf13N3w@mail.gmail.com> > On Wed, Nov 22, 2017 at 4:21 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > >>> If all predicate_refuted_by() receives is the expression tree (AND/OR) > >>> with individual nodes being strict clauses involving partition keys (and > >>> nothing about the nullness of the keys), the downstream code is just > >>> playing by the rules as explained in the header comment of > >>> predicate_refuted_by_recurse() in concluding that query's restriction > >>> clause a = 2 refutes it. > >> > >> Oh, wait a minute. Actually, I think predicate_refuted_by() is doing > >> the right thing here. Isn't the problem that mc2p2 shouldn't be > >> accepting a (2, null) tuple at all? > > > > It doesn't. But, for a query, it does contain (2, <unknown>) tuples, > > where <unknown> would always be non-null. So, it should be scanned in the > > plan for the query that has only a = 2 as restriction and no restriction > > on b. That seems to work. > > OK, so I am still confused about whether the constraint is wrong or > the constraint exclusion logic is wrong. One of them, at least, has > to be wrong, and we have to fix whichever one is wrong. Fixing broken > constraint exclusion logic by hacking up the constraint, or conversely > fixing a broken constraint by hacking up the constraint exclusion > logic, wouldn't be right. > > I think my last email was confused: I thought that the (2, null) tuple > was ending up in mc2p2, but it's really ending up in mc2p_default, > whose constraint currently looks like this: > > NOT ( > ((a < 1) OR ((a = 1) AND (b < 1))) > OR > ((a > 1) OR ((a = 1) AND (b >= 1))) > ) > > Now where exactly is constraint exclusion going wrong here? a = 2 > refutes a < 1 and a = 1, which means that (a < 1) OR ((a = 1) AND (b < > 1)) must be false and that (a = 1) AND (b >= 1) must also be false. > But (a > 1) could be either true or null, which means (a > 1) OR ((a = a > 1 is true when a = 2, so the second term is true? > 1) AND (b >= 1)) can be true or null, which means the whole thing can > be false or null, which means that it is not refuted by a = 2. It Then the whole thing is false. > should be possible to dig down in there step by step and figure out > where the wheels are coming off -- have you tried to do that? | select NOT ( | ((a < 1) OR ((a = 1) AND (b < 1))) | OR | ((a > 1) OR ((a = 1) AND (b >= 1))) | ) | from (values (2::int, null::int)) as t(a, b); | ?column? | ---------- | f The problem here I think is that get_qual_for_range() for default partition returns an inconsistent qual with what partition get_partition_for_tuple chooses for keys containing nulls. It chooses default partition if any of the key values is null, without referring the constraint expression. The current behavior is apparently odd. | select pg_get_partition_constraintdef('mc2p2'::regclass); | pg_get_partition_constraintdef | ---------------------------------------------------------------------------- | ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 1) OR ((a = 1) AND (b >= 1)))) | select pg_get_partition_constraintdef('mc2p_default'::regclass); | pg_get_partition_constraintdef | | --------------------------------------------------------------------------- | (NOT (((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1))))) | insert into mc2p2 values (2); | ERROR: new row for relation "mc2p2" violates partition constraint | DETAIL: Failing row contains (2, null). This is the correct behavior. | insert into mc2p_default values (2); | ERROR: new row for relation "mc2p_default" violates partition constraint | DETAIL: Failing row contains (2, null). This is the correct behavior in terms of constraint, but incorrect in terms of partition routing. But interestingly, the following *works*, in a way contradicting to the constraint. | insert into mc2p values (2); | INSERT 0 1 | | select * from mc2p_default; | a | b | ---+--- | 2 | | (1 row) After applying the patch upthread, get_qual_for_range() returns the consistent qual and "insert into mc2p_default values (2)" is accepted correctly and everything become consistent. This is the story in my understanding. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On Mon, Nov 27, 2017 at 4:04 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > This is the story in my understanding. Thanks, that's helpful. Sorry I didn't have time yet to study this in detail myself. If we're routing tuples to a partition for which the partition constraint is evaluating to null, that's OK, but if we're routing tuples to a partition for which the partition constraint is evaluating to false, that's a bug, and the right solution is to correct either the tuple routing or the partition constraint. In this case, it looks like the tuple routing is working as expected, so that would say we ought to fix the constraint. I am out of time for today but will try to look at this some more tomorrow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Horiguchi-san, thanks for the clarifying comment. On 2017/11/27 18:04, Kyotaro HORIGUCHI wrote: > At Fri, 24 Nov 2017 10:49:07 -0500, Robert Haas wrote >> OK, so I am still confused about whether the constraint is wrong or >> the constraint exclusion logic is wrong. One of them, at least, has >> to be wrong, and we have to fix whichever one is wrong. Fixing broken >> constraint exclusion logic by hacking up the constraint, or conversely >> fixing a broken constraint by hacking up the constraint exclusion >> logic, wouldn't be right. >> >> I think my last email was confused: I thought that the (2, null) tuple >> was ending up in mc2p2, but it's really ending up in mc2p_default, >> whose constraint currently looks like this: >> >> NOT ( >> ((a < 1) OR ((a = 1) AND (b < 1))) >> OR >> ((a > 1) OR ((a = 1) AND (b >= 1))) >> ) >> >> Now where exactly is constraint exclusion going wrong here? a = 2 >> refutes a < 1 and a = 1, which means that (a < 1) OR ((a = 1) AND (b < >> 1)) must be false and that (a = 1) AND (b >= 1) must also be false. >> But (a > 1) could be either true or null, which means (a > 1) OR ((a = > > a > 1 is true when a = 2, so the second term is true? Yes. >> 1) AND (b >= 1)) can be true or null, which means the whole thing can >> be false or null, which means that it is not refuted by a = 2. It > > Then the whole thing is false. Yes, too. >> should be possible to dig down in there step by step and figure out >> where the wheels are coming off -- have you tried to do that? > > | select NOT ( > | ((a < 1) OR ((a = 1) AND (b < 1))) > | OR > | ((a > 1) OR ((a = 1) AND (b >= 1))) > | ) > | from (values (2::int, null::int)) as t(a, b); > | ?column? > | ---------- > | f > > The problem here I think is that get_qual_for_range() for default > partition returns an inconsistent qual with what partition > get_partition_for_tuple chooses for keys containing nulls. It > chooses default partition if any of the key values is null, > without referring the constraint expression. Right. > The current behavior is apparently odd. > > | select pg_get_partition_constraintdef('mc2p2'::regclass); > | pg_get_partition_constraintdef > | ---------------------------------------------------------------------------- > | ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 1) OR ((a = 1) AND (b >= 1)))) > > > | select pg_get_partition_constraintdef('mc2p_default'::regclass); > | pg_get_partition_constraintdef > | > | --------------------------------------------------------------------------- > | (NOT (((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1))))) > > > | insert into mc2p2 values (2); > | ERROR: new row for relation "mc2p2" violates partition constraint > | DETAIL: Failing row contains (2, null). > > This is the correct behavior. Yes, a non-default range partition does not accept nulls in any of the partition keys. > | insert into mc2p_default values (2); > | ERROR: new row for relation "mc2p_default" violates partition constraint > | DETAIL: Failing row contains (2, null). > > This is the correct behavior in terms of constraint, but > incorrect in terms of partition routing. > > But interestingly, the following *works*, in a way contradicting > to the constraint. > > | insert into mc2p values (2); > | INSERT 0 1 > | > | select * from mc2p_default; > | a | b > | ---+--- > | 2 | > | (1 row) That is, the default partition's constraint, as currently generated, is wrong. > After applying the patch upthread, get_qual_for_range() returns > the consistent qual and "insert into mc2p_default values (2)" is > accepted correctly and everything become consistent. Thanks for testing. Thanks, Amit
On Mon, Nov 27, 2017 at 4:01 PM, Robert Haas <robertmhaas@gmail.com> wrote: > I am out of time for today but will try to look at this some more tomorrow. Upon closer study this seems to definitely be a correct fix, so I have committed it. Apologies for my earlier confusion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/11/29 0:52, Robert Haas wrote: > On Mon, Nov 27, 2017 at 4:01 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> I am out of time for today but will try to look at this some more tomorrow. > > Upon closer study this seems to definitely be a correct fix, so I have > committed it. Apologies for my earlier confusion. Thank you. Regards, Amit