Обсуждение: default range partition and constraint exclusion

Поиск
Список
Период
Сортировка

default range partition and constraint exclusion

От
Amit Langote
Дата:
Hi.

While working on the patch for partition pruning for declarative
partitioned tables, I noticed that default range partition will fail to be
included in a plan in certain cases due to pruning by constraint exclusion.

Consider a multi-column range-partitioned table:

create table mc2p (a int, b int) partition by range (a, b);
create table mc2p_default partition of mc2p default;
create table mc2p0 partition of mc2p
    for values from (minvalue, minvalue) to (1, 1);
create table mc2p2 partition of mc2p
    for values from (1, 1) to (maxvalue, maxvalue);

-- add a row with null b and check that it enters the default partition
insert into mc2p values (2);
INSERT 0 1

select tableoid::regclass, * from mc2p;
   tableoid   | a | b
--------------+---+---
 mc2p_default | 2 |
(1 row)

-- but selecting like this doesn't work
select tableoid::regclass, * from mc2p where a = 2;
 tableoid | a | b
----------+---+---
(0 rows)

because:

explain (costs off) select tableoid::regclass, * from mc2p where a = 2;
              QUERY PLAN
--------------------------------------
 Result
   ->  Append
         ->  Seq Scan on mc2p2
               Filter: (a = 2)
(4 rows)


If you look at the default partition's constraint, which is as follows:

NOT (
      ((a < 1) OR ((a = 1) AND (b < 1)))
        OR
      ((a > 1) OR ((a = 1) AND (b >= 1)))
    )

you'll notice that it doesn't explicitly say that the default partition
allows rows where a is null or b is null or both are null.  Given that,
constraint exclusion will end up concluding that the default partition's
constraint is refuted by a = 2.

The attached will make the constraint to look like:

NOT (
      a IS NOT NULL
       OR
      b IS NOT NULL
      ((a < 1) OR ((a = 1) AND (b < 1)))
        OR
      ((a > 1) OR ((a = 1) AND (b >= 1)))
    )

Now since b IS NULL (which, btw, is NOT (b IS NOT NULL)) fails to be
refuted, as a whole, the whole constraint is not refuted.  So, we get the
correct result:

select tableoid::regclass, * from mc2p where a = 2;
   tableoid   | a | b
--------------+---+---
 mc2p_default | 2 |
(1 row)

explain (costs off) select tableoid::regclass, * from mc2p where a = 2;
              QUERY PLAN
--------------------------------------
 Result
   ->  Append
         ->  Seq Scan on mc2p2
               Filter: (a = 2)
         ->  Seq Scan on mc2p_default
               Filter: (a = 2)
(6 rows)


Attached patches.  Thoughts?

Thanks,
Amit

Вложения

Re: default range partition and constraint exclusion

От
Robert Haas
Дата:
On Fri, Nov 17, 2017 at 12:57 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> While working on the patch for partition pruning for declarative
> partitioned tables, I noticed that default range partition will fail to be
> included in a plan in certain cases due to pruning by constraint exclusion.
> you'll notice that it doesn't explicitly say that the default partition
> allows rows where a is null or b is null or both are null.  Given that,
> constraint exclusion will end up concluding that the default partition's
> constraint is refuted by a = 2.
>
> The attached will make the constraint to look like:

Uh, if the constraint exclusion logic we're using is drawing false
conclusions, we need to fix it so it doesn't, not change the
constraint so that the wrong logic gives the right answer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: default range partition and constraint exclusion

От
Amit Langote
Дата:
On 2017/11/18 8:28, Robert Haas wrote:
> On Fri, Nov 17, 2017 at 12:57 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> While working on the patch for partition pruning for declarative
>> partitioned tables, I noticed that default range partition will fail to be
>> included in a plan in certain cases due to pruning by constraint exclusion.
>> you'll notice that it doesn't explicitly say that the default partition
>> allows rows where a is null or b is null or both are null.  Given that,
>> constraint exclusion will end up concluding that the default partition's
>> constraint is refuted by a = 2.
>>
>> The attached will make the constraint to look like:
> 
> Uh, if the constraint exclusion logic we're using is drawing false
> conclusions, we need to fix it so it doesn't, not change the
> constraint so that the wrong logic gives the right answer.

I did actually consider it, but ended up concluding that constraint
exclusion is doing all it can using the provided list partition constraint
clauses.

If all predicate_refuted_by() receives is the expression tree (AND/OR)
with individual nodes being strict clauses involving partition keys (and
nothing about the nullness of the keys), the downstream code is just
playing by the rules as explained in the header comment of
predicate_refuted_by_recurse() in concluding that query's restriction
clause a = 2 refutes it.

You may notice in the example I gave that all rows with non-null partition
keys have a non-default range partitions to choose from.  So, the only
rows that exist in the default partition are those that contain null in
either or both partition keys.

Constraint that's currently generated for the default partition only
describes rows that contain non-null partition keys.  The default
partition in the example contains no such rows, so constraint exclusion is
right in excluding that it is excluded by a clause like a = 2.

By the way, I made a mistake when listing the constraint that the proposed
patch will produce; it's actually:

NOT (     a IS NOT NULL      AND     b IS NOT NULL      AND     (       ((a < 1) OR ((a = 1) AND (b < 1)))        OR
  ((a > 1) OR ((a = 1) AND (b >= 1)))     )   )
 

Am I missing something?

Thanks,
Amit



Re: default range partition and constraint exclusion

От
Robert Haas
Дата:
On Tue, Nov 21, 2017 at 4:36 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> The attached will make the constraint to look like:
>>
>> Uh, if the constraint exclusion logic we're using is drawing false
>> conclusions, we need to fix it so it doesn't, not change the
>> constraint so that the wrong logic gives the right answer.
>
> I did actually consider it, but ended up concluding that constraint
> exclusion is doing all it can using the provided list partition constraint
> clauses.
>
> If all predicate_refuted_by() receives is the expression tree (AND/OR)
> with individual nodes being strict clauses involving partition keys (and
> nothing about the nullness of the keys), the downstream code is just
> playing by the rules as explained in the header comment of
> predicate_refuted_by_recurse() in concluding that query's restriction
> clause a = 2 refutes it.

Oh, wait a minute.  Actually, I think predicate_refuted_by() is doing
the right thing here.  Isn't the problem that mc2p2 shouldn't be
accepting a (2, null) tuple at all?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: default range partition and constraint exclusion

От
Amit Langote
Дата:
On 2017/11/22 6:31, Robert Haas wrote:
> On Tue, Nov 21, 2017 at 4:36 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>>> The attached will make the constraint to look like:
>>>
>>> Uh, if the constraint exclusion logic we're using is drawing false
>>> conclusions, we need to fix it so it doesn't, not change the
>>> constraint so that the wrong logic gives the right answer.
>>
>> I did actually consider it, but ended up concluding that constraint
>> exclusion is doing all it can using the provided list partition constraint
>> clauses.
>>
>> If all predicate_refuted_by() receives is the expression tree (AND/OR)
>> with individual nodes being strict clauses involving partition keys (and
>> nothing about the nullness of the keys), the downstream code is just
>> playing by the rules as explained in the header comment of
>> predicate_refuted_by_recurse() in concluding that query's restriction
>> clause a = 2 refutes it.
> 
> Oh, wait a minute.  Actually, I think predicate_refuted_by() is doing
> the right thing here.  Isn't the problem that mc2p2 shouldn't be
> accepting a (2, null) tuple at all?

It doesn't.  But, for a query, it does contain (2, <unknown>) tuples,
where <unknown> would always be non-null.  So, it should be scanned in the
plan for the query that has only a = 2 as restriction and no restriction
on b.  That seems to work.

\d+ mc2p2
...
Partition of: mc2p FOR VALUES FROM (1, 1) TO (MAXVALUE, MAXVALUE)
Partition constraint: ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 1) OR
((a = 1) AND (b >= 1))))

insert into mc2p2 values (2, null);
ERROR:  new row for relation "mc2p2" violates partition constraint
DETAIL:  Failing row contains (2, null).

explain select * from mc2p where a = 2;                        QUERY PLAN
-------------------------------------------------------------Append  (cost=0.00..38.25 rows=11 width=8)  ->  Seq Scan
onmc2p2  (cost=0.00..38.25 rows=11 width=8)        Filter: (a = 2)
 
(3 rows)


My complaint is about m2p_default not being included in the plan for a
query with only a = 2 restriction.  The aforementioned <unknown> includes
null and it's only m2p_default that has such tuples, so it should be in
the plan for queries that don't explicitly prevent null values for b by
including b is not null in the query.

With the patch:

explain (costs off) select * from mc2p where a = 2;                           QUERY PLAN
-------------------------------------------------------------------Append  ->  Seq Scan on mc2p_default        Filter:
((bIS NULL) AND (a = 2))
 
(3 rows)

explain (costs off) select * from mc2p where a = 2 and b is null;                           QUERY PLAN
-------------------------------------------------------------------Append  ->  Seq Scan on mc2p_default        Filter:
((bIS NULL) AND (a = 2))
 
(3 rows)

explain (costs off) select * from mc2p where a = 2 and b is not null;                        QUERY PLAN
-------------------------------------------------------------Append  ->  Seq Scan on mc2p2        Filter: ((b IS NOT
NULL)AND (a = 2))
 
(3 rows)

Thanks,
Amit



Re: default range partition and constraint exclusion

От
Robert Haas
Дата:
On Wed, Nov 22, 2017 at 4:21 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> If all predicate_refuted_by() receives is the expression tree (AND/OR)
>>> with individual nodes being strict clauses involving partition keys (and
>>> nothing about the nullness of the keys), the downstream code is just
>>> playing by the rules as explained in the header comment of
>>> predicate_refuted_by_recurse() in concluding that query's restriction
>>> clause a = 2 refutes it.
>>
>> Oh, wait a minute.  Actually, I think predicate_refuted_by() is doing
>> the right thing here.  Isn't the problem that mc2p2 shouldn't be
>> accepting a (2, null) tuple at all?
>
> It doesn't.  But, for a query, it does contain (2, <unknown>) tuples,
> where <unknown> would always be non-null.  So, it should be scanned in the
> plan for the query that has only a = 2 as restriction and no restriction
> on b.  That seems to work.

OK, so I am still confused about whether the constraint is wrong or
the constraint exclusion logic is wrong.  One of them, at least, has
to be wrong, and we have to fix whichever one is wrong.  Fixing broken
constraint exclusion logic by hacking up the constraint, or conversely
fixing a broken constraint by hacking up the constraint exclusion
logic, wouldn't be right.

I think my last email was confused: I thought that the (2, null) tuple
was ending up in mc2p2, but it's really ending up in mc2p_default,
whose constraint currently looks like this:

NOT (     ((a < 1) OR ((a = 1) AND (b < 1)))       OR     ((a > 1) OR ((a = 1) AND (b >= 1)))   )

Now where exactly is constraint exclusion going wrong here?  a = 2
refutes a < 1 and a = 1, which means that (a < 1) OR ((a = 1) AND (b <
1)) must be false and that (a = 1) AND (b >= 1) must also be false.
But (a > 1) could be either true or null, which means (a > 1) OR ((a =
1) AND (b >= 1)) can be true or null, which means the whole thing can
be false or null, which means that it is not refuted by a = 2.  It
should be possible to dig down in there step by step and figure out
where the wheels are coming off -- have you tried to do that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: default range partition and constraint exclusion

От
Kyotaro HORIGUCHI
Дата:
At Fri, 24 Nov 2017 10:49:07 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoarK4aCcSjYheH7QDchb7uJRpiKkGpPo7O9kMBNf13N3w@mail.gmail.com>
> On Wed, Nov 22, 2017 at 4:21 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> >>> If all predicate_refuted_by() receives is the expression tree (AND/OR)
> >>> with individual nodes being strict clauses involving partition keys (and
> >>> nothing about the nullness of the keys), the downstream code is just
> >>> playing by the rules as explained in the header comment of
> >>> predicate_refuted_by_recurse() in concluding that query's restriction
> >>> clause a = 2 refutes it.
> >>
> >> Oh, wait a minute.  Actually, I think predicate_refuted_by() is doing
> >> the right thing here.  Isn't the problem that mc2p2 shouldn't be
> >> accepting a (2, null) tuple at all?
> >
> > It doesn't.  But, for a query, it does contain (2, <unknown>) tuples,
> > where <unknown> would always be non-null.  So, it should be scanned in the
> > plan for the query that has only a = 2 as restriction and no restriction
> > on b.  That seems to work.
> 
> OK, so I am still confused about whether the constraint is wrong or
> the constraint exclusion logic is wrong.  One of them, at least, has
> to be wrong, and we have to fix whichever one is wrong.  Fixing broken
> constraint exclusion logic by hacking up the constraint, or conversely
> fixing a broken constraint by hacking up the constraint exclusion
> logic, wouldn't be right.
> 
> I think my last email was confused: I thought that the (2, null) tuple
> was ending up in mc2p2, but it's really ending up in mc2p_default,
> whose constraint currently looks like this:
> 
> NOT (
>       ((a < 1) OR ((a = 1) AND (b < 1)))
>         OR
>       ((a > 1) OR ((a = 1) AND (b >= 1)))
>     )
> 
> Now where exactly is constraint exclusion going wrong here?  a = 2
> refutes a < 1 and a = 1, which means that (a < 1) OR ((a = 1) AND (b <
> 1)) must be false and that (a = 1) AND (b >= 1) must also be false.
> But (a > 1) could be either true or null, which means (a > 1) OR ((a =

a > 1 is true when a = 2, so the second term is true?

> 1) AND (b >= 1)) can be true or null, which means the whole thing can
> be false or null, which means that it is not refuted by a = 2.  It

Then the whole thing is false.

> should be possible to dig down in there step by step and figure out
> where the wheels are coming off -- have you tried to do that?

| select  NOT (
|              ((a < 1) OR ((a = 1) AND (b < 1)))
|                OR
|              ((a > 1) OR ((a = 1) AND (b >= 1)))
|             )
| from (values (2::int, null::int)) as t(a, b);
|  ?column? 
| ----------
|  f

The problem here I think is that get_qual_for_range() for default
partition returns an inconsistent qual with what partition
get_partition_for_tuple chooses for keys containing nulls. It
chooses default partition if any of the key values is null,
without referring the constraint expression.

The current behavior is apparently odd.

| select pg_get_partition_constraintdef('mc2p2'::regclass);
|                         pg_get_partition_constraintdef                     
| ----------------------------------------------------------------------------
|  ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 1) OR ((a = 1) AND (b >= 1))))


| select pg_get_partition_constraintdef('mc2p_default'::regclass);
|                           pg_get_partition_constraintdef                   
|    
| ---------------------------------------------------------------------------
|  (NOT (((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1)))))


| insert into mc2p2 values (2);
| ERROR:  new row for relation "mc2p2" violates partition constraint
| DETAIL:  Failing row contains (2, null).

This is the correct behavior.

| insert into mc2p_default values (2);
| ERROR:  new row for relation "mc2p_default" violates partition constraint
| DETAIL:  Failing row contains (2, null).

This is the correct behavior in terms of constraint, but
incorrect in terms of partition routing.

But interestingly, the following *works*, in a way contradicting
to the constraint.

| insert into mc2p values (2);
| INSERT 0 1
|
| select * from mc2p_default;
|  a | b 
| ---+---
|  2 |  
| (1 row)

After applying the patch upthread, get_qual_for_range() returns
the consistent qual and "insert into mc2p_default values (2)" is
accepted correctly and everything become consistent.

This is the story in my understanding.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: default range partition and constraint exclusion

От
Robert Haas
Дата:
On Mon, Nov 27, 2017 at 4:04 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> This is the story in my understanding.

Thanks, that's helpful.  Sorry I didn't have time yet to study this in
detail myself.  If we're routing tuples to a partition for which the
partition constraint is evaluating to null, that's OK, but if we're
routing tuples to a partition for which the partition constraint is
evaluating to false, that's a bug, and the right solution is to
correct either the tuple routing or the partition constraint.  In this
case, it looks like the tuple routing is working as expected, so that
would say we ought to fix the constraint.

I am out of time for today but will try to look at this some more tomorrow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: default range partition and constraint exclusion

От
Amit Langote
Дата:
Horiguchi-san, thanks for the clarifying comment.

On 2017/11/27 18:04, Kyotaro HORIGUCHI wrote:
> At Fri, 24 Nov 2017 10:49:07 -0500, Robert Haas wrote
>> OK, so I am still confused about whether the constraint is wrong or
>> the constraint exclusion logic is wrong.  One of them, at least, has
>> to be wrong, and we have to fix whichever one is wrong.  Fixing broken
>> constraint exclusion logic by hacking up the constraint, or conversely
>> fixing a broken constraint by hacking up the constraint exclusion
>> logic, wouldn't be right.
>>
>> I think my last email was confused: I thought that the (2, null) tuple
>> was ending up in mc2p2, but it's really ending up in mc2p_default,
>> whose constraint currently looks like this:
>>
>> NOT (
>>       ((a < 1) OR ((a = 1) AND (b < 1)))
>>         OR
>>       ((a > 1) OR ((a = 1) AND (b >= 1)))
>>     )
>>
>> Now where exactly is constraint exclusion going wrong here?  a = 2
>> refutes a < 1 and a = 1, which means that (a < 1) OR ((a = 1) AND (b <
>> 1)) must be false and that (a = 1) AND (b >= 1) must also be false.
>> But (a > 1) could be either true or null, which means (a > 1) OR ((a =
> 
> a > 1 is true when a = 2, so the second term is true?

Yes.

>> 1) AND (b >= 1)) can be true or null, which means the whole thing can
>> be false or null, which means that it is not refuted by a = 2.  It
> 
> Then the whole thing is false.

Yes, too.

>> should be possible to dig down in there step by step and figure out
>> where the wheels are coming off -- have you tried to do that?
> 
> | select  NOT (
> |              ((a < 1) OR ((a = 1) AND (b < 1)))
> |                OR
> |              ((a > 1) OR ((a = 1) AND (b >= 1)))
> |             )
> | from (values (2::int, null::int)) as t(a, b);
> |  ?column? 
> | ----------
> |  f
> 
> The problem here I think is that get_qual_for_range() for default
> partition returns an inconsistent qual with what partition
> get_partition_for_tuple chooses for keys containing nulls. It
> chooses default partition if any of the key values is null,
> without referring the constraint expression.

Right.

> The current behavior is apparently odd.
> 
> | select pg_get_partition_constraintdef('mc2p2'::regclass);
> |                         pg_get_partition_constraintdef                     
> | ----------------------------------------------------------------------------
> |  ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 1) OR ((a = 1) AND (b >= 1))))
> 
> 
> | select pg_get_partition_constraintdef('mc2p_default'::regclass);
> |                           pg_get_partition_constraintdef                   
> |    
> | ---------------------------------------------------------------------------
> |  (NOT (((a < 1) OR ((a = 1) AND (b < 1))) OR ((a > 1) OR ((a = 1) AND (b >= 1)))))
> 
> 
> | insert into mc2p2 values (2);
> | ERROR:  new row for relation "mc2p2" violates partition constraint
> | DETAIL:  Failing row contains (2, null).
> 
> This is the correct behavior.

Yes, a non-default range partition does not accept nulls in any of the
partition keys.

> | insert into mc2p_default values (2);
> | ERROR:  new row for relation "mc2p_default" violates partition constraint
> | DETAIL:  Failing row contains (2, null).
> 
> This is the correct behavior in terms of constraint, but
> incorrect in terms of partition routing.
>
> But interestingly, the following *works*, in a way contradicting
> to the constraint.
> 
> | insert into mc2p values (2);
> | INSERT 0 1
> |
> | select * from mc2p_default;
> |  a | b 
> | ---+---
> |  2 |  
> | (1 row)

That is, the default partition's constraint, as currently generated, is wrong.

> After applying the patch upthread, get_qual_for_range() returns
> the consistent qual and "insert into mc2p_default values (2)" is
> accepted correctly and everything become consistent.

Thanks for testing.

Thanks,
Amit



Re: default range partition and constraint exclusion

От
Robert Haas
Дата:
On Mon, Nov 27, 2017 at 4:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I am out of time for today but will try to look at this some more tomorrow.

Upon closer study this seems to definitely be a correct fix, so I have
committed it.  Apologies for my earlier confusion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: default range partition and constraint exclusion

От
Amit Langote
Дата:
On 2017/11/29 0:52, Robert Haas wrote:
> On Mon, Nov 27, 2017 at 4:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I am out of time for today but will try to look at this some more tomorrow.
> 
> Upon closer study this seems to definitely be a correct fix, so I have
> committed it.  Apologies for my earlier confusion.

Thank you.

Regards,
Amit