Обсуждение: [HACKERS] Constraint exclusion for partitioned tables

Поиск
Список
Период
Сортировка

[HACKERS] Constraint exclusion for partitioned tables

От
Ashutosh Bapat
Дата:
Hi,
In relation_excluded_by_constraints(), we do not apply constraint
exclusion if rte->inh is true.

    /* Only plain relations have constraints */
    if (rte->rtekind != RTE_RELATION || rte->inh)
        return false;

Thus every partitioned table will not benefit from the constraint
exclusion, even when constraint_exclusion = on. Hence for a
partitioned table
 \d+ t1
                                    Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats
target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           | not null |         | plain   |              |
 b      | integer |           |          |         | plain   |              |
Partition key: RANGE (a)
Check constraints:
    "t1_b_check" CHECK (b > 100)
Partitions: t1p1 FOR VALUES FROM (0) TO (100),
            t1p2 FOR VALUES FROM (100) TO (200)

while executing a query "select * from t1 where b < 100"
set_rel_size() doesn't mark t1 as dummy. It gets marked dummy only
after all the children have been deemed dummy by constraint exclusion.
This means that we will unnecessarily examine children when the parent
itself is known dummy.

I am guessing that for normal inheritance, a constraint on parent
doesn't necessarily imply the same constraint on the child (Amit
Langote gives me an example of NOT NULL constraint). But in case of
partitioned table, every constraint on the parent is applicable to the
child as well. So, we can apply constraint exclusion on partitioned
relation. Here's patch to do that.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Constraint exclusion for partitioned tables

От
Robert Haas
Дата:
On Thu, Apr 6, 2017 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I am guessing that for normal inheritance, a constraint on parent
> doesn't necessarily imply the same constraint on the child (Amit
> Langote gives me an example of NOT NULL constraint).

CHECK constraints that apply to the parent would apply to all
children, unless they are NO INHERIT, so even for regular inheritance,
it might still be possible to prove something by ignoring things that
won't necessarily cascade.

For partitioning, it may be that we've got enough restrictions in
place on what can happen that we can assume everything can cascade.
Actually, I hope that's true, since the partitioned table has no
storage of its own.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Constraint exclusion for partitioned tables

От
Jeevan Chalke
Дата:
Hi,

I had a look at these changes and here are my observations:

1. Patch applies cleanly with "git apply'.
2. make / make install / make check-world all are good.

This patch clearly improves the planning time with given conditions.

To verify that, I have created a table like:
create table foo(a int, b int check (b > 100), c text) partition by range(a);
And then used following query to get planning time:
select * from foo where b < 100;

And on my local setup, I have observed that,
For 16 partitions, planning time was 0.234692 ms, which reduced to 0.112948 ms with this patch.
For 128 partitions, planning time was 1.62305 ms, which reduced to 0.654252 ms with this patch.
For 1024 partitions, planning time was 18.720993 ms, which reduced to 9.667395 ms with this patch.

This clearly shows an improvement in planning time.

Patch looks good to me. So passing that to the committer.

Thanks


On Tue, Aug 1, 2017 at 7:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Apr 6, 2017 at 6:47 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> I am guessing that for normal inheritance, a constraint on parent
> doesn't necessarily imply the same constraint on the child (Amit
> Langote gives me an example of NOT NULL constraint).

CHECK constraints that apply to the parent would apply to all
children, unless they are NO INHERIT, so even for regular inheritance,
it might still be possible to prove something by ignoring things that
won't necessarily cascade.

For partitioning, it may be that we've got enough restrictions in
place on what can happen that we can assume everything can cascade.
Actually, I hope that's true, since the partitioned table has no
storage of its own.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

 
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: [HACKERS] Constraint exclusion for partitioned tables

От
Robert Haas
Дата:
On Tue, Sep 12, 2017 at 7:08 AM, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
> This patch clearly improves the planning time with given conditions.
>
> To verify that, I have created a table like:
> create table foo(a int, b int check (b > 100), c text) partition by
> range(a);
> And then used following query to get planning time:
> select * from foo where b < 100;
>
> And on my local setup, I have observed that,
> For 16 partitions, planning time was 0.234692 ms, which reduced to 0.112948
> ms with this patch.
> For 128 partitions, planning time was 1.62305 ms, which reduced to 0.654252
> ms with this patch.
> For 1024 partitions, planning time was 18.720993 ms, which reduced to
> 9.667395 ms with this patch.
>
> This clearly shows an improvement in planning time.

What about the extra cost of checking the parent when it doesn't help?In that case we will have some loss.

I'm inclined to think that's OK, but it's something to think about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Constraint exclusion for partitioned tables

От
Jeevan Chalke
Дата:


On Tue, Sep 12, 2017 at 8:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Sep 12, 2017 at 7:08 AM, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
> This patch clearly improves the planning time with given conditions.
>
> To verify that, I have created a table like:
> create table foo(a int, b int check (b > 100), c text) partition by
> range(a);
> And then used following query to get planning time:
> select * from foo where b < 100;
>
> And on my local setup, I have observed that,
> For 16 partitions, planning time was 0.234692 ms, which reduced to 0.112948
> ms with this patch.
> For 128 partitions, planning time was 1.62305 ms, which reduced to 0.654252
> ms with this patch.
> For 1024 partitions, planning time was 18.720993 ms, which reduced to
> 9.667395 ms with this patch.
>
> This clearly shows an improvement in planning time.

What about the extra cost of checking the parent when it doesn't help?
 In that case we will have some loss.

I'm inclined to think that's OK, but it's something to think about.

I have updated query like:
select * from foo where b > 100;
Which matches with the CHECK constraint, and here are the result on my local setup:

Time in milliseconds
Partitions | without patch | with patch
-----------|---------------|------------
2          | 0.072551      | 0.074154
4          | 0.102537      | 0.108024
8          | 0.162703      | 0.175017
16         | 0.288589      | 0.305285
128        |  2.7119       | 2.636247
1024       | 29.101347     | 29.48275

So yes, as you said, it will have slight (may be negligible) overhead.

This observation are from local setup and I have also seen a large standard deviation in the runs.

Thanks
 

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: [HACKERS] Constraint exclusion for partitioned tables

От
Ashutosh Bapat
Дата:
On Wed, Sep 13, 2017 at 12:17 PM, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
>
>
> On Tue, Sep 12, 2017 at 8:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Tue, Sep 12, 2017 at 7:08 AM, Jeevan Chalke
>> <jeevan.chalke@enterprisedb.com> wrote:
>> > This patch clearly improves the planning time with given conditions.
>> >
>> > To verify that, I have created a table like:
>> > create table foo(a int, b int check (b > 100), c text) partition by
>> > range(a);
>> > And then used following query to get planning time:
>> > select * from foo where b < 100;
>> >
>> > And on my local setup, I have observed that,
>> > For 16 partitions, planning time was 0.234692 ms, which reduced to
>> > 0.112948
>> > ms with this patch.
>> > For 128 partitions, planning time was 1.62305 ms, which reduced to
>> > 0.654252
>> > ms with this patch.
>> > For 1024 partitions, planning time was 18.720993 ms, which reduced to
>> > 9.667395 ms with this patch.
>> >
>> > This clearly shows an improvement in planning time.
>>
>> What about the extra cost of checking the parent when it doesn't help?
>>  In that case we will have some loss.
>>
>> I'm inclined to think that's OK, but it's something to think about.
>
>
> I have updated query like:

Thanks a lot Jeevan for all your experiments. They are very useful.

> select * from foo where b > 100;
> Which matches with the CHECK constraint, and here are the result on my local
> setup:

So, in this case, constraint exclusion fails since the WHERE condition
can not be refuted by the constraints.

>
> Time in milliseconds
> Partitions | without patch | with patch
> -----------|---------------|------------
> 2          | 0.072551      | 0.074154
> 4          | 0.102537      | 0.108024
> 8          | 0.162703      | 0.175017
> 16         | 0.288589      | 0.305285
> 128        |  2.7119       | 2.636247
> 1024       | 29.101347     | 29.48275
>
> So yes, as you said, it will have slight (may be negligible) overhead.
>
> This observation are from local setup and I have also seen a large standard
> deviation in the runs.

For a regular table if the constraint exclusion fails, we will waste
those many CPU cycles. But if the relation is excluded we will save
disk I/O or buffer access and time to apply the conditions on all the
rows in the relation. Given the magnitude of difference in the time to
run constraint exclusion and time for all those things, we take the
hit and run constraint exclusion always.

For a partitioned table, this patch saves the time to run constraint
exclusion on all the partitions if constraint exclusion succeeds on
the partitioned table. If constraint exclusion fails, we have wasted
CPU cycles on one run of constraint exclusion. The difference between
the time spent in the two scenarios increases with the number of
partitions. Practically, users will have a handful partitions rather
than a couple and thus running overhead of running constraint
exclusion on partitioned table would be justified given the time it
will save when CE succeeds.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Constraint exclusion for partitioned tables

От
Robert Haas
Дата:
On Fri, Dec 1, 2017 at 12:21 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Sep 13, 2017 at 4:07 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> For a partitioned table, this patch saves the time to run constraint
>> exclusion on all the partitions if constraint exclusion succeeds on
>> the partitioned table. If constraint exclusion fails, we have wasted
>> CPU cycles on one run of constraint exclusion. The difference between
>> the time spent in the two scenarios increases with the number of
>> partitions. Practically, users will have a handful partitions rather
>> than a couple and thus running overhead of running constraint
>> exclusion on partitioned table would be justified given the time it
>> will save when CE succeeds.
>
> Moved patch to next CF.

Committed after adding a comment.  Generally, code changes should be
accompanied by comment updates.

I tested this and found out that this is quite useful for cases where
multiple levels of partitioning are in use.  Consider creating 100
partitions like this:

#!/usr/bin/perl

use strict;
use warnings;

print "create table foo (a int, b int, c text) partition by list (a);\n";
for $a (1..10)
{
    print "create table foo$a partition of foo for values in ($a)
partition by list (b);\n";
    for $b (1..10)
    {
        print "create table foo${a}_$b partition of foo$a for values
in ($b);\n";
    }
}

Then consider this query: select * from foo where a = 5;

Without this patch, we have to reject 90 leaf partitions individually,
but with the patch, we can reject the intermediate partitioned tables;
each time we do, it substitutes for rejecting 10 children
individually.  This seems to me to be a case that is quite likely to
come up in the real world.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Constraint exclusion for partitioned tables

От
Ashutosh Bapat
Дата:
On Sat, Dec 2, 2017 at 1:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 1, 2017 at 12:21 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Wed, Sep 13, 2017 at 4:07 PM, Ashutosh Bapat
>> <ashutosh.bapat@enterprisedb.com> wrote:
>>> For a partitioned table, this patch saves the time to run constraint
>>> exclusion on all the partitions if constraint exclusion succeeds on
>>> the partitioned table. If constraint exclusion fails, we have wasted
>>> CPU cycles on one run of constraint exclusion. The difference between
>>> the time spent in the two scenarios increases with the number of
>>> partitions. Practically, users will have a handful partitions rather
>>> than a couple and thus running overhead of running constraint
>>> exclusion on partitioned table would be justified given the time it
>>> will save when CE succeeds.
>>
>> Moved patch to next CF.
>
> Committed after adding a comment.  Generally, code changes should be
> accompanied by comment updates.

Thanks for committing the patch. Sorry for not including the comments.
Your comment looks good.

>
> I tested this and found out that this is quite useful for cases where
> multiple levels of partitioning are in use.  Consider creating 100
> partitions like this:
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> print "create table foo (a int, b int, c text) partition by list (a);\n";
> for $a (1..10)
> {
>     print "create table foo$a partition of foo for values in ($a)
> partition by list (b);\n";
>     for $b (1..10)
>     {
>         print "create table foo${a}_$b partition of foo$a for values
> in ($b);\n";
>     }
> }
>
> Then consider this query: select * from foo where a = 5;
>
> Without this patch, we have to reject 90 leaf partitions individually,
> but with the patch, we can reject the intermediate partitioned tables;
> each time we do, it substitutes for rejecting 10 children
> individually.  This seems to me to be a case that is quite likely to
> come up in the real world.
>

Right. Thanks for the testing.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company