Обсуждение: Issue for partitioning with extra check constriants

Поиск
Список
Период
Сортировка

Issue for partitioning with extra check constriants

От
Josh Berkus
Дата:
Simon, Greg, etc.,

Just barked my nose against a major performance issue with CE &
partitioning, and was wondering if anyone had poked at it.

The issue is this: when a partitioned table is evaluated by the planner
for constraint exclusion, it evaluates ALL check constraints on each
partition, regardless of whether or not they include a referenced column
in the query (and whether or not they relate to partitioning).  If some
of those check constraints are expensive (like GIS functions) then this
can add considerably (on the order of 2ms per partition) to planning time.

If this is news to anyone, I have a nice test case.

So ... how plausible is it to fix the planner so that it only evaluates
check constraints on a partition if there is a match of referenced
columns?  Are we talking "moderate", "hard" or "nearly impossible"?

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Issue for partitioning with extra check constriants

От
Tom Lane
Дата:
Josh Berkus <josh@agliodbs.com> writes:
> The issue is this: when a partitioned table is evaluated by the planner
> for constraint exclusion, it evaluates ALL check constraints on each
> partition, regardless of whether or not they include a referenced column
> in the query (and whether or not they relate to partitioning).

[ shrug ... ]  We do not promise that the current partitioning scheme
scales to the number of partitions where this is likely to be an
interesting concern.

*After* we have a real partitioning scheme, it might be worth worrying
about this sort of problem, if it's still a problem then.

> Are we talking "moderate", "hard" or "nearly impossible"?

We're talking "wasted effort on a dead-end situation".  The time that
would go into this would be much better spent on real partitioning.

            regards, tom lane

Re: Issue for partitioning with extra check constriants

От
Josh Berkus
Дата:
> [ shrug ... ]  We do not promise that the current partitioning scheme
> scales to the number of partitions where this is likely to be an
> interesting concern.

Actually, you can demonstrate pretty significant response time delays on
only 50 partitions.

> We're talking "wasted effort on a dead-end situation".  The time that
> would go into this would be much better spent on real partitioning.

That only applies if someone is working on "real partitioning".  Is anyone?

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Issue for partitioning with extra check constriants

От
Tom Lane
Дата:
Josh Berkus <josh@agliodbs.com> writes:
>> [ shrug ... ]  We do not promise that the current partitioning scheme
>> scales to the number of partitions where this is likely to be an
>> interesting concern.

> Actually, you can demonstrate pretty significant response time delays on
> only 50 partitions.

And your point is?  The design center for the current setup is maybe 5
or 10 partitions.  We didn't intend it to be used for more partitions
than you might have spindles to spread the data across.

>> We're talking "wasted effort on a dead-end situation".  The time that
>> would go into this would be much better spent on real partitioning.

> That only applies if someone is working on "real partitioning".  Is anyone?

There is discussion going on, and even if there weren't, the argument
still applies.  Time spent on this band-aid would be time taken away
from a real solution.  In case you haven't noticed, we have very finite
amounts of manpower that's competent to do planner surgery.

            regards, tom lane

Re: Issue for partitioning with extra check constriants

От
Josh Berkus
Дата:
> And your point is?  The design center for the current setup is maybe 5
> or 10 partitions.  We didn't intend it to be used for more partitions
> than you might have spindles to spread the data across.

Where did that come from?  It certainly wasn't anywhere when the feature
was introduced.  Simon intended for this version of partitioning to
scale to 100-200 partitions (and it does, provided that you dump all
other table constraints), and partitioning has nothing to do with
spindles.  I think you're getting it mixed up with tablespaces.

The main reason for partitioning is ease of maintenance (VACUUM,
dropping partitions, etc.) not any kind of I/O optimization.

I'd like to add the following statement to our docs on partitioning, in
section 5.9.4:

=====

Constraint exclusion is tested for every CHECK constraint on the
partitions, even CHECK constraints which have nothing to do with the
partitioning scheme.  This can add siginficant extra planner time,
especially if your partitions have CHECK constraints which are costly to
evaluate.  For performance, it can be a good idea to eliminate all extra
CHECK constraints on partitions or to re-implement them as triggers.

=====

>In case you haven't noticed, we have very finite
> amounts of manpower that's competent to do planner surgery.

Point.


--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Issue for partitioning with extra check constriants

От
"Joshua D. Drake"
Дата:
On Mon, 2010-10-04 at 11:34 -0700, Josh Berkus wrote:
> > And your point is?  The design center for the current setup is maybe 5
> > or 10 partitions.  We didn't intend it to be used for more partitions
> > than you might have spindles to spread the data across.
>
> Where did that come from?

Yeah that is a bit odd. I don't recall any discussion in regards to such
a weird limitation.

>  It certainly wasn't anywhere when the feature
> was introduced.  Simon intended for this version of partitioning to
> scale to 100-200 partitions (and it does, provided that you dump all
> other table constraints), and partitioning has nothing to do with
> spindles.  I think you're getting it mixed up with tablespaces.

Great! that would be an excellent addition.


>
> The main reason for partitioning is ease of maintenance (VACUUM,
> dropping partitions, etc.) not any kind of I/O optimization.

Well that is certainly "a" main reason but it is not "the" main reason.
We have lots of customers using it to manage very large amounts of data
using the constraint exclusion features (and gaining from the smaller
index sizes).


Jd

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: Issue for partitioning with extra check constriants

От
Tom Lane
Дата:
Josh Berkus <josh@agliodbs.com> writes:
>> And your point is?  The design center for the current setup is maybe 5
>> or 10 partitions.  We didn't intend it to be used for more partitions
>> than you might have spindles to spread the data across.

> Where did that come from?  It certainly wasn't anywhere when the feature
> was introduced.  Simon intended for this version of partitioning to
> scale to 100-200 partitions (and it does, provided that you dump all
> other table constraints), and partitioning has nothing to do with
> spindles.  I think you're getting it mixed up with tablespaces.

[ shrug... ]  If Simon thought that, he obviously hadn't done any
careful study of the planner's performance.  You can maybe get that far
as long as the partitions have just very simple constraints, but
anything nontrivial won't scale.  As you found out.

            regards, tom lane

Re: Issue for partitioning with extra check constriants

От
"Joshua D. Drake"
Дата:
On Mon, 2010-10-04 at 11:34 -0700, Josh Berkus wrote:
> > And your point is?  The design center for the current setup is maybe 5
> > or 10 partitions.  We didn't intend it to be used for more partitions
> > than you might have spindles to spread the data across.
>
> Where did that come from?

Yeah that is a bit odd. I don't recall any discussion in regards to such
a weird limitation.

>  It certainly wasn't anywhere when the feature
> was introduced.  Simon intended for this version of partitioning to
> scale to 100-200 partitions (and it does, provided that you dump all
> other table constraints), and partitioning has nothing to do with
> spindles.  I think you're getting it mixed up with tablespaces.

Great! that would be an excellent addition.


>
> The main reason for partitioning is ease of maintenance (VACUUM,
> dropping partitions, etc.) not any kind of I/O optimization.

Well that is certainly "a" main reason but it is not "the" main reason.
We have lots of customers using it to manage very large amounts of data
using the constraint exclusion features (and gaining from the smaller
index sizes).


Jd

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt