Обсуждение: statistics target for columns in unique constraint?

Поиск
Список
Период
Сортировка

statistics target for columns in unique constraint?

От
ach
Дата:
I have a unique constraint on two columns of a supermassive table (est. 1.7
bn rows) that are the only way the table's ever queried - and it's
blindingly fast: 51ms to retrieve any single row even non-partitioned.

Anyway:  Right now statistics on the two unique constrained columns are set
to 200 each (database-wide default is 100), and what I'm wondering is, since
the unique constraint already covers the whole table and all rows in
entirety, is it really necessary for statistics to be set that high on
those?  Or does that only serve to slow down inserts to that table?



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/statistics-target-for-columns-in-unique-constraint-tp5755256.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: statistics target for columns in unique constraint?

От
Marti Raudsepp
Дата:
On Mon, May 13, 2013 at 6:01 PM, ach <alanchines@gmail.com> wrote:
> what I'm wondering is, since
> the unique constraint already covers the whole table and all rows in
> entirety, is it really necessary for statistics to be set that high on
> those?

AFAIK if there are exact-matching unique constraints/indexes for a
query's WHERE clause, the planner will deduce that the query only
returns 1 row and won't consult statistics at all.

> Or does that only serve to slow down inserts to that table?

It doesn't slow down inserts directly. Tables are analyzed in the
background by autovacuum. However, I/O traffic from autovacuum analyze
may slow down inserts running concurrently.

Regards,
Marti


Re: statistics target for columns in unique constraint?

От
Mark Kirkwood
Дата:
On 14/05/13 10:10, Marti Raudsepp wrote:
> On Mon, May 13, 2013 at 6:01 PM, ach <alanchines@gmail.com> wrote:
>> what I'm wondering is, since
>> the unique constraint already covers the whole table and all rows in
>> entirety, is it really necessary for statistics to be set that high on
>> those?
>
> AFAIK if there are exact-matching unique constraints/indexes for a
> query's WHERE clause, the planner will deduce that the query only
> returns 1 row and won't consult statistics at all.
>
>> Or does that only serve to slow down inserts to that table?
>
> It doesn't slow down inserts directly. Tables are analyzed in the
> background by autovacuum. However, I/O traffic from autovacuum analyze
> may slow down inserts running concurrently.
>
>

A higher number in stats target means larger stats structures - which in
turn means that the planning stage of *all* queries may be impacted -
e.g takes up more memory, slightly slower as these larger structures are
read, iterated over, free'd etc.

So if your only access is via a defined unique key, then (as Marti
suggests) - a large setting for stats target would seem to be unnecessary.

If you have access to a test environment I'd recommend you model the
effect of reducing stats target down (back to the default of 100 or even
to the old version default of 10).

A little - paranoia - maybe switch on statement logging and ensure that
there are no *other* ways this table is accessed...the fact that the
number was cranked up from the default is a little suspicious!

Regards

Mark




Re: statistics target for columns in unique constraint?

От
ach
Дата:
Thanks guys!  I'm gonna try tuning the statistics back down to 10 on that
table and see what that does to the insertion rates.  Oh and for Mark:  Not
to worry - i'd actually tuned the stats there up myself awhile ago in an
experiment to see if -that- would've sped insertions some; back before i'd
had enough mileage on postgres for it to have occurred to me that might just
have been useless ;-)

One quick follow up since I'm expecting y'all might know:  Do the statistics
targets actually speed performance on an index search itself; the actual
lookup?  Or are the JUST to inform the planner towards the best pathway
decision?  In other words if I have statistics set to 1000, say, in one
case, and the planner chose the exact same path it would have if they'd just
been set to 100, would the lookup return faster when the stats were at 1000?
Or would it actually take the same time either way?  My hunch is it's the
latter...




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/statistics-target-for-columns-in-unique-constraint-tp5755256p5756093.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: statistics target for columns in unique constraint?

От
Kevin Grittner
Дата:
ach <alanchines@gmail.com> wrote:

> One quick follow up since I'm expecting y'all might know:  Do the
> statistics targets actually speed performance on an index search
> itself; the actual lookup?  Or are the JUST to inform the planner
> towards the best pathway decision?

Since the statistics are just a random sampling and generally not
completely up-to-date, they really can't be used for anything other
than *estimating* relative costs in order to try to pick the best
plan.  Once a plan is chosen, its execution time is not influenced
by the statistics.  A higher statistics target can increase
planning time.  In a complex query with many joins and many indexes
on the referenced tables, the increase in planning time can be
significant.  I have seen cases where blindly increasing the
default statistics target resulted in planning time which was
longer than run time -- without any increase in plan quality.

Generally when something is configurable, it's because there can be
benefit to adjusting it.  If there was a single setting which could
not be materially improved upon for some cases, we wouldn't expose
a configuration option.  This is something which is not only
globally adjustable, you can override the setting for individual
columns -- again, we don't go to the trouble of supporting that
without a good reason.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company