Re: How to tell ANALYZE to collect statistics from the whole table?

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: How to tell ANALYZE to collect statistics from the whole table?
Дата
Msg-id 54C43D93.9030608@2ndquadrant.com
обсуждение исходный текст
Ответ на How to tell ANALYZE to collect statistics from the whole table?  (AlexK987 <alex.cue.987@gmail.com>)
Ответы Re: How to tell ANALYZE to collect statistics from the whole table?
Список pgsql-performance
Hi,

On 25.1.2015 00:33, AlexK987 wrote:
> The documentation states that "The extent of analysis can be
> controlled by adjusting the default_statistics_target configuration
> variable". It looks like I can tell Postgres to create more
> histograms with more bins, and more distinct values. This implicitly
> means that Postgres will use a larger random subset to calculate
> statistics.
>
> However, this is not what I want. My data may be quite skewed, and I
> want full control over the size of the sample. I want to explicitly
> tell Postgres to analyze the whole table. How can I accomplish that?

I don't think there's an official way to do that - at least I can't
think of one. The only thing you can do is increasing statistics target
(either globally by setting default_statistics_target, or per column
using ALTER TABLE ... SET STATISTICS).

As you noticed, this however controls two things - sample size and how
detailed the statistics (MCV list / histogram) will be. The statistics
target is used as upper bound for number of MCV items / histogram bins,
and the number of sampled rows is (300 * statistics_target). With
default_statistics_target = 10000 (which si the max allowed value since
9.0), this produces very detailed stats and uses sample of ~3M rows.

It's a bit more complicated though, because there's an algorithm that
decides how many MCV items / histogram buckets to actually create, based
on the data. So you may not get more detailed stats, even when using
larger sample.

That being said, I really doubt increasing the statistics target above
10000 (or even sampling the whole table) will help you in practice.
Might be worth showing an example of a bad estimate with your data, or
maybe a test case to play with.

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-performance по дате отправления:

Предыдущее
От: AlexK987
Дата:
Сообщение: How to tell ANALYZE to collect statistics from the whole table?
Следующее
От: AlexK987
Дата:
Сообщение: Re: How to tell ANALYZE to collect statistics from the whole table?