Re: Recommendations for set statistics

Поиск
Список
Период
Сортировка
От Christopher Browne
Тема Re: Recommendations for set statistics
Дата
Msg-id m3hdh792w7.fsf@knuth.cbbrowne.com
обсуждение исходный текст
Ответ на Recommendations for set statistics  (Sebastian Hennebrueder <usenet@laliluna.de>)
Ответы Re: Recommendations for set statistics  (Josh Berkus <josh@agliodbs.com>)
Список pgsql-performance
After a long battle with technology, usenet@laliluna.de (Sebastian Hennebrueder), an earthling, wrote:
> I could not find any recommandations for the level of set statistics
> and what a specific level does actually mean.
> What is the difference between 1, 50 and 100? What is recommanded for
> a table or column?

The numbers represent the numbers of "bins" used to establish
histograms that estimate how the data looks.

The default is to have 10 bins, and 300 items are sampled at ANALYZE
time per bin.

1 would probably be rather bad, having very little ability to express
the distribution of data.  100 bins would be 10x as expensive to
store than 10, but would provide a much distribution.

It is widely believed that a somewhat larger default than 10 would be
a "good thing," as it seems to be fairly common for 10 to be too small
to allow statistics to be stable.  But nobody has done any formal
evaluation as to whether it would make sense to jump from 10 to:

 - 15?
 - 20?
 - 50?
 - 100?
 - More than that?

If we could show that 90% of the present "wrong results" that come
from the default of 10 could be addressed by an increase to 20 bins,
and the remainder could be left to individual tuning, well, getting
rid of 90% of the "query plan errors" would seem worthwhile.

I'd hope that a moderate (e.g. - from 10 to 20) increase, which would
be pretty cheap, would help a fair bit, but there is no evidence one
way or the other.  Unfortunately, nobody has come up with a decent way
of evaluating how much good a change to the default would actually do.

If you can discover an evaluation scheme, your results are likely to
get an ear.
--
"cbbrowne","@","gmail.com"
http://linuxdatabases.info/info/lsf.html
"In 1555,  Nostradamus wrote: 'Come  the millennium, month 12,  in the
home  of greatest  power,  the village  idiot  will come  forth to  be
acclaimed the leader.'"

В списке pgsql-performance по дате отправления:

Предыдущее
От: John A Meinel
Дата:
Сообщение: Re: Recommendations for set statistics
Следующее
От: Greg Stark
Дата:
Сообщение: Re: Optimize complex join to use where condition before