Re: ANALYZE sampling is too good

From: Albe Laurenz
Subject: Re: ANALYZE sampling is too good
Msg-id: A737B7A37273E048B164557ADEF4A58B17C7DCEF@ntex2010i.host.magwien.gv.at
In response to: Re: ANALYZE sampling is too good  (Greg Stark <stark@mit.edu>)
Responses: Re: ANALYZE sampling is too good
List: pgsql-hackers

Greg Stark wrote:
>> It's also applicable for the other stats; histogram buckets constructed
>> from a 5% sample are more likely to be accurate than those constructed
>> from a 0.1% sample.   Same with nullfrac.  The degree of improved
>> accuracy, would, of course, require some math to determine.
> 
> This "some math" is straightforward basic statistics.  The 95th
> percentile confidence interval for a sample consisting of 300 samples
> from a population of 1 million would be 5.66%. A sample consisting
> of 1000 samples would have a 95th percentile confidence interval of
> +/- 3.1%.

Doesn't all that assume a normally distributed random variable?

I don't think it can be applied to database table contents
without further analysis.
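
For reference, the quoted figures appear to come from the usual
normal-approximation margin of error for a sampled proportion. The
sketch below reproduces that arithmetic under stated assumptions: the
worst-case proportion p = 0.5, z = 1.96 for 95% confidence, and an
optional finite population correction. The function name and
parameters are illustrative, not anything from PostgreSQL. It only
shows where the numbers come from; it does not show that the
approximation is valid for the contents of a database table.

```python
import math


def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion estimated from a simple random sample of size n.

    p = 0.5 is the worst case (largest variance); z = 1.96 corresponds
    to 95% confidence. Both are assumptions of this sketch.
    """
    standard_error = math.sqrt(p * (1.0 - p) / n)
    if population is not None:
        # Finite population correction; nearly 1 when n << population.
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


if __name__ == "__main__":
    for sample_size in (300, 1000):
        moe = margin_of_error(sample_size, population=1_000_000)
        print(f"n = {sample_size}: +/- {100 * moe:.2f}%")
```

With a population of 1 million the correction factor is so close to 1
that the result depends almost entirely on the sample size.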

Yours,
Laurenz Albe
