Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

From: David Gould
Subject: Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Msg-id: 20180302185752.46b82671@engels
In reply to: Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate. (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On Fri, 02 Mar 2018 17:17:29 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:


> But by the same token, analyze only looked at 0.0006 of the pages.  It's
> nice that for you, that's enough to get a robust estimate of the density
> everywhere; but I have a nasty feeling that that won't hold good for
> everybody.

My grasp of statistics is somewhat weak, so please inform me if I've got
this wrong, but every time I've looked into it I've found that one can get
pretty good accuracy and confidence with fairly small samples. Typically 1000
samples will serve no matter the population size if the desired margin of
error is 5%. Even with 99% confidence and a 1% margin of error it takes less
than 20,000 samples. See the table at:

http://www.research-advisors.com/tools/SampleSize.htm
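
As a rough check on the sample sizes quoted above, here is a minimal sketch of the standard sample-size formula for estimating a proportion (often called Cochran's formula), with an optional finite population correction. It is not taken from the linked table or from PostgreSQL; the function name and parameters are made up for illustration, and it assumes the worst case p = 0.5.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, population=None, p=0.5):
    """Samples needed to estimate a proportion within +/- margin.

    Hypothetical helper for illustration only.
    p = 0.5 is the worst case (largest variance), so the result is conservative.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    n0 = (z * z * p * (1.0 - p)) / (margin * margin)
    if population is not None:
        # Finite population correction: never increases the requirement.
        n0 = n0 / (1.0 + (n0 - 1.0) / population)
    return math.ceil(n0)


for conf, margin in [(0.95, 0.05), (0.99, 0.05), (0.99, 0.01)]:
    n = required_sample_size(margin, conf)
    print(f"confidence={conf:.0%} margin={margin:.0%} -> {n} samples")
```

With an unbounded population the requirement depends only on the margin and the confidence level, which is why the numbers stay under 20,000 even for 99% confidence and a 1% margin.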

Since we have by default 30000 sample pages, and since ANALYZE takes some
trouble to get a random sample, I think we really can rely on the results of
extrapolating reltuples from analyze.
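
To make the extrapolation argument concrete, here is a small simulation sketch. It is not PostgreSQL's code and does not reproduce how ANALYZE actually samples; it assumes a simple random sample of pages, a made-up table of one million pages, and an arbitrary per-page density between 0 and 100 tuples. The function name is hypothetical.

```python
import random


def estimate_reltuples(tuples_per_page, sample_pages, rng):
    """Extrapolate a total tuple count from a simple random sample of pages.

    Hypothetical helper: mean sampled density times the total page count.
    """
    relpages = len(tuples_per_page)
    k = min(sample_pages, relpages)
    sampled = rng.sample(range(relpages), k)
    density = sum(tuples_per_page[i] for i in sampled) / k
    return density * relpages


rng = random.Random(15005)  # fixed seed so the run is repeatable
relpages = 1_000_000        # assumed table size, chosen for illustration

# Assumed workload: per-page tuple counts vary widely, from 0 to 100.
pages = [rng.randint(0, 100) for _ in range(relpages)]

actual = sum(pages)
estimate = estimate_reltuples(pages, 30_000, rng)
rel_error = abs(estimate - actual) / actual
print(f"actual={actual} estimate={estimate:.0f} relative error={rel_error:.4%}")
```

Under these assumptions the sample covers only 3% of the pages, yet the extrapolated total should land well within a couple of percent of the true count. The sketch says nothing about samples that are not actually random.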

-dg

-- 
David Gould                                   daveg@sonic.net
If simplicity worked, the world would be overrun with insects.

