Re: ANALYZE to be ignored by VACUUM

From: Gregory Stark
Subject: Re: ANALYZE to be ignored by VACUUM
Msg-id: 87zltxz5nf.fsf@oxford.xeocode.com
In response to: Re: ANALYZE to be ignored by VACUUM  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses: Re: ANALYZE to be ignored by VACUUM  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List: pgsql-hackers

"ITAGAKI Takahiro" <itagaki.takahiro@oss.ntt.co.jp> writes:

> 4. ANALYZE finishes in a short time.
>    It is ok that VACUUM takes a long time because it is not a transaction,
>    but ANALYZE should not. It requires a cleverer statistics algorithm.
>    Sampling factor 10 is not enough for pg_stats.n_distinct. We seem to
>    estimate n_distinct too low for clustered (ordered) tables.

Unfortunately no constant size sample is going to be enough for reliable
n_distinct estimates. To estimate n_distinct you really have to see a
percentage of the table, and to get good estimates that percentage has to be
fairly large.
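
To make the point above concrete, here is a minimal sketch in Python. It applies the Haas-Stokes "Duj1" formula, n*d / (n - f1 + f1*n/N), which as far as I know is the estimator ANALYZE is based on; any special-casing or clamping the real code does is left out. The table layout (one very common value plus 1000 values that each occur once) and the helper names build_table and estimate_from_fraction are assumptions made up for illustration.

```python
import random
from collections import Counter


def duj1_estimate(sample, total_rows):
    """Haas-Stokes Duj1 estimator: n*d / (n - f1 + f1*n/N)."""
    n = len(sample)
    counts = Counter(sample)
    d = len(counts)                                  # distinct values in the sample
    f1 = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
    return n * d / (n - f1 + f1 * n / total_rows)


def build_table(total_rows=100_000, rare_values=1_000):
    """Hypothetical skewed column: one common value plus `rare_values` singletons."""
    common = [0] * (total_rows - rare_values)
    rare = list(range(1, rare_values + 1))
    return common + rare


def estimate_from_fraction(table, fraction, seed=0):
    """Draw a uniform row sample covering `fraction` of the table, then estimate."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(table) * fraction))
    sample = rng.sample(table, sample_size)
    return duj1_estimate(sample, len(table))


if __name__ == "__main__":
    table = build_table()
    print("true n_distinct:", len(set(table)))
    for fraction in (0.01, 0.10, 0.50, 1.00):
        estimate = estimate_from_fraction(table, fraction)
        print(f"{fraction:>5.0%} sample -> {estimate:8.1f}")
```

On this kind of data a 1% sample sees only a handful of the rare values, so the estimate lands far below the true count, and even a 50% sample still recovers only about half of it. Only the full scan gets the exact answer here.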

There was a paper with a nice algorithm posted a while back which required
only constant memory but it depended on scanning the entire table. I think to
do n_distinct estimates we'll need some statistics which are either gathered
opportunistically whenever a seqscan happens or maintained by an index.
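
As an illustration of the constant-memory, full-scan idea, here is a sketch of a K-minimum-values estimator in Python. This is not necessarily the algorithm from the paper mentioned above (which I have not identified); it is just one well-known technique with the same shape: it must see every row, but it keeps only the k smallest hash values. The function name kmv_distinct and the default k are assumptions for the example.

```python
import hashlib
import heapq

HASH_SPACE = 2 ** 64  # hashes are truncated to 64 bits


def kmv_distinct(values, k=1024):
    """Estimate the number of distinct values in one pass, keeping k hashes."""
    heap = []        # negated hashes, so heap[0] is the largest hash kept
    kept = set()     # hashes currently in the heap, to skip duplicates
    for v in values:
        digest = hashlib.sha1(repr(v).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the largest one kept: swap them.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return float(len(heap))          # fewer than k distinct values: exact
    kth_smallest = -heap[0]
    return (k - 1) * HASH_SPACE / kth_smallest


if __name__ == "__main__":
    column = [i % 5000 for i in range(100_000)]
    print("estimate:", round(kmv_distinct(column)))
    # Physical row order does not matter, so a clustered table gives the same answer.
    print("same on sorted input:", kmv_distinct(sorted(column)) == kmv_distinct(column))
```

Because the result depends only on the set of hashes seen, it is insensitive to whether the table is clustered, and the state is small enough that it could in principle be fed from whatever seqscan happens to run.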

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

