Re: RFC: planner statistics in 7.2

From: Tom Lane
Subject: Re: RFC: planner statistics in 7.2
Date:
Msg-id: 23581.987727737@sss.pgh.pa.us
In response to: RFC: planner statistics in 7.2 (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: RFC: planner statistics in 7.2 (Philip Warner <pjw@rhyme.com.au>)
List: pgsql-hackers
Philip Warner <pjw@rhyme.com.au> writes:
> At 18:37 19/04/01 -0400, Tom Lane wrote:
>> (2) Statistics should be computed on the basis of a random sample of the
>> target table, rather than a complete scan.  According to the literature
>> I've looked at, sampling a few thousand tuples is sufficient to give good
>> statistics even for extremely large tables; so it should be possible to
>> run ANALYZE in a short amount of time regardless of the table size.

> This sounds great; can the same be done for clustering? I.e., pick a random
> sample of index nodes, look at the record pointers, and so determine how
> well clustered the table is?

My intention was to use the same tuples sampled for the data histograms
to estimate how well sorted the data is.  However, it's not immediately
clear that this will give a trustworthy estimate; I'm still studying it ...
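
To make the idea concrete, here is a minimal sketch, not the method under study in this thread. It assumes each tuple is available as a (physical position, column value) pair, draws a fixed-size sample with reservoir sampling, and scores sortedness as the correlation between physical order and value order within the sample. The names reservoir_sample and order_correlation are hypothetical, chosen only for illustration.

```python
import random


def reservoir_sample(rows, k, rng):
    """Pick k items uniformly at random from an iterable in one pass."""
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)
        else:
            # randint is inclusive on both ends, so j ranges over 0..i
            j = rng.randint(0, i)
            if j < k:
                sample[j] = row
    return sample


def order_correlation(sample):
    """Correlation between physical order and value order.

    `sample` is a list of (position, value) pairs; this layout is an
    assumption made for the sketch. Returns a value in [-1, 1]:
    1 for ascending data, -1 for descending, near 0 for random order.
    """
    sample = sorted(sample)  # arrange the sample by physical position
    n = len(sample)
    # Rank each sampled value; ties are broken by physical position.
    by_value = sorted(range(n), key=lambda i: (sample[i][1], i))
    ranks = [0] * n
    for rank, i in enumerate(by_value):
        ranks[i] = rank
    # Positions 0..n-1 and ranks 0..n-1 share the same mean and variance,
    # so the Pearson coefficient reduces to covariance / variance.
    mean = (n - 1) / 2
    cov = sum((x - mean) * (r - mean) for x, r in zip(range(n), ranks))
    var = sum((x - mean) ** 2 for x in range(n))
    return cov / var if var else 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    table = [(pos, pos) for pos in range(100_000)]  # perfectly sorted column
    print(order_correlation(reservoir_sample(table, 3000, rng)))
```

Whether a score computed from a few thousand sampled tuples tracks the sortedness of the whole table is exactly the open question above; the sketch only shows the mechanics.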

>> ALTER TABLE tab SET COLUMN col STATS COUNT n

> Sounds fine - user-selectability at the column level seems a good idea.
> Would there be any value in not making it part of a normal SQLxx statement,
> and instead adding an 'ALTER STATISTICS' command? E.g.:

>     ALTER STATISTICS FOR tab[.column] COLLECT n
>     ALTER STATISTICS FOR tab SAMPLE m

Is that more standard than the other syntax?
        regards, tom lane

