Re: Odd statistics behaviour in 7.2

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Odd statistics behaviour in 7.2
Дата
Msg-id 29027.1013617315@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Odd statistics behaviour in 7.2  (Gordon Runkle <gar@integrated-dynamics.com>)
Ответы Re: Odd statistics behaviour in 7.2  ("Gordon A. Runkle" <gar@integrated-dynamics.com>)
Список pgsql-hackers
Gordon Runkle <gar@integrated-dynamics.com> writes:
> I have a table with 1,066,673 rows.  The column I'm interested in has
> this distribution of values:

>  tdnr_ct |   ct   
> ---------+--------
>       16 |      1
>        4 |      1
>        3 |     58
>        2 |  68904
>        1 | 928171

> This means that 'ct' records have 'tdnr_ct' duplicate values.

I'm confused.  You mean that there is one value that appears 16 times,
one that appears 4 times, etc etc, and 928171 values that appear only
once?

> Under v7.2, it only sometimes does.  I've looked at the statistics,
> thanks to what I learned from Tom and Marc's discussion, and I see that
> sometimes when I VACUUM ANALYZE the table, 'n_distinct' for this column
> gets a value of '-1' (desireable), and other times a value such as 59483
> or something.

This seems quite bizarre; given those stats it's hard to see how you
could get anything but -1 or close to it, even with a very unlucky
statistical sampling.  Don't suppose you'd want to trace through the
ANALYZE code and find out why it's computing a bad value?

Alternatively, if you could send me a dump of just the ct column,
I could try to reproduce the behavior here.  (CREATE TABLE foo AS
SELECT ct FROM yourtab and then pg_dump -t foo should do it.)
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: pg_stats explained ... ?
Следующее
От: "Ross J. Reedstrom"
Дата:
Сообщение: Re: benchmarking postgres