Re: Maximum statistics target

Поиск
Список
Период
Сортировка
От Stephen Denne
Тема Re: Maximum statistics target
Дата
Msg-id F0238EBA67824444BC1CB4700960CB4804D2F049@dmpeints002.isotach.com
обсуждение исходный текст
Ответ на Re: Maximum statistics target  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Estimating geometric distributions  ("Stephen Denne" <Stephen.Denne@datamail.co.nz>)
Список pgsql-hackers
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > On Fri, Mar 07, 2008 at 07:25:25PM +0100, Peter Eisentraut wrote:
> >> What's the problem with setting it to ten million if I
> have ten million values
> >> in the table and I am prepared to spend the resources to
> maintain those
> >> statistics?
>
> > That it'll probably take 10 million seconds to calculate the plans
> > using it? I think Tom pointed there are a few places that are O(n^2)
> > the number entries...
>
> I'm not wedded to the number 1000 in particular --- obviously that's
> just a round number.  But it would be good to see some
> performance tests
> with larger settings before deciding that we don't need a limit.

I recently encountered a situation where I would have liked to be able to try a larger limit (amongst other ideas for
improvingmy situation): 

I have a field whose distribution of frequencies of values is roughly geometric, rather than flat.
Total rows = 36 million
relpages=504864
Distinct field values in use = 169
10 values account for 50% of the rows.
41 values account for 90% of the rows.

After setting statistics target to 1000 for that field, and analyzing the table, the statistics row for that field had
75most frequent values and a histogram with 76 entries in it. Estimating 151 values in total. 

For this situation using a larger statistics target should result in more pages being read, and a more accurate record
ofstatistics. It shouldn't result in significantly more work for the planner. 

It wouldn't solve my problem though, which is frequent over-estimation of rows when restricting by this field with
valuesnot known at plan time. 

Regards,
Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any
attachmentsis confidential and may be subject to legal privilege.  If it is not intended for you please advise by reply
immediately,destroy it and do not copy, disclose or use it in any way. 

__________________________________________________________________ This email has been scanned by the DMZGlobal
BusinessQuality              Electronic Messaging Suite. 
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Chernow
Дата:
Сообщение: Re: timestamp datatype cleanup
Следующее
От: "Dawid Kuroczko"
Дата:
Сообщение: Re: Lazy constraints / defaults