Re: [osdldbt-general] Re: [GENERAL] how to get accurate

Поиск
Список
Период
Сортировка
От Christopher Browne
Тема Re: [osdldbt-general] Re: [GENERAL] how to get accurate
Дата
Msg-id m3u17kyqe7.fsf@chvatal.cbbrowne.com
обсуждение исходный текст
Ответ на Re: [GENERAL] how to get accurate values in pg_statistic (continued)  (Bruce Momjian <pgman@candle.pha.pa.us>)
Список pgsql-performance
The world rejoiced as maryedie@osdl.org (Mary Edie Meredith) wrote:
> We tried 1000 as the default and found that the plans were good
> plans and were consistent, but the pg_statistics was not exactly the
> same.
>
> We took Tom's' advice and tried SET SEED=0 (actually select setseed
> (0) ).

When you're trying to get strict replicability of results, setting the
seed to some specific value is necessary.

Some useful results could be attained by varying the seed, and seeing
how the plans change.

> We did runs last night on our project machine which produced
> consistent pg_statistics data and (of course) the same plans.

> We will next try runs where we vary the default buckets.  Other than
> 10 and 1000, what numbers would you like us to try besides.
> Previously the number 100 was mentioned.  Are there others?

That presumably depends on what your goal is.

A useful experiment would be to see at what point (e.g. - at what
bucket size) plans tend to "settle down" to the right values.

It might well be that defaulting to 23 buckets (I'm picking that out
of thin air) would cause the plans to typically be stable whatever
seed got used.

A test for this would be to, for each bucket size value, repeatedly
ANALYZE and check query plans.

At bucket size 10, you have seen the query plans vary quite a bit.

At 1000, they seem to stabilize very well.

The geometric centre, between 10 and 1000, is 100, so it would surely
be useful to see if query plans are stable at that bucket size.

The most interesting number to know would be the lowest number of
buckets at which query plans are nearly always stable.  Supposing that
number was 23 (the number I earlier pulled out of the air), then that
can be used as evidence that the default value for SET STATISTICS
should be changed from 10 to 23.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('aa454','freenet.carleton.ca').
http://www3.sympatico.ca/cbbrowne/nonrdbms.html
Sturgeon's Law: 90% of *EVERYTHING* is crud.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: Reading data in bulk - help?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [GENERAL] how to get accurate values in pg_statistic (continued)