Re: Add min and max execute statement time in pg_stat_statement

From David G Johnston
Subject Re: Add min and max execute statement time in pg_stat_statement
Msg-id 1421796737983-5834805.post@n5.nabble.com
In response to Re: Add min and max execute statement time in pg_stat_statement  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Add min and max execute statement time in pg_stat_statement  (Andrew Dunstan <andrew@dunslane.net>)
Re: Add min and max execute statement time in pg_stat_statement  (Arne Scheffer <arne.scheffer@uni-muenster.de>)
Re: Add min and max execute statement time in pg_stat_statement  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Andrew Dunstan wrote
> On 01/20/2015 01:26 PM, Arne Scheffer wrote:
>>
>> And a very minor aspect:
>> The term "standard deviation" in your code stands for
>> (corrected) sample standard deviation, I think,
>> because you divide by n-1 instead of n to keep the
>> estimator unbiased.
>> How about mentioning the prefix "sample"
>> to indicate this being the estimator?
> 
> 
> I don't understand. I'm following pretty exactly the calculations stated 
> at <http://www.johndcook.com/blog/standard_deviation/>
> 
> 
> I'm not a statistician. Perhaps others who are more literate in 
> statistics can comment on this paragraph.

I'm largely in the same boat as Andrew but...

I take it that Arne is referring to:

http://en.wikipedia.org/wiki/Bessel's_correction

but the mere presence of an (n-1) divisor does not mean that is what is
happening.  In this particular situation I believe the (n-1) simply is a
necessary part of the recurrence formula and not any attempt to correct for
sampling bias when estimating a population's variance.  In fact, as far as
the database knows, the values provided to this function do represent an
entire population and such a correction would be unnecessary.  I guess it
boils down to whether "future" queries are considered part of the population
or whether the population changes upon each query being run and thus we are
calculating the ever-changing population variance.  Note point 3 in the
linked Wikipedia article.
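
To make the distinction concrete, here is a minimal Python sketch of the
running mean/variance recurrence described on the johndcook.com page quoted
above.  It is not the pg_stat_statements code; the class and method names
are hypothetical and the timing values are made up for illustration.  In
this sketch the update step divides only by the running count, and the
choice between a divisor of n (population) and n-1 (sample, i.e. Bessel's
correction) is made when the variance is read out.

```python
import math


class RunningStats:
    """Running mean and variance via a one-pass recurrence (hypothetical helper)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean M_k
        self.s = 0.0      # running sum of squared deviations S_k

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        # M_k = M_{k-1} + (x_k - M_{k-1}) / k
        self.mean += delta / self.n
        # S_k = S_{k-1} + (x_k - M_{k-1}) * (x_k - M_k)
        self.s += delta * (x - self.mean)

    def population_variance(self):
        # Divide by n: treats the values seen so far as the whole population.
        return self.s / self.n if self.n > 0 else 0.0

    def sample_variance(self):
        # Divide by n - 1: Bessel's correction, for estimating a larger population.
        return self.s / (self.n - 1) if self.n > 1 else 0.0


# Made-up execution times, purely for illustration.
stats = RunningStats()
for t in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.push(t)

print("mean:", stats.mean)
print("population stddev:", math.sqrt(stats.population_variance()))
print("sample stddev:", math.sqrt(stats.sample_variance()))
```

Whichever divisor is used, the accumulated state (n, mean, s) is the same,
so the question above is really about which read-out to report and how to
label it.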

David J.



--
View this message in context:
http://postgresql.nabble.com/Add-min-and-max-execute-statement-time-in-pg-stat-statement-tp5774989p5834805.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


