Re: gaussian distribution pgbench

From Fabien COELHO
Subject Re: gaussian distribution pgbench
Date
Msg-id alpine.DEB.2.10.1403150738110.13791@sto
In response to Re: gaussian distribution pgbench  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: gaussian distribution pgbench  (Mitsumasa KONDO <kondo.mitsumasa@gmail.com>)
Re: gaussian distribution pgbench  (KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp>)
Re: gaussian distribution pgbench  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Hello Heikki,

> A couple of comments:
>
> * There should be an explicit "\setrandom ... uniform" option too, even 
> though you get that implicitly if you don't specify the distribution

Indeed. I agree. I suggested it, but it got lost.

> * What exactly does the "threshold" mean? The docs informally explain that 
> "the larger the threshold, the more frequent values close to the middle of the 
> interval are drawn", but that's pretty vague.

The explanations and computations are given as comments in the code. As 
for the documentation, I'm not sure that a very precise mathematical 
definition would help many people; it might rather hinder understanding, 
so the doc focuses on an intuitive explanation instead.

> * Does min and max really make sense for gaussian and exponential 
> distributions? For gaussian, I would expect mean and standard deviation as 
> the parameters, not min/max/threshold.

Yes... and no:-) The aim is to draw an integer primary key from a table, 
so it must be in a specified range. This is approximated by drawing a 
double value with the expected distribution (gaussian or exponential) and 
projecting it carefully onto the integers. If the value is out of range, 
the code loops and draws another one. The minimal threshold constraint 
(2.0) ensures that the probability of looping is low.
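
To make the draw-and-project idea concrete, here is a minimal Python sketch 
of the gaussian case. It is not the code from the patch: the function names 
(gaussian_int, retry_probability), the use of the Box-Muller transform, and 
the exact projection formula are my assumptions for illustration. It also 
computes the chance that one draw is rejected, which is about 4.6% at the 
minimal threshold of 2.0.

```python
import math
import random


def retry_probability(threshold):
    """Probability that one standard-normal draw lands outside
    [-threshold, threshold) and therefore has to be redrawn."""
    return math.erfc(threshold / math.sqrt(2.0))


def gaussian_int(rng, lo, hi, threshold):
    """Draw an integer in [lo, hi], concentrated around the middle.

    Hypothetical helper: a standard-normal value is drawn, rejected
    while it falls outside [-threshold, threshold), then rescaled to
    [0, 1) and projected onto the integer range.
    """
    if threshold < 2.0:
        raise ValueError("threshold must be at least 2.0")
    while True:
        u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is defined
        u2 = rng.random()
        # Box-Muller transform: z follows a standard normal distribution.
        z = math.sqrt(-2.0 * math.log(u1)) * math.sin(2.0 * math.pi * u2)
        if -threshold <= z < threshold:
            break  # in range, keep this draw
    frac = (z + threshold) / (2.0 * threshold)
    # Clamp to hi in case floating-point rounding pushes frac up to 1.0.
    return min(hi, lo + int((hi - lo + 1) * frac))


if __name__ == "__main__":
    rng = random.Random()
    print([gaussian_int(rng, 1, 100, 5.0) for _ in range(10)])
    print(retry_probability(2.0))
```

In this sketch a larger threshold stretches the accepted part of the bell 
curve over the same integer range, so the keys cluster more tightly around 
the middle, at the price of nothing but a lower rejection rate.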

> * How about setting the variable as a float instead of integer? Would seem 
> more natural to me. At least as an option.

Which variable? The values set by setrandom are mostly used for primary 
keys. We really want integers in a range.

-- 
Fabien.


