Re: surprising results with random()

Поиск
Список
Период
Сортировка
От Steve Atkins
Тема Re: surprising results with random()
Дата
Msg-id 5C99FB52-A5C2-47DF-864E-6A2800F06F9E@blighty.com
обсуждение исходный текст
Ответ на surprising results with random()  (Jessi Berkelhammer <jberkelhammer@desc.org>)
Ответы Re: surprising results with random()  (raf <raf@raf.org>)
Список pgsql-general
On Feb 23, 2009, at 2:09 PM, Jessi Berkelhammer wrote:

> Hi,
>
> I have a view in which I want to randomly assign values if certain
> conditions hold. I was getting surprising results. Here is a (very)
> simplified version of the view, which seems to indicate the problem:
>
> CREATE OR REPLACE VIEW test_view AS
> SELECT
>     CASE
>         WHEN random() < .3333 THEN '1'
>         WHEN random() < .3333 THEN '2'
>         ELSE '3'
>     END AS test_value
>
> FROM client ;
>
> It seems this should generate a random number between 0 and 1, and set
> test_value to '1' if this first generated number is less than .3333.
> Otherwise, it should generate another random number, and set
> test_value
> to '2' if this is less than .3333. And if neither of the random
> numbers
> are less than .3333, it should set test_value to '3'. It seems to me
> that there should be a relative even distribution of the 3 values.
>
>
> However when I run this, the values are always similar to what is
> below:
>
> X_test=>  select test_value, count(*) from test_view group by 1
> order by 1;
> test_value | count
> ------------+-------
> 1          | 23947
> 2          | 16061
> 3          | 32443
>
> Why are there significantly fewer 2s? I understand that random() is
> not
> truly random, and that the seed affects this value. But it still
> confuses me that, no matter how many times I run this, there are
> always
> so few 2s. If it is generating an independent random number in the
> second call to random(), then I don't know why there are more so many
> more 1s than 2s.

Nope, it's nothing to do with random(), it's that your maths is wrong.

There are 9 possible cases. In 3 of them you return 1. In 2 of them you
return 2. In the remaining 4 cases you return 3.

If you were to run this 72451 times I'd expect to see
1:  24150 = 72451 * 3/9
2: 16100 = 72451 * 2/9
3: 32200 = 72451 * 4/9

Which, unsurprisingly, is fairly close to what you get.

Cheers,
   Steve



В списке pgsql-general по дате отправления:

Предыдущее
От: raf
Дата:
Сообщение: Re: problems with win32 enterprisedb 8.3.6 ssl=on
Следующее
От: ries van Twisk
Дата:
Сообщение: Re: surprising results with random()