random() generates collisions too early

From Honza Horak
Subject random() generates collisions too early
Msg-id 5261219D.3060205@redhat.com
Replies Re: random() generates collisions too early  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Re: random() generates collisions too early  (Joe Van Dyk <joe@tanga.com>)
List pgsql-bugs

Hi guys,

After playing a bit with "select random();", I found that numbers
get repeated quite early in a sequence. Originally I set a lower PID range
(echo "2048" >/proc/sys/kernel/pid_max), but it doesn't seem to affect
the results.

Here is what I observed. First, I generated a set of 1000 random
numbers without setting a seed.

   touch numbers
   for i in {1..1000} ; do
     echo "select random();"|psql|head -n 3|tail -n 1 >>numbers
   done

Then I continued generating random numbers and checked whether each new
one was already in the set:

   for i in {1..10000} ; do
     if grep `echo "select random();"|psql|head -n 3|tail -n 1` numbers ; then
       echo "SUCCESS: $i" ; break
     fi
   done

To my surprise, I'm able to find a collision very quickly, usually within
the first 1000 numbers.
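
To put "very quickly" in perspective, here is a rough sketch (not part of the
test above) of the chance of such a hit for an idealized generator. It assumes
every draw is uniform over N equally likely values and that the collected set
holds n distinct values; the function name and its parameters are made up for
illustration and say nothing about how random() is actually implemented.

```shell
# collision_probability N n m
#   N: number of equally likely values the generator can return (assumption)
#   n: number of distinct values already collected
#   m: number of fresh draws checked against the collected set
# Prints the probability that at least one of the m draws is already in the
# set, i.e. 1 - (1 - n/N)^m.
collision_probability() {
  awk -v N="$1" -v n="$2" -v m="$3" \
    'BEGIN { printf "%.6f\n", 1 - (1 - n / N) ^ m }'
}

# Example calls: a 2^32-value generator versus a 2^20-value one,
# both with a 1000-value set and 1000 fresh draws.
collision_probability 4294967296 1000 1000
collision_probability 1048576 1000 1000
```

Under these assumptions the first call gives a probability far below one
percent, while the second gives more than one half, so a hit within 1000 draws
would fit an effective range of about a million values much better than a
32-bit one.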

Originally, I used psql calls to get random() values from different
processes on purpose, but it seems Noah got similar results when
random() is called in one process:

On 10/18/2013 02:10 AM, Noah Misch wrote:
 > sudo sysctl -w kernel.pid_max=2048
 > psql -c 'create unlogged table samp(c float8)'
 > for n in `seq 1 200000`; do psql -qc 'insert into samp values (random())'; done
 >
 > The results covered only 181383 distinct values, and 68 values repeated four
 > or five times each.  We should at least consider using a higher-entropy seed.
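
The same kind of tally can be taken from a plain file of samples, such as the
"numbers" file built by the first loop. The helper below is only a sketch with
a made-up name; it assumes one value per line, formatted identically on every
line (for example, always with the same leading whitespace psql prints).

```shell
# tally_samples FILE
# Prints three numbers: total samples, distinct values, and how many of the
# distinct values occur more than once.
tally_samples() {
  sort "$1" | uniq -c | awk '
    { total += $1; distinct++; if ($1 > 1) repeated++ }
    END { printf "%d %d %d\n", total, distinct, repeated }'
}
```

Running "tally_samples numbers" after the first loop shows how many of the
1000 collected values are already duplicates of each other.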

I was told this is not treated as a security issue, since random() is
not considered a CSPRNG in any case, but as Noah said, we should
probably try to make it a bit better.

Also, I'd suggest stating explicitly in the docs that random()
shouldn't be treated as a CSPRNG, since I can imagine people blindly
believing that random() is good enough for such use cases just
because they see how many possible values a double-precision type
can hold:
http://www.postgresql.org/docs/9.3/static/functions-math.html

Regards,
Honza