Re: pgsql: Use a separate random seed for SQL random()/setseed()functions.

Поиск
Список
Период
Сортировка
От Fabien COELHO
Тема Re: pgsql: Use a separate random seed for SQL random()/setseed()functions.
Дата
Msg-id alpine.DEB.2.21.1812300824140.26970@lancre
обсуждение исходный текст
Ответ на pgsql: Use a separate random seed for SQL random()/setseed()functions.  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-committers
Hello Tom,

I'm sorry I'm a bit late back into this discussion, I was on the road.

> To fix this reliably, we need random() and setseed() to use their own
> private random state variable.

Ok.

> Standard random(3) isn't amenable to such usage, so let's switch to 
> pg_erand48().

Hmmm… bad idea?

> It's hard to say whether that's more or less "random" than any 
> particular platform's version of random(3),

It looks much less pseudo-random on Linux: POSIX provides 3 pseudo-random 
functions (probably 2 too many): "random", "rand" (mapped on the previous 
one by glibc), and ".rand48". According to glibc documentation, "random" 
has an internal state of 31 long integers i.e. 992 bits (checked into the 
source code, although why it can only be seeded from 32 bits fails me) 
with a "nonlinear additive feedback" PRNG, vs 48 bits of rand48 linear 
congruential generator PRNG, also seeded from 32 bits.

For me, a 48 bit state is inadequate for anything but a toy application 
that would need a few casual pseudo-random numbers. I recommand against 
using the rand48 for a backend purpose which might have any, even remote, 
security implication.

As rand48 is a LCG, it cycles on the low-order bits so that they should 
not be used (although they are for erand48). The 48 bit state looks like a 
80's design, when hardware was between 16 and 32 bit, and was still in use 
in the 90's so that it is in POSIX and Java. I can retro-explain it as 
follows: the aim was to produce reasonable 32 bit pseudo-random ints on 
slow machines while not using low-order bits, so 48 was the closest 
round-up possible. Why not go up to 64 bits was very probably because it 
would have required more expensive mults to simulate 64 bit multiply on a 
16 or 32 bit architecture. The 48-bit LCG makes it "good enough" for less 
than a cubic root of size samples, i.e. 2**16 draws. This is much too 
small on today GHz hardware.

ISTM that 64 bits would be on the too-low side as well. I'd shop for a 128 
or 256-bit state generator. I'm unsure of the best choice, though. I have 
looked at "xorshift128+" and "xoshiro256**", which have some critics 
(basically, non-cryptographic PRNG can have their state rebuilt from a few 
outputs, and there usually is a simple transformation on outputs which 
make it fails statistical tests). ISTM that "xoshiro256**" would be a 
reasonable choice, much better than "rand48". An LCG with a larger state 
(>= 128) could be admissible as well.

> but it does have a wider seed value and a longer period than are required
> by POSIX, so we can hope that this isn't a big downgrade.

I'd say that it is a significant downgrade that I wish postgres woud 
avoid, especially with the argument that it for better security!

I'd suggest again that (1) postgres should provide an 
algorithm-independent interface to its PRNG with an external state and (2) 
use an alternative to rand48, the choice of which should be discussed.

-- 
Fabien.

В списке pgsql-committers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: pgsql: Trigger stmt_beg and stmt_end for top-level statement blocksof
Следующее
От: Tom Lane
Дата:
Сообщение: pgsql: Teach eval_const_expressions to constant-fold LEAST/GREATESTexp