Re: pgbench - add pseudo-random permutation function
| От | Fabien COELHO | 
|---|---|
| Тема | Re: pgbench - add pseudo-random permutation function | 
| Дата | |
| Msg-id | alpine.DEB.2.21.1902142155050.20189@lancre обсуждение исходный текст | 
| Ответ на | Re: pgbench - add pseudo-random permutation function (Andres Freund <andres@anarazel.de>) | 
| Ответы | Re: pgbench - add pseudo-random permutation function | 
| Список | pgsql-hackers | 
Hello Andres, >> +# PGAC_C_BUILTIN_CLZLL > > I think this has been partially superceded by > > commit 711bab1e4d19b5c9967328315a542d93386b1ac5 > Author: Alvaro Herrera <alvherre@alvh.no-ip.org> > Date: 2019-02-13 16:10:06 -0300 Indeed, the patch needs a rebase & conflit resolution. I'll do it. Later. >> <para> >> + Function <literal>pr_perm</literal> implements a pseudo-random permutation. >> + It allows to mix the output of non uniform random functions so that >> + values drawn more often are not trivially correlated. >> + It permutes integers in [0, size) using a seed by applying rounds of >> + simple invertible functions, similarly to an encryption function, >> + although beware that it is not at all cryptographically secure. >> + Compared to <literal>hash</literal> functions discussed above, the function >> + ensures that a perfect permutation is applied: there are no collisions >> + nor holes in the output values. >> + Values outside the interval are interpreted modulo the size. >> + The function errors if size is not positive. >> + If no seed is provided, <literal>:default_seed</literal> is used. >> + For a given size and seed, the function is fully deterministic: if two >> + permutations on the same size must not be correlated, use distinct seeds >> + as outlined in the previous example about hash functions. >> + </para> > > This doesn't really explain why we want this in pgbench. Who is "we"? If someone runs non uniform tests, ie with random_exp/zipf/gauss, close values are drawn with a similar frequency, thus correlated, inducing an undeserved correlation at the page level (eg for read) and better performance that would be the case if relative frequencies were not correlated to key values. So the function allows having more realistic non uniform test, whereas currently we can only have non uniform test with very unrealistically correlated values at the key level and possibly at the page level, meaning non representative performances because of these induced bias. This is under the assumption that pgbench should allow more realistic performance test scenarii, which I believe is a desirable purpose. If someone disagree with this purpose, then they would consider both non uniform random functions and this proposed pseudo-random permutation function as useless, as probably most other additions to pgbench. -- Fabien.
В списке pgsql-hackers по дате отправления: