Re: pgbench randomness initialization
From | Fabien COELHO |
---|---|
Subject | Re: pgbench randomness initialization |
Date | |
Msg-id | alpine.DEB.2.10.1604071147420.11001@sto |
In reply to | pgbench randomness initialization (Andres Freund <andres@anarazel.de>) |
Responses |
Re: pgbench randomness initialization
Re: pgbench randomness initialization |
List | pgsql-hackers |
Hello Andres,

> et al I was wondering why it's a good idea for pgbench to do
>     INSTR_TIME_SET_CURRENT(start_time);
>     srandom((unsigned int) INSTR_TIME_GET_MICROSEC(start_time));
> to initialize randomness and then
>     for (i = 0; i < nthreads; i++)
>         thread->random_state[0] = random();
>         thread->random_state[1] = random();
>         thread->random_state[2] = random();
> to initialize the individual thread random state which is then used by
> pg_erand48().
>
> To me it seems better to instead initialize srandom() with a known value
> (say, uh, 0). Or even better don't use random() at all, and fill a
> global pg_erand48() with a known state; and use pg_erand48() to
> initialize the thread states.
>
> Obviously that doesn't make pgbench entirely reproducible, but it seems
> a lot better than now. Individual threads would do work in a
> reproducible order.
>
> I see very little reason to have the current behaviour, or at the very
> least not by default.

I think that it depends on what you want, which may vary:

 (1) "exactly" reproducible runs, but one run may hit a particular
     steady state which is not representative of what happens in general.

 (2) runs which really vary from one to the next, so as to have an idea
     about how much the results may vary, i.e. how stable the performance is.

Currently pgbench focusses on (2), which may or may not be fine depending
on what you are doing. From a personal point of view I think that (2) is
more significant for collecting performance data, even if the results are
more unstable: that simply reflects reality and its intrinsic variations,
so I'm fine with that as the default.

Now for those interested in (1) for some reason, I would suggest relying
on a PGBENCH_RANDOM_SEED environment variable or a --random-seed option,
which could be used to get an oxymoronic "deterministic randomness", if
desired. I do not think that it should be the default, though.

--
Fabien.
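
To make the two options concrete, here is a rough sketch of how an optional seed could coexist with the time-based default. It is not pgbench's actual code: the names choose_seed, lcg48_next, init_thread_states and ThreadState are hypothetical, and the small 48-bit linear congruential generator (using the well-known drand48 constants) is only a stand-in for a global pg_erand48() state used to fill the per-thread states.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread state: three 16-bit words, as used by erand48(). */
typedef struct
{
    unsigned short random_state[3];
} ThreadState;

/*
 * Pick the seed. "text" would be the value of a (hypothetical)
 * PGBENCH_RANDOM_SEED variable or --random-seed option; when it is absent
 * or not a plain unsigned decimal number, use the fallback, which the
 * caller would typically derive from the current time.
 */
static uint64_t
choose_seed(const char *text, uint64_t fallback)
{
    char       *end;
    unsigned long long value;

    if (text == NULL || *text < '0' || *text > '9')
        return fallback;
    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return fallback;
    return (uint64_t) value;
}

/* One step of a 48-bit LCG with the drand48 constants (stand-in only). */
static uint64_t
lcg48_next(uint64_t *state)
{
    *state = (*state * UINT64_C(0x5DEECE66D) + UINT64_C(0xB))
        & UINT64_C(0xFFFFFFFFFFFF);
    return *state;
}

/*
 * Fill every thread state from a single global generator, so that the same
 * seed always yields the same per-thread states.
 */
static void
init_thread_states(ThreadState *threads, int nthreads, uint64_t seed)
{
    uint64_t    global = seed & UINT64_C(0xFFFFFFFFFFFF);

    for (int i = 0; i < nthreads; i++)
        for (int j = 0; j < 3; j++)
            /* keep the 16 high-order bits of the 48-bit state */
            threads[i].random_state[j] =
                (unsigned short) (lcg48_next(&global) >> 32);
}
```

A caller wanting behaviour (2) by default and (1) on request could then do something like init_thread_states(threads, nthreads, choose_seed(getenv("PGBENCH_RANDOM_SEED"), (uint64_t) time(NULL))), so that runs only become reproducible when the seed is given explicitly.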