On Thu, Apr 7, 2016 at 5:56 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> I think that it depends on what you want, which may vary:
>
> (1) "exactly" reproducible runs, but one run may hit a particular
> steady state not representative of what happens in general.
>
> (2) runs which really vary from one to the next, so as
> to have an idea of how much performance may vary, i.e.
> how stable it is.
>
> Currently pgbench focuses on (2), which may or may not be fine depending on
> what you are doing. From a personal point of view I think that (2) is more
> significant for collecting performance data, even if the results are more
> unstable: that simply reflects reality and its intrinsic variations, so I'm
> fine with that as the default.
>
> Now for those interested in (1) for some reason, I would suggest relying on a
> PGBENCH_RANDOM_SEED environment variable or --random-seed option which could
> be used to have an oxymoronic "deterministic randomness", if desired.
> I do not think that it should be the default, though.

I agree entirely. If performance is erratic, that's actually
something you want to discover during benchmarking. If different
pgbench runs (of non-trivial length) are producing substantially
different results, then that's really a problem we need to fix, not
one we should adjust pgbench to cover up.
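
To make the opt-in behavior concrete, here is a minimal sketch of what
"deterministic randomness" could look like. It is written in Python for
brevity and is not pgbench code: PGBENCH_RANDOM_SEED is the variable name
proposed above, while make_rng and its fallback to os.urandom are
assumptions made purely for illustration.

```python
import os
import random


def make_rng(environ=os.environ):
    """Return (rng, seed). Hypothetical helper, not part of pgbench."""
    raw = environ.get("PGBENCH_RANDOM_SEED")
    if raw is None:
        # Default, case (2): a fresh seed on every run, so runs really vary.
        seed = int.from_bytes(os.urandom(8), "big")
    else:
        # Opt-in, case (1): the caller pins the seed for a reproducible run.
        seed = int(raw)
    return random.Random(seed), seed


if __name__ == "__main__":
    rng, seed = make_rng()
    # Reporting the seed lets an unusual run be replayed later.
    print(f"seed={seed}")
    print([rng.randint(1, 100000) for _ in range(5)])
```

Run with the variable unset, each invocation draws a different sequence;
with it set to a fixed integer, the sequence repeats. Printing the seed in
the default case would also let someone replay an erratic run afterwards.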
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company