Fabien COELHO <coelho@cri.ensmp.fr> writes:
> [ pgbench-zipf-doc-3.patch ]

I started to look through this, and the more I looked the more unhappy
I got that we're having this discussion at all. The zipfian support
in pgbench is seriously over-engineered and under-documented. As an
example, I was flabbergasted to find out that the end-of-run summary
statistics now include this:

    /* Report zipfian cache overflow */
    for (i = 0; i < nthreads; i++)
    {
        totalCacheOverflows += threads[i].zipf_cache.overflowCount;
    }
    if (totalCacheOverflows > 0)
    {
        printf("zipfian cache array overflowed %d time(s)\n", totalCacheOverflows);
    }

What is the point of that, and if there is a point, why is it nowhere
mentioned in pgbench.sgml? What would a user do with this information,
and how would they know what to do?

I remain of the opinion that we ought to simply rip out support for
zipfian with s < 1. It's not useful for benchmarking purposes to have
a random-number function with such poor computational properties.
I think leaving it in there is just a foot-gun: we'd be a lot better
off throwing an error that tells people to use some other distribution.

Or if we do leave it in there, we for sure have to have documentation
that *actually* explains how to use it, which this patch still doesn't.
There's nothing suggesting that you'd better not use a large number of
different (n,s) combinations.
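
To make that last point concrete, here is a minimal sketch of why many distinct (n,s) pairs hurt. It is not pgbench's actual code: the names (Cache, Cell, cache_lookup, run_workload), the cache size of 4, and the LRU eviction policy are all assumptions made for illustration, and the O(n) setup cost per new pair is modeled by a simple step counter rather than computed.

```c
/* Hypothetical sketch only; none of these names come from pgbench.c. */
#define CACHE_SIZE 4            /* assumed size, kept tiny to force evictions */

typedef struct
{
    long    n;                  /* number of values in the distribution */
    double  s;                  /* zipfian parameter */
    long    last_used;          /* logical timestamp for LRU eviction */
} Cell;

typedef struct
{
    Cell    cells[CACHE_SIZE];
    int     nb_cells;           /* cells currently in use */
    long    clock;              /* incremented on every lookup */
    long    overflow_count;     /* evictions caused by a full cache */
    long    setup_steps;        /* modeled O(n) setup work on each miss */
} Cache;

/* Look up (n,s); on a miss, pay the modeled setup cost and cache the pair. */
static void
cache_lookup(Cache *cache, long n, double s)
{
    int     victim = 0;

    cache->clock++;
    for (int i = 0; i < cache->nb_cells; i++)
    {
        Cell   *c = &cache->cells[i];

        /* exact comparison is fine here: keys are stored and compared as-is */
        if (c->n == n && c->s == s)
        {
            c->last_used = cache->clock;    /* hit: no setup needed */
            return;
        }
        if (c->last_used < cache->cells[victim].last_used)
            victim = i;                     /* remember least recently used */
    }

    if (cache->nb_cells < CACHE_SIZE)
        victim = cache->nb_cells++;         /* free cell available */
    else
        cache->overflow_count++;            /* full: evict the LRU cell */

    cache->setup_steps += n;                /* assumed O(n) precomputation */
    cache->cells[victim].n = n;
    cache->cells[victim].s = s;
    cache->cells[victim].last_used = cache->clock;
}

/* Cycle through `distinct` different (n,s) pairs, `rounds` times over. */
static void
run_workload(Cache *cache, int distinct, int rounds)
{
    for (int r = 0; r < rounds; r++)
        for (int k = 0; k < distinct; k++)
            cache_lookup(cache, 1000 + k, 0.5);
}
```

Under these assumptions, cycling through five pairs for three rounds misses on every one of the 15 lookups (11 evictions, and the setup is redone each time), whereas four pairs fit in the cache and pay the setup cost only once each. That is presumably the situation the overflow counter is meant to flag, but nothing tells the user so.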

			regards, tom lane