Andres Freund <andres@anarazel.de> writes:
> So there's an interesting "dip" between 4 and 8 clients. A perf profile
> doesn't show any actual lock contention on master. Not that surprising,
> there shouldn't be any exclusive locks here.

What size of machine are you testing on?

I ran Graeme's tests on a 2-socket, 4-core-per-socket, no-hyperthreading
machine, which has separate NUMA zones for the 2 sockets. What I saw
(after fixing the "stable" issue) was that all the 8-client and 16-client
cases were about 8x faster than 1-client, and 2-client was generally
within hailing distance of 2x faster, but 4-client was often noticeably
worse than the expected 4x faster.

I figured this was likely some weird NUMA effect, possibly compounded
by brutally stupid scheduling on the part of my kernel. But I didn't
have time to look closer.

You might be seeing the same kind of effect, or something different.
It's hard to tell without knowing more about your machine.
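
If it helps, something like the following would report the details that
matter here: the logical CPU count and which CPUs sit in which NUMA node.
This is only a rough sketch. It assumes a Linux box exposing the usual
sysfs layout under /sys/devices/system/node, and the function name
numa_topology is just something I made up for illustration.

```python
import os
import re


def numa_topology(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: cpulist string} as exposed by Linux sysfs.

    Returns an empty dict if the directory is absent (non-Linux, or a
    kernel that does not expose NUMA nodes there).
    """
    topology = {}
    if not os.path.isdir(sysfs_root):
        return topology
    for entry in sorted(os.listdir(sysfs_root)):
        # Only directories named node0, node1, ... describe NUMA nodes.
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue
        cpulist_path = os.path.join(sysfs_root, entry, "cpulist")
        try:
            with open(cpulist_path) as f:
                topology[int(match.group(1))] = f.read().strip()
        except OSError:
            # Skip nodes whose cpulist cannot be read.
            continue
    return topology


if __name__ == "__main__":
    print("logical CPUs:", os.cpu_count())
    for node, cpus in sorted(numa_topology().items()):
        print(f"NUMA node {node}: CPUs {cpus}")
```

The output of that, plus whether hyperthreading is enabled, would be
enough to tell whether your 4-client case straddles two sockets.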

regards, tom lane