On 2016-01-18 16:14:05 -0600, Kevin Grittner wrote:
> Unconvinced that we should do performance testing on a proposed
> performance patch before accepting it
I'm unconvinced that it makes sense to view this as a performance
patch, and unconvinced that you can sanely measure it. The lock prefix
is a one-byte instruction prefix, and with a memory operand lock xchg
and xchg behave exactly the same (xchg is implicitly locked), leaving
the instruction width aside. It's just a little bit less work for the
instruction decoder.
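
To illustrate the claim above, here is a minimal sketch of a byte-wide
test-and-set written both ways. It is not the actual s_lock.h code; the
function names and the slock_t typedef are made up for illustration, and
the non-x86 fallback via the GCC/clang __atomic builtin is only there so
the snippet compiles elsewhere.

```c
/* Hypothetical stand-in for a one-byte spinlock type. */
typedef unsigned char slock_t;

/* Test-and-set with an explicit lock prefix: encodes as f0 86 ... */
static inline int
tas_with_lock_prefix(volatile slock_t *lock)
{
	slock_t		res = 1;

#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__(
		"	lock			\n"
		"	xchgb	%0,%1	\n"
		: "+q"(res), "+m"(*lock)
		:
		: "memory", "cc");
#else
	/* Assumption: portable fallback, not part of the x86 discussion. */
	res = __atomic_exchange_n(lock, 1, __ATOMIC_SEQ_CST);
#endif
	return (int) res;			/* previous value: 0 means we got the lock */
}

/*
 * Same thing without the prefix: encodes as 86 ..., one byte shorter.
 * xchg with a memory operand is locked implicitly, so the semantics
 * are unchanged.
 */
static inline int
tas_without_lock_prefix(volatile slock_t *lock)
{
	slock_t		res = 1;

#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__(
		"	xchgb	%0,%1	\n"
		: "+q"(res), "+m"(*lock)
		:
		: "memory", "cc");
#else
	res = __atomic_exchange_n(lock, 1, __ATOMIC_SEQ_CST);
#endif
	return (int) res;
}
```

Both variants return the previous lock value, so the first call on a
free lock returns 0 and any later call returns 1 until the lock byte is
reset. The only difference is the one-byte f0 in the encoding.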
The point about alignment and such is that changing some code somewhere
is likely to have a bigger performance impact than the actual effect of
the removal of those few bytes. So when you benchmark, you'd just
benchmark a slightly changed code layout.
objdump -d build/postgres/dev-assert/vpath/src/backend/postgres |grep 'lock xchg'|head -n1
  4b732f:	f0 86 01             	lock xchg %al,(%rcx)
The f0 is the lock prefix. In total there are 22 of them in the postgres
codebase, when compiled with my flags/compiler.
I think it's unrealistic to benchmark slight code movements on a regular
basis, particularly using a large machine. There's just not enough time
and hardware around for that.
Now I'm equally unconvinced that it's worthwhile to do anything
here. I just don't think benchmarking plays a role either way.
>, that the changes in NUMA
> scheduling in the Linux 3.8 kernel have a major effect on how well
> our code performs at high concurrency on NUMA machines with a lot
> of memory nodes
That I believe immediately.