On Mon, Apr 7, 2014 at 11:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> It's not a question of whether your test case is contrived. Your test
> case can be (and likely is) extremely realistic and still not account
> for other cases when the patch regresses performance. If I understand
> correctly, and I may not because I wasn't planning to spend time on
> this patch until the next CommitFest, the patch basically uses the
> bytes available in datum1 to cache what you refer to as a normalized
> or poor man's sort key which, it is hoped, will break most ties.
> However, on any workload where it fails to break ties, you're going to
> incur the additional overheads of (1) comparing the poor-man's sort
> key, (2) memcmping the strings (based on what you just said), and then
> (3) digging the correct datum back out of the tuple. I note that
> somebody thought #3 was important enough to be worth creating datum1
> for in the first place, so I don't think it can or should be assumed
> that undoing that optimization in certain cases will turn out to be
> cheap enough not to matter.
The much earlier datum1 optimization is mostly compelling for
pass-by-value types, for reasons that prominently involve
cache/locality considerations. That's probably why this patch is so
compelling - it makes those advantages apply to pass-by-reference
types too.
> Testing the cases where your patch wins and hand-waving that the
> losses won't be that bad in other cases - without actually testing -
> is not the right methodology.
Okay. Here is a worst-case, with the pgbench script the same as my
original test-case, but with much almost maximally unsympathetic data
to sort:
[local]/postgres=# update customers set firstname =
'padding-padding-padding-padding' || firstname;
UPDATE 20000
Master:
pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 323
latency average: 309.598 ms
tps = 3.227745 (including connections establishing)
tps = 3.227784 (excluding connections establishing)
Patch:
pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 307
latency average: 325.733 ms
tps = 3.066256 (including connections establishing)
tps = 3.066313 (excluding connections establishing)
That seems like a pretty modest regression for a case where 100% of
what I've done goes to waste. If something that I'd done worked out
10% of the time, we'd still be well ahead.
--
Peter Geoghegan