Re: spinlocks on HP-UX

From: Tom Lane
Subject: Re: spinlocks on HP-UX
Msg-id: 8292.1314641721@sss.pgh.pa.us
In response to: Re: spinlocks on HP-UX  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: spinlocks on HP-UX  (Robert Haas <robertmhaas@gmail.com>)
           Re: spinlocks on HP-UX  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

I wrote:
> I am also currently running tests on x86_64 and PPC using Red Hat test
> machines --- expect results later today.

OK, I ran some more tests.  These are not directly comparable to my
previous results with IA64, because (a) I used RHEL6.2 and gcc 4.4.6;
(b) I used half as many pgbench threads as backends, rather than one
thread per eight backends.  Testing showed that pgbench cannot saturate
more than two backends per thread in this test environment, as shown
for example by this series:

pgbench -c 8 -j 1 -S -T 300 bench    tps = 22091.461409 (including ...
pgbench -c 8 -j 2 -S -T 300 bench    tps = 42587.661755 (including ...
pgbench -c 8 -j 4 -S -T 300 bench    tps = 77515.057885 (including ...
pgbench -c 8 -j 8 -S -T 300 bench    tps = 75830.463821 (including ...

I find this entirely astonishing, BTW; the backend is surely doing far
more than twice as much work per query as pgbench.  We need to look into
why pgbench is apparently still such a dog.  However, that's not
tremendously relevant to the question of whether we need an unlocked
test in spinlocks.
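
The saturation claim can be checked directly from the four -c 8 runs quoted
above.  The following is a minimal C sketch, not part of pgbench: the struct,
the function names and the 5% tolerance are all assumptions chosen for
illustration.  It divides tps by the -j value and finds the smallest -j that
gets within the tolerance of the best run.

```c
#include <stddef.h>

/* One pgbench run: the -j value and the tps it reported. */
struct run { int threads; double tps; };

/* The four "-c 8" runs quoted in the mail. */
const struct run c8_series[] = {
    {1, 22091.461409},
    {2, 42587.661755},
    {4, 77515.057885},
    {8, 75830.463821},
};

/* Throughput carried by each pgbench thread in a run. */
double tps_per_thread(struct run r)
{
    return r.tps / r.threads;
}

/*
 * Smallest -j whose tps is within `tol` (a fraction, e.g. 0.05) of the
 * best tps in the series.  Returns 0 if the series is empty.
 * The tolerance is an arbitrary illustrative choice, not a pgbench notion.
 */
int saturating_threads(const struct run *runs, size_t n, double tol)
{
    double best = 0.0;
    int result = 0;

    for (size_t i = 0; i < n; i++)
        if (runs[i].tps > best)
            best = runs[i].tps;

    for (size_t i = 0; i < n; i++)
        if (runs[i].tps >= best * (1.0 - tol) &&
            (result == 0 || runs[i].threads < result))
            result = runs[i].threads;

    return result;
}
```

With these numbers the first three runs each deliver roughly 19000 to 22000
tps per pgbench thread, and going from -j 4 to -j 8 buys nothing, which is
where the figure of two backends per thread comes from.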


These tests were run on a 32-CPU Opteron machine (Sun Fire X4600 M2,
8 quad-core sockets).  Test conditions the same as my IA64 set, except
for the OS and the -j switches:

Stock git head:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 9515.435401 (including ...
pgbench -c 2 -j 1 -S -T 300 bench    tps = 20239.289880 (including ...
pgbench -c 8 -j 4 -S -T 300 bench    tps = 78628.371372 (including ...
pgbench -c 16 -j 8 -S -T 300 bench    tps = 143065.596555 (including ...
pgbench -c 32 -j 16 -S -T 300 bench    tps = 227349.424654 (including ...
pgbench -c 64 -j 32 -S -T 300 bench    tps = 269016.946095 (including ...
pgbench -c 96 -j 48 -S -T 300 bench    tps = 253884.095190 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 269235.253012 (including ...

Non-locked test in TAS():

pgbench -c 1 -j 1 -S -T 300 bench    tps = 9316.195621 (including ...
pgbench -c 2 -j 1 -S -T 300 bench    tps = 19852.444846 (including ...
pgbench -c 8 -j 4 -S -T 300 bench    tps = 77701.546927 (including ...
pgbench -c 16 -j 8 -S -T 300 bench    tps = 138926.775553 (including ...
pgbench -c 32 -j 16 -S -T 300 bench    tps = 188485.669320 (including ...
pgbench -c 64 -j 32 -S -T 300 bench    tps = 253602.490286 (including ...
pgbench -c 96 -j 48 -S -T 300 bench    tps = 251181.310600 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 260812.933702 (including ...

Non-locked test in TAS_SPIN() only:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 9283.944739 (including ...
pgbench -c 2 -j 1 -S -T 300 bench    tps = 20213.208443 (including ...
pgbench -c 8 -j 4 -S -T 300 bench    tps = 78824.247744 (including ...
pgbench -c 16 -j 8 -S -T 300 bench    tps = 141027.072774 (including ...
pgbench -c 32 -j 16 -S -T 300 bench    tps = 201658.416366 (including ...
pgbench -c 64 -j 32 -S -T 300 bench    tps = 271035.843105 (including ...
pgbench -c 96 -j 48 -S -T 300 bench    tps = 261337.324585 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 271272.921058 (including ...

So basically there is no benefit to the unlocked test on this hardware.
But it doesn't cost much either, which is odd because the last time we
did this type of testing, adding an unlocked test was a "huge loss" on
Opteron.  Apparently AMD improved their handling of the case, and/or
the other changes we've made change the usage pattern completely.
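
For reference, the three variants measured above differ only in where a plain
(non-atomic-exchange) read of the lock word is placed.  The following is an
illustrative C11 sketch, not the actual s_lock.h code: the type and function
names are hypothetical, and the real implementation uses per-platform
assembly plus a delay/backoff loop that is omitted here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef atomic_int demo_lock_t;

/* "Stock" behaviour: always issue the locked exchange.
 * Returns 0 if the lock was acquired, nonzero if it was already held. */
int tas_plain(demo_lock_t *lock)
{
    return atomic_exchange_explicit(lock, 1, memory_order_acquire);
}

/* Unlocked test first: read the word with an ordinary load and only
 * attempt the locked exchange if the lock looks free. */
int tas_unlocked_test(demo_lock_t *lock)
{
    if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
        return 1;               /* looks busy, skip the exchange */
    return atomic_exchange_explicit(lock, 1, memory_order_acquire);
}

/* Release the lock; the assert is only a debugging aid. */
void demo_unlock(demo_lock_t *lock)
{
    assert(atomic_load_explicit(lock, memory_order_relaxed) == 1);
    atomic_store_explicit(lock, 0, memory_order_release);
}

/* "Unlocked test in TAS()": every attempt, including the first,
 * starts with the plain read. */
void acquire_test_everywhere(demo_lock_t *lock)
{
    while (tas_unlocked_test(lock) != 0)
        ;                       /* real code would back off here */
}

/* "Unlocked test in TAS_SPIN() only": the first attempt is a plain
 * exchange; only the retries after a failure use the plain read. */
void acquire_test_in_spin_only(demo_lock_t *lock)
{
    if (tas_plain(lock) == 0)
        return;
    while (tas_unlocked_test(lock) != 0)
        ;                       /* real code would back off here */
}
```

The usual argument for the plain read is that waiters spin on a shared copy
of the cache line instead of repeatedly pulling it exclusive; the cost is an
extra load on every uncontended acquisition, which the second variant avoids.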

I am hoping to do a similar test on another machine with $bignum Xeon
processors, to see if Intel hardware reacts any differently.  But that
machine is in the Westford office which is currently without power,
so it will have to wait a few days.  (I also can no longer get at either
of the machines cited in this mail, so if you want to see
more test cases it'll have to wait.)


These tests were run on a 32-processor PPC64 machine (IBM 8406-71Y,
POWER7 architecture; I think it might be 16 cores with hyperthreading,
not sure).  The machine has "only" 6GB of RAM so I set shared_buffers to
4GB, other test conditions the same:

Stock git head:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8746.076443 (including ...
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12297.297308 (including ...
pgbench -c 8 -j 4 -S -T 300 bench    tps = 48697.392492 (including ...
pgbench -c 16 -j 8 -S -T 300 bench    tps = 94133.227472 (including ...
pgbench -c 32 -j 16 -S -T 300 bench    tps = 126822.857978 (including ...
pgbench -c 64 -j 32 -S -T 300 bench    tps = 129364.417801 (including ...
pgbench -c 96 -j 48 -S -T 300 bench    tps = 125728.697772 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 131566.394880 (including ...

Non-locked test in TAS():

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8810.484890 (including ...
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12336.612804 (including ...
pgbench -c 8 -j 4 -S -T 300 bench    tps = 49023.435650 (including ...
pgbench -c 16 -j 8 -S -T 300 bench    tps = 96306.706556 (including ...
pgbench -c 32 -j 16 -S -T 300 bench    tps = 131731.475778 (including ...
pgbench -c 64 -j 32 -S -T 300 bench    tps = 133451.416612 (including ...
pgbench -c 96 -j 48 -S -T 300 bench    tps = 110076.269474 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 111339.797242 (including ...

Non-locked test in TAS_SPIN() only:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8726.269726 (including ...
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12228.415466 (including ...
pgbench -c 8 -j 4 -S -T 300 bench    tps = 48227.623829 (including ...
pgbench -c 16 -j 8 -S -T 300 bench    tps = 93302.510254 (including ...
pgbench -c 32 -j 16 -S -T 300 bench    tps = 130661.097475 (including ...
pgbench -c 64 -j 32 -S -T 300 bench    tps = 133009.181697 (including ...
pgbench -c 96 -j 48 -S -T 300 bench    tps = 128710.757986 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 133063.460934 (including ...

So basically no value to an unlocked test on this platform either.
        regards, tom lane

