Re: Improving spin-lock implementation on ARM.

From Tom Lane
Subject Re: Improving spin-lock implementation on ARM.
Date
Msg-id 1367116.1606802480@sss.pgh.pa.us
In reply to Re: Improving spin-lock implementation on ARM.  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Improving spin-lock implementation on ARM.  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
Alexander Korotkov <aekorotkov@gmail.com> writes:
> 2) None of the patches considered in this thread give a clear
> advantage for PostgreSQL built with LSE.

Yeah, I think so.

> To further confirm this let's wait for Kunpeng 920 tests by Krunal
> Bauskar and Amit Khandekar.  Also it would be nice if someone will run
> benchmarks similar to [1] on Apple M1.

I did what I could in this department.  It's late and I'm not going to
have time to run read/write benchmarks before bed, but here are some
results for the "pgbench -S" cases.  I tried to match your testing
choices, but could not entirely:

* Configure options are --enable-debug, --disable-cassert, no other
special configure options or CFLAG choices.

* I have not been able to find a way to make Apple's compiler not
use the LSE spinlock instructions, so all of these correspond to
your LSE cases.

* I used shared_buffers = 1GB, because this machine only has 16GB
RAM so 32GB is clearly out of reach.  Also I used pgbench scale
factor 100 not 1000.  Since we're trying to measure contention
effects not I/O speed, I don't think a huge test case is appropriate.

* I still haven't gotten pgbench to work with -j settings above 128,
so these runs use -j equal to half -c.  Shouldn't really affect
conclusions about scaling.  (BTW, I see a similar limitation on
macOS Catalina x86_64, so whatever that is, it's not new.)

* Otherwise, the first plot shows median of three results from
"pgbench -S -M prepared -T 120 -c $n -j $j", as you had it.
The right-hand plot shows all three of the values in yerrorbars
format, just to give a sense of the noise level.
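
For anyone wanting to repeat this, the per-point procedure described
above can be sketched roughly as follows.  This is only an illustration,
not the script actually used: the helper names are hypothetical, the
floor of one thread at -c 1 is an assumption, and connection options
and the database name are left out, as they are in the command above.

```python
import statistics


def pgbench_command(clients, duration=120):
    """Build the argument list for one read-only pgbench run."""
    # -j is half of -c, as in the runs described above.
    # Assumption: at least one thread, so -c 1 still gets -j 1.
    threads = max(1, clients // 2)
    return ["pgbench", "-S", "-M", "prepared",
            "-T", str(duration),
            "-c", str(clients), "-j", str(threads)]


def summarize(tps_runs):
    """Return (median, low, high) for one client count.

    The median feeds the main plot; low and high are the extra
    columns a yerrorbars-style plot needs to show the spread.
    """
    return statistics.median(tps_runs), min(tps_runs), max(tps_runs)
```

Each client count would be run three times, and the three TPS figures
passed to summarize() to get one row of plot data.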

Clearly, there is something weird going on at -c 4.  There's a cluster
of results around 180K TPS, and another cluster around 210-220K TPS,
and nothing in between.  I suspect that the scheduler is doing
something bogus with sometimes putting pgbench onto the slow cores.
Anyway, I believe that the apparent gap between HEAD and the other
curves at -c 4 is probably an artifact: HEAD had two 180K-ish results
and one 220K-ish result, while the other curves had the reverse, so
the medians differ, but there's probably no real effect there.
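
To see how a median of three can manufacture a gap out of two
clusters, here is a small sketch.  The numbers are round illustrative
values in thousands of TPS, chosen to mimic the two clusters; they are
not the measured results.

```python
import statistics

# Illustrative values only (K TPS), not measurements:
# one series lands in the low cluster twice, the other only once.
head_runs = [180, 180, 220]
patched_runs = [180, 220, 220]

head_median = statistics.median(head_runs)        # middle of 180, 180, 220
patched_median = statistics.median(patched_runs)  # middle of 180, 220, 220

# The medians sit a full cluster apart even though both series drew
# from the same two clusters; only the 2-vs-1 split differs.
gap = patched_median - head_median
```

With only three runs per point, a single run falling into the other
cluster is enough to flip the median from one cluster to the other.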

Bottom line is that these patches don't appear to do much of
anything on M1, as you surmised.

            regards, tom lane

