Re: Improving spin-lock implementation on ARM.

From: Krunal Bauskar
Subject: Re: Improving spin-lock implementation on ARM.
Date:
Msg-id: CAB10pyboVUQkkkBTSJ9G7s-U+aaVBZGerGQuAbKixBZ-uuxarg@mail.gmail.com
In response to: Re: Improving spin-lock implementation on ARM.  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses: Re: Improving spin-lock implementation on ARM.  (Alexander Korotkov <aekorotkov@gmail.com>)
List: pgsql-hackers


On Tue, 1 Dec 2020 at 20:25, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Tue, Dec 1, 2020 at 3:44 PM Krunal Bauskar <krunalbauskar@gmail.com> wrote:
> I have completed benchmarking with lse.
>
> Graph attached.

Thank you for benchmarking.

Now I agree with this comment by Tom Lane

> In general, I'm pretty skeptical of *all* the results posted so far on
> this thread, because everybody seems to be testing exactly one machine.
> If there's one thing that it's safe to assume about ARM, it's that
> there are a lot of different implementations; and this area seems very
> very likely to differ across implementations.

Different ARM implementations look too different.  As you pointed out,
LSE is enabled in gcc-10 by default.  I doubt we can accept a patch
which gives benefits for a specific platform and only when the compiler
isn't very modern.  Also, we didn't cover all ARM platforms.  Given
they are so different, we can't guarantee that the patch doesn't cause
a regression on some ARM.  Additionally, the effect of the CAS patch
even for Kunpeng seems modest.  It makes the drop-off of TPS
smoother, but it doesn't change the trend.

There are two separate questions:

* Does the CAS patch help PGSQL scale? Yes.
* Is LSE beneficial for all architectures? Probably not.

The patch addresses only the former, which holds for all cases.
(Enabling LSE should be an independent process.)
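
To make the distinction concrete, here is a minimal sketch of the two ways a lock word can be acquired: an unconditional swap versus a compare-and-swap. It is written with C11 atomics and is not the code from the patch; the names demo_slock_t, demo_tas_swap, demo_tas_cas and demo_unlock are hypothetical, and the functions return true on success (unlike PostgreSQL's TAS(), which returns zero on success). The commonly cited reason CAS can behave better under contention is that a failed attempt need not store to the contended cache line, though whether that helps depends on the ARM implementation and on how the compiler lowers the atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word for illustration: 0 = free, 1 = held. */
typedef atomic_int demo_slock_t;

/* Swap-based attempt: always writes 1, even when the lock is already held.
 * Returns true if the previous value was 0, i.e. we acquired the lock. */
static bool demo_tas_swap(demo_slock_t *lock)
{
    return atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0;
}

/* CAS-based attempt: writes 1 only if the current value is 0.
 * Returns true if the lock was acquired. */
static bool demo_tas_cas(demo_slock_t *lock)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(lock, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Release the lock; the caller must currently hold it. */
static void demo_unlock(demo_slock_t *lock)
{
    assert(atomic_load_explicit(lock, memory_order_relaxed) == 1);
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

Both functions give the same answer for the same lock state; the difference is only in what a failed attempt does to the lock word, which is why the CAS question can be evaluated separately from whether LSE instructions are enabled.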

gcc-10 made LSE the default, but [1] says that Canonical decided to remove it as the default
in Ubuntu 20.04, which suggests that LSE has (probably) not passed Canonical's testing.
Also, most distros have not yet started shipping GCC 10, so it will be a long time before it reaches all of them.

So if we set the LSE effect aside and look at the patch purely in terms of performance, it clearly
delivers a good gain. I see an improvement in the range of 10-40%.
Amit observed a gain in the same range during his independent testing, and your testing with G-2 has confirmed the same point.
Pardon me if this is modest by pgsql standards.

At a scale of 1024, PGSQL struggles to scale on other architectures (beyond ARM) as well, so there is something more
inherent that needs to be addressed from a generic perspective.

Also, the patch helps not only pgbench-style workloads but other workloads too.

--------------

I would request that you reconsider it from this perspective, to help ensure that PGSQL can scale well on ARM.
s_lock becomes the top-most function, and LSE is not a universal solution, but CAS surely helps ease the main bottleneck.

And do let me know if more data is needed.

Link:
------
Regards,
Alexander Korotkov


--
Regards,
Krunal Bauskar
