Re: Move PinBuffer and UnpinBuffer to atomics

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: Move PinBuffer and UnpinBuffer to atomics
Дата
Msg-id CAPpHfdsogj38HTDhNMLE56uJy9N8-=gYa2nNuWbPujGp2n1ffQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Move PinBuffer and UnpinBuffer to atomics  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Ответы Re: Move PinBuffer and UnpinBuffer to atomics  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi, Andres!

I'd like to share testing results on IBM E880
​ server​
.
​ At first, I'd like to mentioned that we did experiments with atomic increment on power8, despite it doesn't have native atomic increment. Let me explain why. CAS operation in power8 assembly looks like this.

.L1: lwarx 9,0,5
​     ​
cmpw 0,9,3
​     ​
bne- 0,.L2
​     ​
stwcx. 4,0,5
​     ​
bne- 0,.L1
.L2: isync

​So, it's a loop until value isn't changed between lwarx and stwcx. When you put CAS inside loop it becomes loop inside loop. Atomic increment is a single loop like CAS.

.L1: lwarx 9,0,5
​     ​
add 9,9,3
​     ​
stwcx. 9,0,5
​     ​
bne- 0,.L1
​     ​
isync

​This is why atomic increment *could be* cheaper than loop over CAS and, it worth having experiments. ​Another idea is that we can put arbitrary logic between lwarx and stwcx. Thus, we can implement PinBuffer using single loop of lwarx and stwcx which could be better than loop of CAS.

​Tested patches are following:
1) pinunpin-cas.patch – moves pin/unpin buffer to atomic operations with buffer state. PinBuffer uses loop of CAS. Same as upthread, but rebased with current master.
2) pinunpin-increment.patch (based on pinunpin-cas.patch) – PinBuffer changes state using atomic increment operation instead of loop of CAS. Both refcount and usagecount are incremented at once. There is not enough bits in 32 bit state to guarantee that usagecount doesn't overflow. I moved usagecount to higher bits of state, thus overflow of usagecount doesn't affect other bits. Therefore, this patch doesn't pretend to be correct. However, we can use it to check if we should try moving this direction.
3) lwlock-increment.patch – LWLockAttemptLock change state using atomic increment operation instead of loop of CAS. This patch does it for LWLockAttemptLock like pinunpin-increment.patch does for PinBuffer. Actually, this patch is not directly related to buffer manager. However, it's nice to test loop of CAS vs atomic increment in different places.
4) ​pinunpin-ibm-asm.patch (based on pinunpin-cas.patch) – assembly optimizations of PinBuffer and LWLockAttemptLock which makes both of them use single loop of lwarx and stwcx.

​The following versions were compared in our benchmark.
​1) 9.4
2) 9.5
3) master
4) pinunpin-cas – ​pinunpin-cas.patch
5) pinunpin-increment – pinunpin-cas.patch + pinunpin-increment.patch
​6​
) pinunpin-cas-lwlock-increment – pinunpin-cas.patch + lwlock-increment.patch
7​
) pinunpin-cas-lwlock-increment – pinunpin-cas.patch + pinunpin-increment.patch
​ + ​
lwlock-increment.patch
​8) ​pinunpin-lwlock-asm – pinunpin-cas.patch + ​pinunpin-ibm-asm.patch

​See results in ibm-scalability.png. We can see that there is almost no effect of pinunpin-increment.patch. So, atomic increment in PinBuffer doesn't give benefit on power8. I will publish test results on Intel platform a bit later, but I can say there is no much effect too. So, I think, we can throw this idea away.
However, effect of lwlock-increment.patch is huge. It gives almost same effect as assembly optimizations. I think it worth having separate thread for discussing lwlock optimization. It's likely we can do something much better than current lwlock-increment.patch does.
For this thread, I think we can focus on pinunpin-cas.patch without thinking about atomic increment.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeremy Harris
Дата:
Сообщение: Re: [PATCH] Equivalence Class Filters
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Move PinBuffer and UnpinBuffer to atomics