Re: Contention preventing locking

From: Konstantin Knizhnik
Subject: Re: Contention preventing locking
Date:
Msg-id: cf0fc37d-2ddd-5c1e-a03e-7020610097dc@postgrespro.ru
In reply to: Re: Contention preventing locking (Michail Nikolaev <michail.nikolaev@gmail.com>)
List: pgsql-hackers


On 16.02.2018 11:59, Michail Nikolaev wrote:
Hello.

It provides a more general way to address the issue compared to individual optimisations (but they could do the job too, of course).


Certainly, a contention-aware lock scheduler is a much more general way of addressing this problem.
But the amount of effort needed to implement such a scheduler and integrate it into the Postgres core is hardly comparable.

I did some more tests with a much simpler benchmark than YCSB: it just executes a specified percentage of single-record updates and selects on a relatively small table.
I used 50% updates, a table of 100 records, and varied the number of connections from 10 to 1000 in steps of 10.
This benchmark (pgrw.c) and an updated version of the xlock patch are attached to this mail.
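
For reference, here is a minimal sketch of the kind of client loop such a benchmark runs (this is only an illustration of the described workload, not the attached pgrw.c; the table name t, the columns k/v and the iteration count are assumptions):

/* Sketch of the described workload: 50% single-record updates and 50%
 * single-record selects against a 100-row table, one client connection.
 * Illustration only - not the attached pgrw.c. */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    for (int i = 0; i < 100000; i++)
    {
        char      sql[128];
        int       key = rand() % 100;     /* 100-record table */
        PGresult *res;

        if (rand() % 100 < 50)            /* 50% updates */
            snprintf(sql, sizeof(sql),
                     "UPDATE t SET v = v + 1 WHERE k = %d", key);
        else
            snprintf(sql, sizeof(sql),
                     "SELECT v FROM t WHERE k = %d", key);

        res = PQexec(conn, sql);
        if (PQresultStatus(res) != PGRES_COMMAND_OK &&
            PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "%s", PQerrorMessage(conn));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}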

The results are available in the following spreadsheet:

https://docs.google.com/spreadsheets/d/1QOYfUehy8U3sdasMjGnPGQJY8JiRfZmlS64YRBM0YTo/edit?usp=sharing

I repeated the tests several times. The dips on the chart correspond to the autovacuum periods; for some reason they are larger with the xlock optimization.
Still, you can see that the xlock optimization significantly reduces the speed of degradation, although it does not completely eliminate the negative trend.

Thanks, 
Michail.


On Thu, 15 Feb 2018 at 19:00, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
Hi,

PostgreSQL performance degrades significantly under high contention.
You can look at the attached YCSB results (ycsb-zipf-pool.png) to
estimate the level of this degradation.

Postgres acquires two kinds of heavyweight locks during an update: it
locks the TID of the updated tuple and the XID of the transaction that created this version.
In the debugger I see the following picture: if several transactions are
trying to update the same record, then first of all they compete for the
SnapshotDirty.xmax transaction lock in EvalPlanQualFetch.
Then, in heap_update, they try to lock the TID of the updated
tuple: heap_acquire_tuplock in heapam.c.
After that, the transactions wait for completion of the transaction that updated the
tuple: XactLockTableWait in heapam.c.

So in heap_acquire_tuplock all competing transactions wait for the
TID of the updated version. When the transaction that changed this tuple
commits, one of the competitors is granted this lock and proceeds,
creating a new version of the tuple. Then all other competitors are
awakened and ... find out that the locked tuple is no longer the last version.
So they locate the new version and try to lock it... The more
competitors we have, the more attempts they all have to perform
(quadratic complexity).
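For example, with nine competing transactions: while the first one is updating the tuple, the other eight wait on its TID lock; when it commits, all eight wake up, one wins, and the remaining seven have to locate and lock the new version, then six, and so on - roughly 8 + 7 + ... + 1 = 36 wake-up/re-lock cycles, i.e. on the order of N^2/2 for N competitors.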

My idea was that we could avoid such performance degradation if we somehow
queued the competing transactions so that they get control one by one,
without doing useless work.
First of all I tried to replace the TID lock with a PK (primary key) lock.
Unlike the TID, the PK of a record is not changed by a HOT update. The second
idea is that instead of releasing the lock immediately after the update we can hold
it until the end of the transaction. And this optimization really does
improve performance - it corresponds to the pg_f1_opt configuration in
the attached diagram.
For some workloads it provides up to a two times improvement compared
with vanilla Postgres. But there are many problems with a correct
(non-prototype) implementation of this approach:
handling different types of PK, including compound keys, PK updates, ...
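
Just to make the idea concrete, here is a minimal hypothetical sketch of such a lock (this is NOT the attached patch; it simply reuses the advisory-lock tag layout for a hash of the PK value, which, unlike the TID, stays the same across updates, and it relies on regular heavyweight locks being released only at transaction end):

/* Hypothetical sketch, not the attached patch: take a heavyweight lock on a
 * key derived from the primary-key value instead of on the tuple TID.
 * The advisory-lock tag layout is reused here purely for illustration.
 * Because the lock is not released until commit/abort, competing updaters
 * queue behind it instead of re-locking every new tuple version. */
#include "postgres.h"
#include "miscadmin.h"
#include "storage/lock.h"

static void
acquire_pk_lock(Oid reloid, uint32 pk_hash)
{
    LOCKTAG     tag;

    SET_LOCKTAG_ADVISORY(tag, MyDatabaseId, (uint32) reloid, pk_hash, 2);

    (void) LockAcquire(&tag, ExclusiveLock, false, false);
}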

Another approach, which I have recently implemented, is much simpler
and addresses another lock kind: the transaction lock.
The idea of this approach is mostly the same: all competing transactions
are waiting for the transaction which is changing the tuple.
Then one of them is given a chance to proceed, and now all other
transactions are waiting for this transaction, and so on:

T1<-T2,T3,T4,T5,T6,T7,T8,T9
T2<-T3,T4,T5,T6,T7,T8,T9
T3<-T4,T5,T6,T7,T8,T9
...

Here <- denotes a "waits for" dependency between transactions.

If we change this picture to

T1<-T2, T2<-T3, T3<-T4, T4<-T5, T5<-T6, T6<-T7, T7<-T8, T8<-T9

then the number of lock requests can be significantly reduced.
And it can be implemented with a very simple patch (xlock.patch, attached).
I hope that I have not done something incorrect here.
The effect of this simple patch is even larger: more than a 3x improvement for
50..70 clients.
Please look at the attached xlock.pdf spreadsheet.
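
To illustrate why the chain shape helps, here is a toy model (a deliberate simplification, not Postgres code) that just counts wake-up/lock-request events under the two wait topologies:

/* Toy model, not Postgres code: count wake-up/lock-request events for n
 * transactions updating the same row.
 * "fan-in": all waiters wait on the current updater and, after each commit,
 *           wake up and re-request the lock on the new tuple version.
 * "chain":  each transaction waits only on the XID of its immediate
 *           predecessor, so it is woken exactly once. */
#include <stdio.h>

int main(void)
{
    for (int n = 10; n <= 90; n += 20)
    {
        long fanin = 0;
        long chain = n - 1;

        for (int waiters = n - 1; waiters > 0; waiters--)
            fanin += waiters;

        printf("n=%2d  fan-in=%4ld  chain=%3ld\n", n, fanin, chain);
    }
    return 0;
}

For n = 50 this prints 1225 wake-ups for fan-in versus 49 for the chain.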

Unfortunately the combination of these two approaches gives a worse result
than xlock.patch alone.
Certainly this patch also requires further improvement; for example, it
will not work correctly when transactions are aborted due to deadlock
or some other reason.
I just want to know the opinion of the community on whether the proposed approaches to
reducing contention are really promising and whether it makes sense to continue
working in this direction.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Attachments
