Re: [HACKERS] Proposal for CSN based snapshots

From Alexander Kuzmenkov
Subject Re: [HACKERS] Proposal for CSN based snapshots
Msg-id 8a855f33-2581-66bf-85f7-0b99239edbda@postgrespro.ru
In response to Re: [HACKERS] Proposal for CSN based snapshots  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Proposal for CSN based snapshots  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
Hi hackers,

Here is a new version of the patch with some improvements, rebased to 
117469006b.

Performance on pgbench tpcb with subtransactions is now slightly better 
than master. See the picture 'savepoints2'. This was achieved by 
removing unnecessary exclusive locking on CSNLogControlLock in 
SubTransSetParent. After that change, both versions are mostly waiting 
on XidGenLock in GetNewTransactionId.

Performance on default pgbench tpcb is also improved. At scale 500, csn 
is up to 30% faster than master, see the picture 'tpcb500'. These 
improvements are due to slight optimizations of GetSnapshotData and 
refreshing RecentGlobalXmin less often. At scale 1500, csn is slightly 
faster at up to 200 clients, but then degrades steadily: see the picture 
'tpcb1500'. Nevertheless, CSN-related code paths do not show up in perf 
profiles or LWLock wait statistics [1]. I think what we are seeing here 
is again that when some bottlenecks are removed, the rapid degradation of 
LWLock performance under contention leads to a net drop in performance. 
With this in mind, I tried running the same benchmarks with the patch from 
Yura Sokolov [2], which should improve LWLock performance on NUMA machines. 
Indeed, with this patch csn outperforms master at all client counts 
measured, as you can see in the picture 'tpcb1500'. This LWLock 
change affects csn a lot more than master, which also suggests 
that we are observing a superlinear degradation of LWLocks under 
increasing contention.

After this I plan to improve the comments, since many of them have 
become out of date, and work on logical replication.

[1] To collect LWLock wait statistics, I sample pg_stat_activity, and 
also use a bcc script by Andres Freund: 

https://www.postgresql.org/message-id/flat/20170622210845.d2hsbqv6rxu2tiye%40alap3.anarazel.de#20170622210845.d2hsbqv6rxu2tiye@alap3.anarazel.de
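
For readers who want to reproduce the pg_stat_activity sampling, here is a 
minimal sketch of the aggregation step. It is not the script used for the 
numbers above. The function names (tally_waits, wait_shares) and the 
polling loop that would run SAMPLE_SQL are hypothetical, and the 'LWLock' 
value for wait_event_type is an assumption that depends on the server 
version.

```python
from collections import Counter

# Query to run once per polling round. The wait_event_type and wait_event
# columns of pg_stat_activity are available since PostgreSQL 9.6.
SAMPLE_SQL = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE wait_event IS NOT NULL
"""


def tally_waits(samples, event_type="LWLock"):
    """Count how often each wait event of one type appears across samples.

    `samples` is an iterable of result sets, one per polling round. Each
    result set is an iterable of (wait_event_type, wait_event) tuples, as
    returned by SAMPLE_SQL. The default event_type is an assumption.
    """
    counts = Counter()
    for rows in samples:
        for wait_type, event in rows:
            if wait_type == event_type:
                counts[event] += 1
    return counts


def wait_shares(counts):
    """Convert raw counts into fractions of all counted waits."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {event: n / total for event, n in counts.most_common()}
```

Feeding the result sets of repeated SAMPLE_SQL runs into tally_waits gives 
a rough picture of which locks backends are waiting on most often. This is 
only a statistical sample, so short waits are underrepresented.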

[2] 

https://www.postgresql.org/message-id/flat/2968c0be065baab8865c4c95de3f435c@postgrespro.ru#2968c0be065baab8865c4c95de3f435c@postgrespro.ru

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

