Re: Speed up Clog Access by increasing CLOG buffers

From: Amit Kapila
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date:
Msg-id: CAA4eK1J12fSGAmFSeq0wdUgqD+4Ue43rZDr=ZEMbySMgxfGJKA@mail.gmail.com
In response to: Re: Speed up Clog Access by increasing CLOG buffers  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Speed up Clog Access by increasing CLOG buffers  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Sat, Apr 2, 2016 at 5:25 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Mar 31, 2016 at 3:48 PM, Andres Freund <andres@anarazel.de> wrote:

Here is the performance data (the configuration of the machine used for this test is given at the end of the mail):

Non-default parameters
------------------------------------
max_connections = 300
shared_buffers = 8GB
min_wal_size = 10GB
max_wal_size = 15GB
checkpoint_timeout = 35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

median of 3, 20-min pgbench tpc-b results for --unlogged-tables

I ran exactly the same test on an Intel x86 machine; the results are as below:

Client Count/Patch_ver (tps)            2      128      256
HEAD – Commit 2143f5e1               2832    35001    26756
clog_buf_128                         2909    50685    40998
clog_buf_128 + group_update_clog_v8  2981    53043    50779
clog_buf_128 + content_lock          2843    56261    54059
clog_buf_128 + nocontent_lock        2630    56554    54429


On this machine, I don't see any run-to-run variation, and the trend of the results is similar to the POWER machine.  Clearly, the first patch, increasing clog buffers to 128, shows up to a 50% performance improvement at 256 clients.  We can also observe that the group clog patch gives a further ~24% gain on top of the increase-clog-bufs patch at 256 clients.  The content lock and no content lock patches show similar gains, with performance 6~7% better than the group clog patch.  Also, as on the POWER machine, the no content lock patch shows some regression at lower client counts (2 clients in this case).

Based on the above results, increasing clog buffers to 128 is a clear winner, and I think we should not proceed with the no-content-lock approach, as it shows some regression and is no better than the content-lock approach.

That leaves a choice between the group clog patch and the content-lock patch.  The difference between them is not large (6~7%), and I think that when sub-transactions are involved (sub-transactions on the same page as the main transaction) the group clog patch should give better performance, because the content lock itself will then start becoming a bottleneck.  We could address that case in the content-lock approach by applying a similar grouping technique to the content lock, but I am not sure that is worth the effort.  I also see some variation in the content-lock patch's numbers on the POWER machine, though that might be attributable to machine characteristics.  So, I think we can proceed with either the group clog patch or the content-lock patch; if we want to proceed with the content-lock approach, it needs some more work.


Note - For both the content-lock and no-content-lock patches, I have applied the 0001-Improve-64bit-atomics-support patch.


m/c config (lscpu)
---------------------------
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 47
Model name:            Intel(R) Xeon(R) CPU E7- 8830  @ 2.13GHz
Stepping:              2
CPU MHz:               1064.000
BogoMIPS:              4266.62
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              24576K
NUMA node0 CPU(s):     0,65-71,96-103
NUMA node1 CPU(s):     72-79,104-111
NUMA node2 CPU(s):     80-87,112-119
NUMA node3 CPU(s):     88-95,120-127
NUMA node4 CPU(s):     1-8,33-40
NUMA node5 CPU(s):     9-16,41-48
NUMA node6 CPU(s):     17-24,49-56
NUMA node7 CPU(s):     25-32,57-64

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
