Discussion: [PERFORM] BDR, wal sender, high system cpu, mutex_lock_common
Hi all,
I have a PostgreSQL 9.4 + BDR environment:
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit
kernel version:
3.2.0-4-amd64 #1 SMP Debian 3.2.65-1 x86_64 GNU/Linux
This is a consolidation database; on this machine there are 250+ WAL sender processes.
top shows high system CPU:
%Cpu(s): 1.4 us, 49.7 sy, 0.0 ni, 48.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
Profiling the CPU with perf:
perf top -e cpu-clock
Events: 142K cpu-clock
82.37% [kernel] [k] __mutex_lock_common.isra.5
4.49% [kernel] [k] do_raw_spin_lock
2.23% [kernel] [k] mutex_lock
2.16% [kernel] [k] mutex_unlock
2.12% [kernel] [k] arch_local_irq_restore
1.73% postgres [.] ValidXLogRecord
0.87% [kernel] [k] __mutex_unlock_slowpath
0.78% [kernel] [k] arch_local_irq_enable
0.63% [kernel] [k] sys_recvfrom
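As a rough cross-check of the profile above, here is a minimal Python sketch that sums the sampled percentages by origin and by mutex-related symbol. The rows are copied from the perf output above; the helper names (`share_by_origin`, `mutex_share`) are made up for illustration and are not part of perf.

```python
# Rows copied verbatim from the system-wide `perf top -e cpu-clock` output.
PERF_ROWS = """\
82.37% [kernel] [k] __mutex_lock_common.isra.5
4.49% [kernel] [k] do_raw_spin_lock
2.23% [kernel] [k] mutex_lock
2.16% [kernel] [k] mutex_unlock
2.12% [kernel] [k] arch_local_irq_restore
1.73% postgres [.] ValidXLogRecord
0.87% [kernel] [k] __mutex_unlock_slowpath
0.78% [kernel] [k] arch_local_irq_enable
0.63% [kernel] [k] sys_recvfrom
"""


def share_by_origin(text):
    """Sum the sample percentages per origin column ([kernel], postgres, ...)."""
    totals = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip anything that is not a 4-column symbol row
        pct = float(parts[0].rstrip("%"))
        totals[parts[1]] = totals.get(parts[1], 0.0) + pct
    return totals


def mutex_share(text):
    """Sum the percentages of rows whose symbol name contains 'mutex'."""
    total = 0.0
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and "mutex" in parts[3]:
            total += float(parts[0].rstrip("%"))
    return total


if __name__ == "__main__":
    for origin, pct in share_by_origin(PERF_ROWS).items():
        print(f"{origin:10s} {pct:6.2f}%")
    print(f"mutex-related symbols: {mutex_share(PERF_ROWS):.2f}%")
```

The listed rows put nearly all samples in kernel code, mostly in the mutex symbols, which matches the high `sy` figure reported by top.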
Finally, I identified which processes are hitting the mutex (the WAL senders) and profiled one of them:
perf top -e task-clock -p 55382
Events: 697 task-clock
88.08% [kernel] [k] __mutex_lock_common.isra.5
3.27% [kernel] [k] do_raw_spin_lock
2.34% [kernel] [k] arch_local_irq_restore
2.10% postgres [.] ValidXLogRecord
1.87% [kernel] [k] mutex_unlock
1.87% [kernel] [k] mutex_lock
0.47% [kernel] [k] sys_recvfrom
I think BDR is only reading WAL files (currently we are behind the current WAL LSN),
so why does reading a WAL file need a mutex?
Is there a kernel version that handles mutexes better?
regards
ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab
Additional info, strace output:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.30    1.030072           5    213063    201463 read
  1.69    0.017686           0    201464    201464 recvfrom
  0.01    0.000110           0       806           lseek
  0.00    0.000043           0       474       468 rt_sigreturn
  0.00    0.000000           0         6           open
  0.00    0.000000           0         6           close
------ ----------- ----------- --------- --------- ----------------
100.00    1.047911                415819    403395 total
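To make the error counts in this summary easier to read, here is a minimal Python sketch that computes the fraction of failed calls per syscall. The rows are copied from the strace output above; the function names are hypothetical. The summary does not say which errno was returned. One common cause of such counts is non-blocking calls returning EAGAIN, but that is an assumption the numbers alone cannot confirm.

```python
# Data rows copied from the strace summary above (whitespace-separated).
STRACE_SUMMARY = """\
98.30 1.030072 5 213063 201463 read
1.69 0.017686 0 201464 201464 recvfrom
0.01 0.000110 0 806 lseek
0.00 0.000043 0 474 468 rt_sigreturn
0.00 0.000000 0 6 open
0.00 0.000000 0 6 close
"""


def parse_strace_summary(text):
    """Return {syscall: (calls, errors)}; a missing errors column means 0."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) not in (5, 6):
            continue  # not a data row (header, separator, blank line)
        calls = int(fields[3])
        errors = int(fields[4]) if len(fields) == 6 else 0
        rows[fields[-1]] = (calls, errors)
    return rows


def failure_ratio(rows, syscall):
    """Fraction of calls to `syscall` that returned an error."""
    calls, errors = rows[syscall]
    return errors / calls if calls else 0.0


if __name__ == "__main__":
    rows = parse_strace_summary(STRACE_SUMMARY)
    for name in rows:
        print(f"{name:14s} {failure_ratio(rows, name):7.2%} of calls failed")
```

By this count, roughly 95% of the `read` calls and all of the `recvfrom` calls returned an error, so most of the syscall traffic in the sampled window did no useful work.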
On Sat, Sep 30, 2017 at 6:07 AM, milist ujang <ujang.milist@gmail.com> wrote:
> Hi all,
> I've an environment 9.4 + bdr:
> [...]