How to avoid the XLog Write to be the performance bottleneck

From: 范国腾
Date: 2018-09-21 08:16:42 +0000
Hi, 

We have two PostgreSQL nodes (active/standby). The active and standby nodes use the same shared disk (GFS2 file
system).

We are running a performance test on the active side:
(1) No SQL requests are being sent to the standby side.
(2) The active node has 20 sessions, and the test tool sends INSERT/UPDATE/SELECT requests to them. The load is very
high.

On the active node, disk I/O is very high, but the CPU usage of each postgres process is only about 20%. The pstack
output shows that most of the postgres processes are waiting for the XLOG write lock. It seems that XLOG writes have
become the bottleneck of the database.

Could you please suggest how to improve this?


=======================================================
[root@highgo1 ~]# pstack 9434
#0  0x00007fd79545b3a7 in semop () from /usr/lib64/libc.so.6
#1  0x0000000000652d01 in PGSemaphoreLock ()
#2  0x00000000006ab314 in LWLockAcquireOrWait ()
#3  0x00000000004e1f9d in XLogFlush ()
#4  0x00000000004d9167 in CommitTransaction ()
#5  0x00000000004d9bd5 in CommitTransactionCommand ()
#6  0x00000000006b9ff5 in finish_xact_command.part.4 ()
#7  0x00000000006bdd6a in PostgresMain ()
#8  0x0000000000477276 in ServerLoop ()
#9  0x0000000000664186 in PostmasterMain ()
#10 0x0000000000478172 in main ()

The recovery.conf on the standby side is:
standby_mode = on
recovery_target_timeline = 'latest'
primary_conninfo = 'host=192.168.100.104 port=5866 user=repuser password=repuser1 application_name=node1'
primary_slot_name = 'slot1'

Thanks
Steven

Re: How to avoid the XLog Write to be the performance bottleneck

From: Andres Freund
On 2018-09-21 08:16:42 +0000, 范国腾 wrote:
> Hi,
>
> We have two PostgreSQL nodes (active/standby). The active and standby
> nodes use the same shared disk (GFS2 file system).
>
> We are running a performance test on the active side:
> (1) No SQL requests are being sent to the standby side.
> (2) The active node has 20 sessions, and the test tool sends
> INSERT/UPDATE/SELECT requests to them. The load is very high.
>
> On the active node, disk I/O is very high, but the CPU usage of each
> postgres process is only about 20%. The pstack output shows that most
> of the postgres processes are waiting for the XLOG write lock. It
> seems that XLOG writes have become the bottleneck of the database.

Usually that doesn't really mean there's lock contention, but that
your IO isn't fast enough. What you can do:
a) check whether some/most/all of your transactions can use
   synchronous_commit = off - that can drastically reduce the amount of
   IO.
b) Consider putting your WAL onto a separate disk (or even partition),
   that can reduce overhead by disentangling synchronous writes (for the
   WAL) from asynchronous writes (the data being written back), and
   synchronous reads (queries).
c) Get faster storage.
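
One way to check whether the storage really is the limit is to measure
how many small synchronous writes per second a given directory can
sustain. The following is a minimal Python sketch, not part of the
original message: measure_fsync_rate and its parameters are hypothetical
names, and the 8 kB block size is an assumption chosen to match
PostgreSQL's default WAL block size. It only approximates the commit
path, since PostgreSQL can flush the WAL of several transactions with a
single sync.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory, n_writes=200, block_size=8192):
    """Return the number of fsync'ed writes per second achieved in `directory`.

    Hypothetical helper: writes `n_writes` blocks of `block_size` bytes to a
    temporary file and calls fsync after each one, which is roughly what a
    synchronous commit forces for the WAL.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, block)
            os.fsync(fd)  # wait until the block has reached stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)  # remove the temporary file again
    # Guard against a zero interval on very coarse clocks.
    return n_writes / max(elapsed, 1e-9)
```

Calling measure_fsync_rate() on the directory that holds the WAL and on
a candidate separate disk gives a rough comparison for suggestion b).
The pg_test_fsync tool shipped with PostgreSQL performs a more thorough
version of this test.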

Greetings,

Andres Freund