Thread: How to avoid the XLog Write to be the performance bottleneck
Hi,

We have two postgres nodes (active/standby). The active node and the standby node use the same shared disk (GFS2 file system).

We are running a performance test on the active side:
(1) No SQL requests are being sent to the standby side.
(2) The active node has 20 sessions, and the test tool sends INSERT/UPDATE/SELECT requests to them. The call load is very high.

On the active node, we find that disk IO is very high but the CPU usage of each postgres process is only about 20%. The pstack result shows that most of the postgres processes are waiting for the XLOG write lock. It seems that XLog writes have become the bottleneck of the postgres database. Could you please suggest how to improve this?

=======================================================
[root@highgo1 ~]# pstack 9434
#0  0x00007fd79545b3a7 in semop () from /usr/lib64/libc.so.6
#1  0x0000000000652d01 in PGSemaphoreLock ()
#2  0x00000000006ab314 in LWLockAcquireOrWait ()
#3  0x00000000004e1f9d in XLogFlush ()
#4  0x00000000004d9167 in CommitTransaction ()
#5  0x00000000004d9bd5 in CommitTransactionCommand ()
#6  0x00000000006b9ff5 in finish_xact_command.part.4 ()
#7  0x00000000006bdd6a in PostgresMain ()
#8  0x0000000000477276 in ServerLoop ()
#9  0x0000000000664186 in PostmasterMain ()
#10 0x0000000000478172 in main ()

The recovery.conf on the standby side is:

standby_mode = on
recovery_target_timeline = 'latest'
primary_conninfo = 'host=192.168.100.104 port=5866 user=repuser password=repuser1 application_name=node1'
primary_slot_name = 'slot1'

Thanks
Steven
On 2018-09-21 08:16:42 +0000, 范国腾 wrote:
> Hi,
>
> We have two postgres nodes(active/standby). The active node and the standby node use the same share disk(GFS2 file system).
>
> We are doing the performance test in active side:
> (1)Now there is no SQL request sending to the standby side.
> (2)The active node has 20 sessions and the test tool sends INSERT/UPDATE/SELECT request to them. The call load is very high.
>
> In active node, we find that the disk IO is very high but the CPU of each postgres process is about 20%. The pstack result shows that most of the postgres process is waiting for the XLOG Write Lock. It seems that the XLog write become the bottleneck of the postgres database.

Usually that doesn't really mean there's lock contention, but that your IO isn't fast enough. What you can do:

a) Check whether some/most/all of your transactions can use synchronous_commit = off - that can drastically reduce the amount of IO.
b) Consider putting your WAL onto a separate disk (or even partition); that can reduce overhead by disentangling synchronous writes (for the WAL) from asynchronous writes (the data being written back) and synchronous reads (queries).
c) Get faster storage.

Greetings,

Andres Freund
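
As an illustration of suggestion (a), here is a minimal sketch of how synchronous_commit can be relaxed at different scopes. The role name app_user and the table test_events are hypothetical placeholders, not taken from the thread; whether any given transaction can tolerate asynchronous commit has to be decided by the application.

```sql
-- Per session: every later transaction in this session commits asynchronously.
SET synchronous_commit = off;

-- Per transaction: only this transaction skips waiting for the WAL flush.
BEGIN;
SET LOCAL synchronous_commit = off;
INSERT INTO test_events (payload) VALUES ('example');  -- hypothetical table
COMMIT;

-- Per role (hypothetical role name): takes effect in new sessions of that role.
ALTER ROLE app_user SET synchronous_commit = off;
```

With synchronous_commit = off, a server crash can lose the most recently reported commits, but it does not leave the database inconsistent, so the setting is best limited to transactions whose loss is acceptable.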