Discussion: BUG #9635: Wal sender process is using 100% CPU


BUG #9635: Wal sender process is using 100% CPU

From: jkoceniak@mediamath.com
Date:
The following bug has been logged on the website:

Bug reference:      9635
Logged by:          Jamie Koceniak
Email address:      jkoceniak@mediamath.com
PostgreSQL version: 9.1.9
Operating system:   x86_64-unknown-linux-gnu (Debian 4.7.2-5) 64-bit
Description:

Periodically throughout the day, we keep seeing the wal sender process
utilize 100% of the CPU. We began noticing this after we added 2 new slave
servers, going from 2 to 4 slaves. See top results and I also included our
wal settings. Thanks!

top - 05:03:18 up 174 days,  4:51,  2 users,  load average: 5.57, 4.75, 3.16
Tasks: 387 total,   8 running, 379 sleeping,   0 stopped,   0 zombie
%Cpu(s): 29.3 us,  4.7 sy,  0.0 ni, 65.3 id,  0.4 wa,  0.0 hi,  0.4 si,  0.0 st
MiB Mem:    290797 total,   218532 used,    72264 free,      311 buffers
MiB Swap:     7812 total,        1 used,     7811 free,   206978 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
30244 postgres  20   0 8497m 5608 2820 R   100  0.0   1:44.72 postgres: wal sen
14447 postgres  20   0 8497m 5596 2816 R   100  0.0   3:11.27 postgres: wal sen
16075 postgres  20   0 8497m 5600 2820 R   100  0.0   3:32.32 postgres: wal sen
 8177 postgres  20   0 8497m 5360 2820 S    36  0.0   0:03.35 postgres: wal sen
 4920 postgres  20   0 9647m 9.3g 8.1g S     3  3.3   1097:40 postgres: writer
 4923 postgres  20   0 68872 2072  788 S     3  0.0 511:01.76 postgres: archive
 4921 postgres  20   0 8496m  18m  17m S     2  0.0 593:36.38 postgres: wal wri
 7853 root      20   0 23432 1836 1176 R     1  0.0   0:00.44 top
 4916 postgres  20   0 8492m 229m 228m S     0  0.1 598:44.57 /usr/lib/postgres


Current Wal settings:

             name             |   setting   |
------------------------------+-------------+
 max_wal_senders              | 10          |
 wal_block_size               | 8192        |
 wal_buffers                  | 2048        |
 wal_keep_segments            | 5000        |
 wal_level                    | hot_standby |
 wal_receiver_status_interval | 10          |
 wal_segment_size             | 2048        |
 wal_sender_delay             | 1000        |
 wal_sync_method              | fdatasync   |
 wal_writer_delay             | 200         |

Re: BUG #9635: Wal sender process is using 100% CPU

From: Heikki Linnakangas
Date:
On 03/19/2014 07:13 PM, jkoceniak@mediamath.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      9635
> Logged by:          Jamie Koceniak
> Email address:      jkoceniak@mediamath.com
> PostgreSQL version: 9.1.9
> Operating system:   x86_64-unknown-linux-gnu (Debian 4.7.2-5) 64-bit
> Description:
>
> Periodically throughout the day, we keep seeing the wal sender process
> utilize 100% of the CPU. We began noticing this after we added 2 new slave
> servers, going from 2 to 4 slaves. See top results and I also included our
> wal settings. Thanks!

Hmm. There was a bug that caused walsender to go into a busy-loop, when
the standby disconnected. But that was fixed in 9.1.4 already.

Could you attach to one of the processes with gdb and see what functions
are being called? Or strace, or "perf top" or something.

- Heikki

Re: BUG #9635: Wal sender process is using 100% CPU

From: Jamie Koceniak
Date:
Hi Heikki,

I performed an strace on one of the pids that was consuming 100% of the cpu.
This is just a portion of the log.
I also trimmed out some data provided in the quotes in the read and sends, not sure if you need it.
A majority of the calls in the trace log were recvfrom() and sendto().
EAGAIN (Resource temporarily unavailable) came up time and time again.

# strace log during 100% cpu utilization on the master

getppid()                               = 5474
recvfrom(10, 0x7f5dgac1f753, 5, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
read(7, ""..., 131072) = 131072
sendto(10, ""..., 3178, 0, NULL, 0) = 3178
sendto(10, ""..., 5381, 0, NULL, 0) = 5381
getppid()                               = 5474
recvfrom(10, 0x7f5dgac1f753, 5, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
read(7, ""..., 131072) = 131072
sendto(10, ""..., 3338, 0, NULL, 0) = 3338
getppid()                               = 5474
recvfrom(10, " ", 5, 0, NULL, NULL) = 5
recvfrom(10, "", 32, 0, NULL, NULL) = 32
recvfrom(10, "", 5, 0, NULL, NULL) = 5
recvfrom(10, ""..., 48, 0, NULL, NULL) = 48
fcntl(10, F_GETFL)                      = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(10, F_SETFL, O_RDWR)              = 0
fcntl(10, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
recvfrom(10, 0x7f5dgac1f753, 5, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
select(11, [3 10], [10], NULL, {1, 0})  = ? ERESTARTNOHAND (To be restarted)
--- SIGUSR1 (User defined signal 1) @ 0 (0) ---
write(6, "\0", 1)                       = 1
rt_sigreturn(0x6)                       = -1 EINTR (Interrupted system call)
read(3, "\0", 16)                       = 1
getppid()                               = 5474
recvfrom(10, 5, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
sendto(10, ""..., 2715, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
read(3, 0x7ffl961b0780, 16)             = -1 EAGAIN (Resource temporarily unavailable)
select(11, [3 10], [10], NULL, {1, 0})  = ? ERESTARTNOHAND (To be restarted)
--- SIGUSR1 (User defined signal 1) @ 0 (0) ---
write(6, "\0", 1)                       = 1
rt_sigreturn(0x6)                       = -1 EINTR (Interrupted system call)
read(3, "\0", 16)                       = 1
getppid()                               = 5474

I currently don't have perf installed.
I can get that installed and send results if you need it.

Thanks!

-----Original Message-----
From: Heikki Linnakangas [mailto:hlinnakangas@vmware.com]
Sent: Tuesday, March 25, 2014 4:44 AM
To: Jamie Koceniak
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #9635: Wal sender process is using 100% CPU

On 03/19/2014 07:13 PM, jkoceniak@mediamath.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      9635
> Logged by:          Jamie Koceniak
> Email address:      jkoceniak@mediamath.com
> PostgreSQL version: 9.1.9
> Operating system:   x86_64-unknown-linux-gnu (Debian 4.7.2-5) 64-bit
> Description:
>
> Periodically throughout the day, we keep seeing the wal sender process
> utilize 100% of the CPU. We began noticing this after we added 2 new
> slave servers, going from 2 to 4 slaves. See top results and I also
> included our wal settings. Thanks!

Hmm. There was a bug that caused walsender to go into a busy-loop, when the standby disconnected. But that was fixed in 9.1.4 already.

Could you attach to one of the processes with gdb and see what functions are being called? Or strace, or "perf top" or something.

- Heikki

Re: BUG #9635: Wal sender process is using 100% CPU

From: Heikki Linnakangas
Date:
On 05/05/2014 10:37 PM, Jamie Koceniak wrote:
> Hi Heikki,
>
> I performed an strace on one of the pids that was consuming 100% of the cpu.
> This is just a portion of the log.
> I also trimmed out some data provided in the quotes in the read and sends, not sure if you need it.
> A majority of the calls in the trace log were recvfrom() and sendto().
> EAGAIN (Resource temporarily unavailable) came up time and time again.
>
> # strace log during 100% cpu utilization on the master

Hmm, that trace looks normal to me. It doesn't look like it's
busy-looping, as I would expect if it's using 100% of the CPU.

Can you post the full log, or at least a larger chunk? I don't need the
data in the quotes from read() and sends() that you removed, but I would
like to see the trace from a longer period of time to see what the
pattern is.

- Heikki

Re: BUG #9635: Wal sender process is using 100% CPU

From: Jamie Koceniak
Date:
Hi Heikki,

Attached is a trace route for one of the pids that was running at 100% cpu.
Note: when the wal_senders hit 100% cpu, our slave servers tend to lag the master for up to around 3-4 minutes until the wal_sender cpu utilization drops. This is very problematic for us since we are trying to use the slaves for reads in our application.

Here is perf top results:
 39.49%  libz.so.1.2.7           [.] 0x000000000000301e
 32.97%  postgres                [.] 0x00000000000cf150
  6.50%  postgres                [.] nocachegetattr
  2.63%  postgres                [.] btint4cmp
  1.64%  postgres                [.] _start
  1.40%  [unknown]               [.] 0x00007f9dfca6a34b
  1.06%  libc-2.13.so            [.] 0x000000000012934b
  0.94%  postgres                [.] slot_getattr
  0.85%  libcrypto.so.1.0.0      [.] 0x000000000009f48f
  0.69%  postgres                [.] ExecProject
  0.67%  libz.so.1.2.7           [.] adler32
  0.66%  postgres                [.] GetMemoryChunkSpace
  0.58%  [kernel]                [k] copy_user_generic_string
  0.45%  postgres                [.] heap_fill_tuple
  0.38%  postgres                [.] execTuplesMatch
  0.27%  postgres                [.] FunctionCall2Coll
  0.27%  postgres                [.] ExecProcNode
  0.25%  postgres                [.] heap_tuple_untoast_attr
  0.22%  postgres                [.] MemoryContextAlloc
  0.21%  postgres                [.] heap_form_minimal_tuple
  0.21%  postgres                [.] pfree
  0.19%  postgres                [.] heap_compute_data_size

 Here is some data using perf record and perf report:

+  33.93%        193019       postgres  postgres                  [.] 0x0000000000369e27
+  30.06%        173825       postgres  libz.so.1.2.7             [.] 0x00000000000036c0
+  10.21%         60618       postgres  [unknown]                 [.] 0x00007f9dfc2fd529
+   4.81%         27275       postgres  postgres                  [.] nocachegetattr
+   2.55%         14443       postgres  postgres                  [.] btint4cmp
+   1.22%          6916       postgres  postgres                  [.] ExecProject
+   1.20%          6837       postgres  postgres                  [.] _start
+   1.10%          6495       postgres  libc-2.13.so              [.] 0x000000000012b8e9
+   0.84%          4760       postgres  postgres                  [.] heap_fill_tuple
+   0.74%          4265       postgres  [kernel.kallsyms]         [k] copy_user_generic_string
+   0.71%          4001       postgres  postgres                  [.] GetMemoryChunkSpace
+   0.66%          3881       postgres  libcrypto.so.1.0.0        [.] 0x0000000000081ec7
+   0.63%          3673       postgres  libz.so.1.2.7             [.] adler32
+   0.43%          2448       postgres  postgres                  [.] slot_getattr
+   0.40%          2277       postgres  postgres                  [.] ExecProcNode
+   0.40%          2250       postgres  postgres                  [.] heap_form_minimal_tuple
+   0.37%          2078       postgres  postgres                  [.] heap_compute_data_size
+   0.31%          1739       postgres  postgres                  [.] ExecScan
+   0.27%          1779       postgres  [kernel.kallsyms]         [k] page_fault
+   0.24%          1374       postgres  postgres                  [.] LogicalTapeWrite
+   0.24%          4206        swapper  [kernel.kallsyms]         [k] intel_idle
+   0.21%          1206       postgres  postgres                  [.] slot_getsomeattrs

Also, here are our wal settings on the master.
Do these settings appear ok?

wal_block_size                  | 8192                                     | 
 wal_buffers                     | 16MB                                     | 
 wal_keep_segments               | 5000                                     | 
 wal_level                       | hot_standby                              | 
 wal_receiver_status_interval    | 10s                                      | 
 wal_segment_size                | 16MB                                     | 
 wal_sender_delay                | 1s                                       | 
 wal_sync_method                 | fdatasync                                | 
 wal_writer_delay                | 200ms                                    |

Thanks!

-----Original Message-----
From: Heikki Linnakangas [mailto:hlinnakangas@vmware.com] 
Sent: Tuesday, May 06, 2014 4:01 AM
To: Jamie Koceniak
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #9635: Wal sender process is using 100% CPU

On 05/05/2014 10:37 PM, Jamie Koceniak wrote:
> Hi Heikki,
>
> I performed an strace on one of the pids that was consuming 100% of the cpu.
> This is just a portion of the log.
> I also trimmed out some data provided in the quotes in the read and sends, not sure if you need it.
> A majority of the calls in the trace log were recvfrom() and sendto().
> EAGAIN (Resource temporarily unavailable) came up time and time again.
>
> # strace log during 100% cpu utilization on the master

Hmm, that trace looks normal to me. It doesn't look like it's busy-looping, as I would expect if it's using 100% of the
CPU.

Can you post the full log, or at least a larger chunk? I don't need the data in the quotes from read() and sends() that you removed, but I would like to see the trace from a longer period of time to see what the pattern is.

- Heikki

Attachments

Re: BUG #9635: Wal sender process is using 100% CPU

From: Heikki Linnakangas
Date:
On 05/06/2014 07:44 PM, Jamie Koceniak wrote:
> Hi Heikki,
>
> Attached is a trace route for one of the pids that was running at 100% cpu.
> Note: when the wal_senders hit 100% cpu, our slave servers tend to lag the master for up to around 3-4 minutes until the wal_sender cpu utilization drops. This is very problematic for us since we are trying to use the slaves for reads in our application.
>
> Here is perf top results:
> 39.49%  libz.so.1.2.7           [.] 0x000000000000301e
>   32.97%  postgres                [.] 0x00000000000cf150
>    6.50%  postgres                [.] nocachegetattr
>    2.63%  postgres                [.] btint4cmp
>    1.64%  postgres                [.] _start
>    1.40%  [unknown]               [.] 0x00007f9dfca6a34b
>    1.06%  libc-2.13.so            [.] 0x000000000012934b
>    0.94%  postgres                [.] slot_getattr
>    0.85%  libcrypto.so.1.0.0      [.] 0x000000000009f48f
>    0.69%  postgres                [.] ExecProject
>    0.67%  libz.so.1.2.7           [.] adler32
>    0.66%  postgres                [.] GetMemoryChunkSpace
>    0.58%  [kernel]                [k] copy_user_generic_string
>    0.45%  postgres                [.] heap_fill_tuple
>    0.38%  postgres                [.] execTuplesMatch
>    0.27%  postgres                [.] FunctionCall2Coll
>    0.27%  postgres                [.] ExecProcNode
>    0.25%  postgres                [.] heap_tuple_untoast_attr
>    0.22%  postgres                [.] MemoryContextAlloc
>    0.21%  postgres                [.] heap_form_minimal_tuple
>    0.21%  postgres                [.] pfree
>    0.19%  postgres                [.] heap_compute_data_size

Hmm, these results seem to include all processes, not just the WAL sender.

>   Here is some data using perf record and perf report:
>
> +  33.93%        193019       postgres  postgres                  [.] 0x0000000000369e27
> +  30.06%        173825       postgres  libz.so.1.2.7             [.] 0x00000000000036c0
> +  10.21%         60618       postgres  [unknown]                 [.] 0x00007f9dfc2fd529
> +   4.81%         27275       postgres  postgres                  [.] nocachegetattr
> +   2.55%         14443       postgres  postgres                  [.] btint4cmp
> +   1.22%          6916       postgres  postgres                  [.] ExecProject
> +   1.20%          6837       postgres  postgres                  [.] _start
> +   1.10%          6495       postgres  libc-2.13.so              [.] 0x000000000012b8e9
> +   0.84%          4760       postgres  postgres                  [.] heap_fill_tuple
> +   0.74%          4265       postgres  [kernel.kallsyms]         [k] copy_user_generic_string
> +   0.71%          4001       postgres  postgres                  [.] GetMemoryChunkSpace
> +   0.66%          3881       postgres  libcrypto.so.1.0.0        [.] 0x0000000000081ec7
> +   0.63%          3673       postgres  libz.so.1.2.7             [.] adler32
> +   0.43%          2448       postgres  postgres                  [.] slot_getattr
> +   0.40%          2277       postgres  postgres                  [.] ExecProcNode
> +   0.40%          2250       postgres  postgres                  [.] heap_form_minimal_tuple
> +   0.37%          2078       postgres  postgres                  [.] heap_compute_data_size
> +   0.31%          1739       postgres  postgres                  [.] ExecScan
> +   0.27%          1779       postgres  [kernel.kallsyms]         [k] page_fault
> +   0.24%          1374       postgres  postgres                  [.] LogicalTapeWrite
> +   0.24%          4206        swapper  [kernel.kallsyms]         [k] intel_idle
> +   0.21%          1206       postgres  postgres                  [.] slot_getsomeattrs

Same here.

> Also, here are our wal settings on the master.
> Do these settings appear ok?
>
> wal_block_size                  | 8192                                     |
>   wal_buffers                     | 16MB                                     |
>   wal_keep_segments               | 5000                                     |
>   wal_level                       | hot_standby                              |
>   wal_receiver_status_interval    | 10s                                      |
>   wal_segment_size                | 16MB                                     |
>   wal_sender_delay                | 1s                                       |
>   wal_sync_method                 | fdatasync                                |
>   wal_writer_delay                | 200ms                                    |

Looks reasonable to me.

So, the strace output repeats this, with small variations in the number
of bytes written in each sendto() call:

recvfrom(10, 0x7f9dfec1f753, 5, 0, 0, 0) = -1 EAGAIN (Resource
temporarily unavailable)
read(7, ..., 131072) = 131072
sendto(10, ..., 3690, 0, NULL, 0) = 3690
sendto(10, ..., 3605, 0, NULL, 0) = 3605
sendto(10, ..., 3669, 0, NULL, 0) = 3669
sendto(10, ..., 3653, 0, NULL, 0) = 3653
sendto(10, ..., 3637, 0, NULL, 0) = 3637
sendto(10, ..., 3621, 0, NULL, 0) = 3621
sendto(10, ..., 3669, 0, NULL, 0) = 3669
sendto(10, ..., 3605, 0, NULL, 0) = 3605
sendto(10, ..., 53, 0, NULL, 0) = 53
getppid()                               = 5474

The read() reads 128kB of WAL from the file in pg_xlog, and it is then
sent to the client. Now, it's pretty surprising that the number of bytes
written with sendto() doesn't match the number of bytes read from the
file. Unless, you're using SSL with compression. That would also explain
the high amount of CPU time spent in libz in the perf output.
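Heikki's inference can be sanity-checked with a quick sketch: if the stream is compressed before it hits the socket, a 128 kB read can easily shrink to a few kB on the wire, matching the ~3.6 kB sendto() sizes above. This uses synthetic repetitive data, not real WAL, and Python's zlib purely for illustration:

```python
import zlib

# Simulate one 128 kB (131072-byte) WAL read; WAL full of similar tuples
# is highly repetitive, so it compresses well.
chunk = (b"INSERT tuple data " * 456)[:8192] * 16
assert len(chunk) == 131072

# zlib deflate, as used by SSL compression.
compressed = zlib.compress(chunk, 6)
print(f"read {len(chunk)} bytes, sent ~{len(compressed)} bytes compressed")
```

The exact ratio depends on the data, but the asymmetry between read() and sendto() sizes is the signature of compression in the path.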

Try disabling SSL compression, with sslcompression=0 option in
primary_conninfo, or disable SSL altogether if the network is
trustworthy enough for that.
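On 9.1, primary_conninfo is set in the standby's recovery.conf. A sketch of both suggestions (hostname and user are placeholders, and sslcompression=0 assumes a libpq that recognizes that option):

```
# recovery.conf on the standby -- host/user below are placeholders
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=replication sslcompression=0'

# or, if the network is trusted, avoid SSL entirely:
# primary_conninfo = 'host=master.example.com user=replication sslmode=disable'
```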

- Heikki

Re: BUG #9635: Wal sender process is using 100% CPU

From: Andres Freund
Date:
Hi,

On 2014-03-19 17:13:56 +0000, jkoceniak@mediamath.com wrote:
> Periodically throughout the day, we keep seeing the wal sender process
> utilize 100% of the CPU. We began noticing this after we added 2 new slave
> servers, going from 2 to 4 slaves. See top results and I also included our
> wal settings. Thanks!

A typical reason for this is that the standbys are connecting over
ssl. Is that the case? If so, intentionally?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #9635: Wal sender process is using 100% CPU

From: Jamie Koceniak
Date:
I was able to recreate the issue.
Basically, we have a job that inserts 30+ million records into a table.
During the loading of this table, the wal_sender cpu spikes to 100% and appears to stay at 100% until the replication is completed on the slave.
I was able to test inserting this much data on another master/slave configuration with streaming replication and saw the same results.

Other than offloading this job, is there anything we can do to decrease the replication lag?
Also, is this expected behavior due to the volume of data that got inserted and needs to get replicated?

Thanks!

-----Original Message-----
From: Heikki Linnakangas [mailto:hlinnakangas@vmware.com]
Sent: Tuesday, May 06, 2014 10:33 AM
To: Jamie Koceniak
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #9635: Wal sender process is using 100% CPU

Re: BUG #9635: Wal sender process is using 100% CPU

From: Andres Freund
Date:
Hi,

On 2014-05-07 03:02:22 +0000, Jamie Koceniak wrote:
> The ssl option is on in postgresql.conf but we don't have any hostssl lines or cert configured as an authentication method in our pg_hba.conf file.

Add sslmode=disable to the standby's primary_conninfo to be certain. The
profile suggests ssl is active. 'host' allows ssl connections, it just
doesn't force them to be used.

Also, zlib seems to be a contention point. If that profile is from a
standby, could it be that you're concurrently generating base backups?
Or compressing files in a archive_command?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #9635: Wal sender process is using 100% CPU

From: Jamie Koceniak
Date:
Hi Andres,

The ssl option is on in postgresql.conf but we don't have any hostssl lines or cert configured as an authentication method in our pg_hba.conf file.

In our pg_hba.conf we have configured each replication host as trust:
host    replication     all             xx.xxx.xxx.xx/32        trust

I replied in a previous email that we have a job that inserts 30+ million records into a table.
During the loading of this table, the wal_sender cpu spikes to 100% and appears to stay at 100% until the replication is completed on the slave.
I have been able to reproduce this.  Other than offloading this job, is there anything we can do to improve the replication lag?

Thanks!

-----Original Message-----
From: Andres Freund [mailto:andres@2ndquadrant.com]
Sent: Tuesday, May 06, 2014 3:42 PM
To: Jamie Koceniak
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #9635: Wal sender process is using 100% CPU

Hi,

On 2014-03-19 17:13:56 +0000, jkoceniak@mediamath.com wrote:
> Periodically throughout the day, we keep seeing the wal sender process
> utilize 100% of the CPU. We began noticing this after we added 2 new
> slave servers, going from 2 to 4 slaves. See top results and I also
> included our wal settings. Thanks!

A typical reason for this is that the standbys are connecting over ssl. Is that the case? If so, intentionally?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services