Discussion: Abnormally high memory usage/OOM triggered

Abnormally high memory usage/OOM triggered

From
Davlet Panech
Date:
Hello,

I'm troubleshooting a problem with a Postgres installation (Linux): a
client process was killed by the OOM killer while executing an UPDATE
statement. How can I avoid this in the future?

From the kernel logs it appears that the client process was using
~19 GB of total virtual memory when it was killed, which seems far too
high.

Does my configuration look reasonable? I just don't understand how it
could possibly use up 19 GB of memory based on the configuration below
(a rough worst-case estimate is sketched after the settings). Is there
a memory leak in there somewhere?

I'm using Postgres 9.4.8 on x86_64-redhat-linux-gnu with 16GB of 
physical RAM and 8GB of swap space.

Postgres configuration:
=======================

   wal_level                      = hot_standby
   max_wal_senders                = 3
   checkpoint_segments            = 20
   checkpoint_completion_target   = 0.8
   wal_keep_segments              = 500
   hot_standby                    = on
   max_standby_streaming_delay    = 10s
   maintenance_work_mem           = 128MB
   wal_sender_timeout             = 20s
   wal_receiver_status_interval   = 10s

   shared_buffers                 = 2560MB
   maintenance_work_mem           = 256MB
   autovacuum_max_workers         = 3
   autovacuum_work_mem            = -1
   work_mem                       = 15695kB
   effective_cache_size           = 8192MB
   max_connections                = 200
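
For context, a crude upper bound implied by these settings comes out well
under 19 GB. This is only a back-of-the-envelope sketch (assuming one
work_mem allocation per allowed connection on top of shared_buffers), not
an exact model of PostgreSQL's memory accounting:

   -- Rough, hypothetical worst case from the settings above:
   -- shared_buffers plus one work_mem allocation per allowed connection.
   SELECT pg_size_pretty(
            (2560::bigint * 1024 * 1024)    -- shared_buffers = 2560MB
          + (200::bigint * 15695 * 1024)    -- max_connections * work_mem (15695kB)
          ) AS rough_worst_case;            -- roughly 5.5 GB, nowhere near 19 GB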

/var/log/messages:
=======================

(See line that says "Killed process 10540 ..." towards the end)

Jan 16 17:08:37 aimapp1 kernel: ubiatn invoked oom-killer: 
gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Jan 16 17:08:37 aimapp1 kernel: ubiatn cpuset=/ mems_allowed=0-1
Jan 16 17:08:37 aimapp1 kernel: Pid: 7181, comm: ubiatn Not tainted 
2.6.32-504.el6.x86_64 #1
Jan 16 17:08:37 aimapp1 kernel: Call Trace:
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff810d40c1>] ? 
cpuset_print_task_mems_allowed+0x91/0xb0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff81127300>] ? 
dump_header+0x90/0x1b0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8122ea2c>] ? 
security_real_capable_noaudit+0x3c/0x70
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff81127782>] ? 
oom_kill_process+0x82/0x2a0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff811276c1>] ? 
select_bad_process+0xe1/0x120
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff81127bc0>] ? 
out_of_memory+0x220/0x3c0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff811344df>] ? 
__alloc_pages_nodemask+0x89f/0x8d0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8116c69a>] ? 
alloc_pages_current+0xaa/0x110
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff811246f7>] ? 
__page_cache_alloc+0x87/0x90
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff811240de>] ? 
find_get_page+0x1e/0xa0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff81125697>] ? 
filemap_fault+0x1a7/0x500
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8114eae4>] ? __do_fault+0x54/0x530
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8114f0b7>] ? 
handle_pte_fault+0xf7/0xb00
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff81063bf3>] ? 
perf_event_task_sched_out+0x33/0x70
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff810097cc>] ? 
__switch_to+0x1ac/0x320
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8114fcea>] ? 
handle_mm_fault+0x22a/0x300
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff815299be>] ? 
thread_return+0x4e/0x7d0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8104d0d8>] ? 
__do_page_fault+0x138/0x480
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff810a3def>] ? 
hrtimer_try_to_cancel+0x3f/0xd0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff810a3ea2>] ? 
hrtimer_cancel+0x22/0x30
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8152c053>] ? 
do_nanosleep+0x93/0xc0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff810a3f74>] ? 
hrtimer_nanosleep+0xc4/0x180
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff810a2dd0>] ? 
hrtimer_wakeup+0x0/0x30
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8152ffbe>] ? 
do_page_fault+0x3e/0xa0
Jan 16 17:08:37 aimapp1 kernel: [<ffffffff8152d375>] ? page_fault+0x25/0x30
Jan 16 17:08:37 aimapp1 kernel: Mem-Info:
Jan 16 17:08:37 aimapp1 kernel: Node 0 DMA per-cpu:
Jan 16 17:08:37 aimapp1 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    4: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    5: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    6: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    7: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    8: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    9: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   10: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   11: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   12: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   13: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   14: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   15: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   16: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   17: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   18: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   19: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   20: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   21: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   22: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   23: hi:    0, btch:   1 usd:   0
Jan 16 17:08:37 aimapp1 kernel: Node 0 DMA32 per-cpu:
Jan 16 17:08:37 aimapp1 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    8: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    9: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   10: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   11: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   12: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   13: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   14: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   15: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   16: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   17: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   18: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   19: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   20: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   21: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   22: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   23: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: Node 0 Normal per-cpu:
Jan 16 17:08:37 aimapp1 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    8: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    9: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   10: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   11: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   12: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   13: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   14: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   15: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   16: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   17: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   18: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   19: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   20: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   21: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   22: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   23: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: Node 1 Normal per-cpu:
Jan 16 17:08:37 aimapp1 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    8: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU    9: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   10: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   11: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   12: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   13: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   14: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   15: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   16: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   17: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   18: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   19: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   20: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   21: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   22: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: CPU   23: hi:  186, btch:  31 usd:   0
Jan 16 17:08:37 aimapp1 kernel: active_anon:3428843 inactive_anon:500193 
isolated_anon:0
Jan 16 17:08:37 aimapp1 kernel: active_file:311 inactive_file:90 
isolated_file:0
Jan 16 17:08:37 aimapp1 kernel: unevictable:0 dirty:0 writeback:48 
unstable:0
Jan 16 17:08:37 aimapp1 kernel: free:32365 slab_reclaimable:7217 
slab_unreclaimable:19877
Jan 16 17:08:37 aimapp1 kernel: mapped:558347 shmem:665176 
pagetables:26928 bounce:0
Jan 16 17:08:37 aimapp1 kernel: Node 0 DMA free:15748kB min:84kB 
low:104kB high:124kB active_anon:0kB inactive_anon:0kB active_file:0kB 
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:15364kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 
all_unreclaimable? yes
Jan 16 17:08:37 aimapp1 kernel: lowmem_reserve[]: 0 1848 7908 7908
Jan 16 17:08:37 aimapp1 kernel: Node 0 DMA32 free:34592kB min:10404kB 
low:13004kB high:15604kB active_anon:1219496kB inactive_anon:341812kB 
active_file:20kB inactive_file:0kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:1892572kB mlocked:0kB dirty:0kB 
writeback:36kB mapped:180384kB shmem:257528kB slab_reclaimable:4456kB 
slab_unreclaimable:3264kB kernel_stack:32kB pagetables:14376kB 
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:69 
all_unreclaimable? no
Jan 16 17:08:37 aimapp1 kernel: lowmem_reserve[]: 0 0 6060 6060
Jan 16 17:08:37 aimapp1 kernel: Node 0 Normal free:33700kB min:34120kB 
low:42648kB high:51180kB active_anon:5321596kB inactive_anon:759600kB 
active_file:992kB inactive_file:460kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:6205440kB mlocked:0kB dirty:0kB 
writeback:136kB mapped:941364kB shmem:1114560kB slab_reclaimable:10220kB 
slab_unreclaimable:52948kB kernel_stack:4312kB pagetables:39156kB 
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1387 
all_unreclaimable? no
Jan 16 17:08:37 aimapp1 kernel: lowmem_reserve[]: 0 0 0 0
Jan 16 17:08:37 aimapp1 kernel: Node 1 Normal free:45420kB min:45496kB 
low:56868kB high:68244kB active_anon:7174280kB inactive_anon:899360kB 
active_file:232kB inactive_file:16kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:8273920kB mlocked:0kB dirty:0kB 
writeback:20kB mapped:1111640kB shmem:1288616kB slab_reclaimable:14192kB 
slab_unreclaimable:23296kB kernel_stack:1000kB pagetables:54180kB 
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:82 
all_unreclaimable? no
Jan 16 17:08:37 aimapp1 kernel: lowmem_reserve[]: 0 0 0 0
Jan 16 17:08:37 aimapp1 kernel: Node 0 DMA: 3*4kB 3*8kB 2*16kB 0*32kB 
1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15748kB
Jan 16 17:08:37 aimapp1 kernel: Node 0 DMA32: 587*4kB 482*8kB 338*16kB 
188*32kB 70*64kB 35*128kB 8*256kB 8*512kB 2*1024kB 0*2048kB 0*4096kB = 
34780kB
Jan 16 17:08:37 aimapp1 kernel: Node 0 Normal: 795*4kB 493*8kB 272*16kB 
145*32kB 74*64kB 40*128kB 22*256kB 6*512kB 0*1024kB 0*2048kB 0*4096kB = 
34676kB
Jan 16 17:08:37 aimapp1 kernel: Node 1 Normal: 1113*4kB 712*8kB 358*16kB 
226*32kB 104*64kB 41*128kB 21*256kB 11*512kB 1*1024kB 0*2048kB 0*4096kB 
= 47044kB
Jan 16 17:08:37 aimapp1 kernel: 721314 total pagecache pages
Jan 16 17:08:37 aimapp1 kernel: 55586 pages in swap cache
Jan 16 17:08:37 aimapp1 kernel: Swap cache stats: add 16359283, delete 
16303697, find 15346604869/15347200740
Jan 16 17:08:37 aimapp1 kernel: Free swap  = 0kB
Jan 16 17:08:37 aimapp1 kernel: Total swap = 8388604kB
Jan 16 17:08:37 aimapp1 kernel: 4194303 pages RAM
Jan 16 17:08:37 aimapp1 kernel: 144188 pages reserved
Jan 16 17:08:37 aimapp1 kernel: 2433569 pages shared
Jan 16 17:08:37 aimapp1 kernel: 3447964 pages non-shared
Jan 16 17:08:37 aimapp1 kernel: [ pid ]   uid  tgid total_vm      rss 
cpu oom_adj oom_score_adj name
Jan 16 17:08:37 aimapp1 kernel: [ 1042]     0  1042     2663        8 
6     -17         -1000 udevd
Jan 16 17:08:37 aimapp1 kernel: [ 2445]     0  2445    23283       35 
19     -17         -1000 auditd
Jan 16 17:08:37 aimapp1 kernel: [ 2475]     0  2475    62991      609 
18       0             0 rsyslogd
Jan 16 17:08:37 aimapp1 kernel: [ 2541]     0  2541     4585       55 
12       0             0 irqbalance
Jan 16 17:08:37 aimapp1 kernel: [ 2557]    32  2557     4744       15 
0       0             0 rpcbind
Jan 16 17:08:37 aimapp1 kernel: [ 2577]    29  2577     5837        1 
0       0             0 rpc.statd
Jan 16 17:08:37 aimapp1 kernel: [ 2608]     0  2608   154350     4393 
1       0             0 corosync
Jan 16 17:08:37 aimapp1 kernel: [ 2711]    81  2711     5391        9 
0       0             0 dbus-daemon
Jan 16 17:08:37 aimapp1 kernel: [ 2729]     0  2729    47352        1 
6       0             0 cupsd
Jan 16 17:08:37 aimapp1 kernel: [ 2758]     0  2758     1020        0 
6       0             0 acpid
Jan 16 17:08:37 aimapp1 kernel: [ 2768]    68  2768     9714      292 
6       0             0 hald
Jan 16 17:08:37 aimapp1 kernel: [ 2769]     0  2769     5100        5 
8       0             0 hald-runner
Jan 16 17:08:37 aimapp1 kernel: [ 2801]     0  2801     5630        9
7       0             0 hald-addon-inpu
Jan 16 17:08:37 aimapp1 kernel: [ 2810]    68  2810     4502        1 
0       0             0 hald-addon-acpi
Jan 16 17:08:37 aimapp1 kernel: [ 2837]     0  2837   113175       47 
12       0             0 automount
Jan 16 17:08:37 aimapp1 kernel: [ 2881]     0  2881     1565        0 
19       0             0 mcelog
Jan 16 17:08:37 aimapp1 kernel: [ 2897]     0  2897    16673        8 
6     -17         -1000 sshd
Jan 16 17:08:37 aimapp1 kernel: [ 2906]    38  2906     6628       42 
0       0             0 ntpd
Jan 16 17:08:37 aimapp1 kernel: [ 2915]   496  2915    10338       17 
7       0             0 nrpe
Jan 16 17:08:37 aimapp1 kernel: [ 3019]     0  3019    20333       32 
6       0             0 master
Jan 16 17:08:37 aimapp1 kernel: [ 3040]    89  3040    20399       39 
0       0             0 qmgr
Jan 16 17:08:37 aimapp1 kernel: [ 3045]     0  3045    28661        9 
0       0             0 abrtd
Jan 16 17:08:37 aimapp1 kernel: [ 3057]     0  3057    65362       45 
0       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [ 3068]     0  3068    29341       30 
0       0             0 crond
Jan 16 17:08:37 aimapp1 kernel: [ 3090]     0  3090     5394        4 
12       0             0 atd
Jan 16 17:08:37 aimapp1 kernel: [ 3165]     0  3165    26868       25 
19       0             0 pacemakerd
Jan 16 17:08:37 aimapp1 kernel: [ 3173]   189  3173    28658     2694 
9       0             0 cib
Jan 16 17:08:37 aimapp1 kernel: [ 3175]     0  3175    26870     1336 
13       0             0 stonithd
Jan 16 17:08:37 aimapp1 kernel: [ 3176]     0  3176    17978      116 
7       0             0 lrmd
Jan 16 17:08:37 aimapp1 kernel: [ 3177]   189  3177    23416      849 
0       0             0 attrd
Jan 16 17:08:37 aimapp1 kernel: [ 3179]   189  3179    26847       19 
12       0             0 pengine
Jan 16 17:08:37 aimapp1 kernel: [ 3877]     0  3877     1016        1 
6       0             0 mingetty
Jan 16 17:08:37 aimapp1 kernel: [ 3879]     0  3879     1016        1 
19       0             0 mingetty
Jan 16 17:08:37 aimapp1 kernel: [ 3881]     0  3881     1016        1 
22       0             0 mingetty
Jan 16 17:08:37 aimapp1 kernel: [ 3883]     0  3883     1016        1 
10       0             0 mingetty
Jan 16 17:08:37 aimapp1 kernel: [ 3885]     0  3885     1016        1 
21       0             0 mingetty
Jan 16 17:08:37 aimapp1 kernel: [ 6813]     0  6813     2662        7 
6     -17         -1000 udevd
Jan 16 17:08:37 aimapp1 kernel: [ 6814]     0  6814     2662        7 
6     -17         -1000 udevd
Jan 16 17:08:37 aimapp1 kernel: [ 1006]     0  1006  1029170        8 
12       0             0 console-kit-dae
Jan 16 17:08:37 aimapp1 kernel: [ 8046]     0  8046     1016        6 
1       0             0 mingetty
Jan 16 17:08:37 aimapp1 kernel: [15669]    26 15669   723376    12231 
16     -17         -1000 postgres
Jan 16 17:08:37 aimapp1 kernel: [15671]    26 15671    44567       40 
1       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [15673]    26 15673   723800   284257 
6       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [15674]    26 15674   723732    31588 
6       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [15675]    26 15675    45277      149 
0       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [15764]    26 15764   723697     2788 
11       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [15765]    26 15765   723834      175 
6       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [16521]    26 16521   723914      164 
16       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [ 6474]   189  6474    32269      397 
0       0             0 crmd
Jan 16 17:08:37 aimapp1 kernel: [14919]     0 14919    24626       88 
5       0             0 sshd
Jan 16 17:08:37 aimapp1 kernel: [15679]     0 15679    27580        8 
12       0             0 bash
Jan 16 17:08:37 aimapp1 kernel: [10151]    26 10151   729494   337404 
12       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10153]    26 10153   728940      993 
1       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10156]    26 10156   725774     1153 
15       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10157]    26 10157   725921     1061 
0       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10244]    26 10244   732881     3691 
1       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10245]    26 10245   728956      986 
18       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10248]    26 10248   732628     4575 
13       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10273]    26 10273   729495   337372 
1       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [30924]     0 30924    25736     4025 
2       0             0 crm_mon
Jan 16 17:08:37 aimapp1 kernel: [ 4951]     0  4951    24689      118 
5       0             0 sshd
Jan 16 17:08:37 aimapp1 kernel: [ 5360]     0  5360    27613        8 
6       0             0 bash
Jan 16 17:08:37 aimapp1 kernel: [21108]     0 21108    24592       82 
17       0             0 sshd
Jan 16 17:08:37 aimapp1 kernel: [21171]     0 21171    27613       63 
7       0             0 bash
Jan 16 17:08:37 aimapp1 kernel: [31495]     0 31495    24936       95 
0       0             0 sshd
Jan 16 17:08:37 aimapp1 kernel: [31590]     0 31590    14463        9 
12       0             0 sftp-server
Jan 16 17:08:37 aimapp1 kernel: [ 7174]   500  7174  1115232    68649 
0       0             0 ubiatn
Jan 16 17:08:37 aimapp1 kernel: [14606]     0 14606    25237       19 
2       0             0 tail
Jan 16 17:08:37 aimapp1 kernel: [ 8278]    26  8278   724157     1098 
1       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [ 8468]    26  8468   726500     3147 
9       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [ 8679]    26  8679   727303     4279 
2       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [10540]    26 10540  4738280  3427932 
18       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [ 1276]    26  1276   724168   377697 
9       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [ 1962]    26  1962   724145   377746 
1       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [29437]    48 29437    65362       35 
15       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [29438]    48 29438    65362       37 
3       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [29439]    48 29439    65362       35 
15       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [29440]    48 29440    65362       35 
17       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [29441]    48 29441    65362       36 
5       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [29442]    48 29442    65362       35 
17       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [29443]    48 29443    65362       35 
5       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [29444]    48 29444    65362       35 
17       0             0 httpd
Jan 16 17:08:37 aimapp1 kernel: [24913]    91 24913  2447253   216518 
20       0             0 java
Jan 16 17:08:37 aimapp1 kernel: [22150]    26 22150   729385     1230 
19       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [25111]    26 25111   729385     1214 
19       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [23940]    26 23940   723934      479 
19       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [21174]    26 21174   724201    48139 
7       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [27824]    26 27824   724201    76093 
18       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [16949]    26 16949   729385     1181 
20       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: [  685]    89   685    20353      243 
0       0             0 pickup
Jan 16 17:08:37 aimapp1 kernel: [20901]    26 20901   729417     1162
20       0             0 postgres
Jan 16 17:08:37 aimapp1 kernel: Out of memory: Kill process 10540 
(postgres) score 738 or sacrifice child
Jan 16 17:08:37 aimapp1 kernel: Killed process 10540, UID 26, (postgres) 
total-vm:18953120kB, anon-rss:11637464kB, file-rss:2074240kB


Re: Abnormally high memory usage/OOM triggered

From
scott ribe
Date:
On Jan 17, 2018, at 2:57 PM, Davlet Panech <dpanech@gmail.com> wrote:
>
> Does my configuration look reasonable? I just don't understand how it could possibly use up 19 GB of memory based on
> the configuration below. Is there a memory leak in there somewhere?

It does seem awfully high, but... An update can involve a join across multiple tables. Or an update can run a trigger
which can cascade. Either of those could result in an "accidental cross product" join, which can always blow up memory.
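
For illustration only (hypothetical tables, not from the original post), an
UPDATE whose FROM clause is missing its join condition behaves like a cross
product:

   -- Every row of "orders" pairs with every row of "shipments" because the
   -- join condition between the two tables is missing.
   UPDATE orders o
   SET    status = 'shipped'
   FROM   shipments s;        -- should have: WHERE s.order_id = o.id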

--
Scott Ribe
https://www.linkedin.com/in/scottribe/
(303) 722-0567



Re: Abnormally high memory usage/OOM triggered

From
Tom Lane
Date:
Davlet Panech <dpanech@gmail.com> writes:
> I'm troubleshooting a problem with a Postgres installation (Linux): a 
> client process got killed by OOM while executing an update statement, 
> Is there a memory leak in there somewhere?
> I'm using Postgres 9.4.8 on x86_64-redhat-linux-gnu with 16GB of 
> physical RAM and 8GB of swap space.

I see a possibly relevant entry in the 9.4.10 release notes:

      Fix query-lifespan memory leak in a bulk UPDATE on a table
      with a PRIMARY KEY or REPLICA IDENTITY index

Looking at the relevant commit (ae4760d66), it seems the leak was
just a few bytes per row, but if the update touches enough rows ...
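
For illustration, a statement of the shape that release note describes (a
bulk UPDATE on a table with a PRIMARY KEY) might look like the following;
the table name and row count are hypothetical:

   -- On 9.4.8 each updated row could leak a few bytes for the life of the query.
   CREATE TABLE big_table (
       id      bigint PRIMARY KEY,
       payload text
   );
   INSERT INTO big_table
   SELECT g, 'row ' || g FROM generate_series(1, 10000000) AS g;

   UPDATE big_table SET payload = payload || ' (touched)';  -- many rows in one statement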

            regards, tom lane


Re: Abnormally high memory usage/OOM triggered

From
Davlet Panech
Date:
On 1/17/2018 5:57 PM, scott ribe wrote:
> On Jan 17, 2018, at 2:57 PM, Davlet Panech <dpanech@gmail.com> wrote:
>>
>> Does my configuration look reasonable? I just don't understand how it could possibly use up 19 GB of memory based on
>> the configuration below. Is there a memory leak in there somewhere?
>
> It does seem awfully high, but... An update can involve a join across multiple tables. Or an update can run a trigger
> which can cascade. Either of those could result in an "accidental cross product" join, which can always blow up memory.

There must be a way to put an upper limit on memory even for such cases.
I was under the impression that parameters such as "work_mem" serve that
purpose; is that not the case? So an "accidental cross product" join's
memory usage is unbounded? It can't be... could somebody confirm this,
please?

Thanks,
D.


Re: Abnormally high memory usage/OOM triggered

From
scott ribe
Date:
On Jan 18, 2018, at 10:13 AM, Davlet Panech <dpanech@gmail.com> wrote:
>
> On 1/17/2018 5:57 PM, scott ribe wrote:
>> On Jan 17, 2018, at 2:57 PM, Davlet Panech <dpanech@gmail.com> wrote:
>>>
>>> Does my configuration look reasonable? I just don't understand how it could possibly use up 19 GB of memory based
>>> on the configuration below. Is there a memory leak in there somewhere?
>> It does seem awfully high, but... An update can involve a join across multiple tables. Or an update can run a
>> trigger which can cascade. Either of those could result in an "accidental cross product" join, which can always blow up
>> memory.
> There must be a way to put an upper limit on memory even for such cases. I was under the impression that parameters
> such as "work_mem" serve that purpose, is that not the case? So an "accidental cross product" join's memory usage is
> unbounded? It can't be... could somebody confirm this please?

You are correct as far as I know, so yeah, that case should result in filling disk, not RAM.
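
A quick way to see that spilling behaviour (a minimal sketch, assuming
nothing beyond a scratch query) is to force a sort larger than work_mem and
look at the plan:

   SET work_mem = '4MB';
   EXPLAIN (ANALYZE, BUFFERS)
   SELECT n FROM generate_series(1, 5000000) AS g(n) ORDER BY n DESC;
   -- The plan should report "Sort Method: external merge  Disk: ...", i.e. the
   -- sort spilled to temporary files rather than growing past work_mem.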

--
Scott Ribe
https://www.linkedin.com/in/scottribe/
(303) 722-0567



Re: Abnormally high memory usage/OOM triggered

From
Tom Lane
Date:
Davlet Panech <dpanech@gmail.com> writes:
> On 1/17/2018 5:57 PM, scott ribe wrote:
>> It does seem awfully high, but... An update can involve a join across multiple tables. Or an update can run a
>> trigger which can cascade. Either of those could result in an "accidental cross product" join, which can always blow up
>> memory.


> There must be a way to put an upper limit on memory even for such cases.
> I was under the impression that parameters such as "work_mem" serve that
> purpose, is that not the case? So an "accidental cross product" join's
> memory usage is unbounded? It can't be... could somebody confirm this
> please?

A large join result could blow out memory on the client side, unless the
client is careful to read it in segments, which most clients aren't.
I expect the server to be smarter, though.
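
One way for a client to read such a result in segments (a sketch with
hypothetical tables t1 and t2) is a cursor fetched in batches:

   BEGIN;
   DECLARE big_cur CURSOR FOR
       SELECT * FROM t1 CROSS JOIN t2;   -- hypothetical tables
   FETCH 10000 FROM big_cur;             -- repeat until no rows come back
   CLOSE big_cur;
   COMMIT;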

            regards, tom lane


Re: Abnormally high memory usage/OOM triggered

From
Keith
Date:


On Thu, Jan 18, 2018 at 12:13 PM, Davlet Panech <dpanech@gmail.com> wrote:
> On 1/17/2018 5:57 PM, scott ribe wrote:
>> On Jan 17, 2018, at 2:57 PM, Davlet Panech <dpanech@gmail.com> wrote:
>>>
>>> Does my configuration look reasonable? I just don't understand how it could possibly use up 19 GB of memory based on the configuration below. Is there a memory leak in there somewhere?
>>
>> It does seem awfully high, but... An update can involve a join across multiple tables. Or an update can run a trigger which can cascade. Either of those could result in an "accidental cross product" join, which can always blow up memory.
>
> There must be a way to put an upper limit on memory even for such cases. I was under the impression that parameters such as "work_mem" serve that purpose, is that not the case? So an "accidental cross product" join's memory usage is unbounded? It can't be... could somebody confirm this please?
>
> Thanks,
> D.


work_mem isn't really an upper limit on overall memory usage. It's just an upper limit on how much certain operations (sorts, hashes) may use before spilling to disk. A sufficiently complex query, or a group of queries running at once, can use many instances of work_mem and easily exhaust system memory. This is why work_mem shouldn't be set any higher than necessary. The wiki explains this better:

https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

"This size is applied to each and every sort done by each user, and complex queries can use multiple working memory sort buffers. Set it to 50MB, and have 30 users submitting queries, and you are soon using 1.5GB of real memory."

I would go with Tom's suggestion in this case, though, since the bug fixed by the commit he found seems to fit the situation described here. It's always important to be running the latest patch release so that a known bug can be ruled out as the cause of an issue.
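
A quick check of the running minor release (the fix Tom cites first shipped
in 9.4.10) is simply:

   SELECT version();
   -- e.g. "PostgreSQL 9.4.8 on x86_64-redhat-linux-gnu ..." would predate the fix.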

Keith

Re: Abnormally high memory usage/OOM triggered

From
Davlet Panech
Date:
On 1/18/2018 12:45 PM, Keith wrote:
> 
> 
> On Thu, Jan 18, 2018 at 12:13 PM, Davlet Panech <dpanech@gmail.com 
> <mailto:dpanech@gmail.com>> wrote:
> 
>     On 1/17/2018 5:57 PM, scott ribe wrote:
> 
>         On Jan 17, 2018, at 2:57 PM, Davlet Panech <dpanech@gmail.com
>         <mailto:dpanech@gmail.com>> wrote:
> 
> 
>             Does my configuration look reasonable? I just don't
>             understand how it could possibly use up 19 GB of memory
>             based on the configuration below. Is there a memory leak in
>             there somewhere?
> 
> 
>         It does seem awfully high, but... An update can involve a join
>         across multiple tables. Or an update can run a trigger which can
>         cascade. Either of those could result in an "accidental cross
>         product" join, which can always blow up memory.
> 
>     There must be a way to put an upper limit on memory even for such
>     cases. I was under the impression that parameters such as "work_mem"
>     serve that purpose, is that not the case? So an "accidental cross
>     product" join's memory usage is unbounded? It can't be... could
>     somebody confirm this please?
> 
>     Thanks,
>     D.
> 
> 
> work_mem isn't really an upper limit on overall memory usage. It's just 
> an upper limit on how much is used in certain processes before spilling 
> to disk. A query or group of queries can easily use up all of system 
> memory if it's complex enough by using multiple instances of work_mem. 
> This is why work_mem shouldn't be set any higher than necessary. The 
> wiki explains this better
> 
> https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
> 
> "This size is applied to each and every sort done by each user, and 
> complex queries can use multiple working memory sort buffers. Set it to 
> 50MB, and have 30 users submitting queries, and you are soon using 1.5GB 
> of real memory. "

I understand, but in my case a single server-side postgres process used
19 GB, which (excluding shared memory, etc.) is something like 100 times
what I would expect, even for "complex" queries.

> 
> I would go with Tom's suggestion in this case, though, since that bug 
> seems to fit the situation described by the patch he found. It's always 
> important to be running the latest patch release to rule out a bug being 
> the cause of an issue.

OK, so it is likely a memory leak; I just wanted to rule out other 
explanations.

Thanks to all who replied.