Discussion: OOM Killing on Docker while ANALYZE running


OOM Killing on Docker while ANALYZE running

From: Jorge Daniel
Date:

Hi guys, I'm dealing with OOM killing on PostgreSQL 9.4.8 running on Docker.



Docker: version 17.05.0

"resources_template = Resources(cpu = 1,ram = 1 * GB, disk = 1 * GB)"


Using Ceph volumes as storage.
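
For reference, the kernel log further down shows the container's cgroup capped at 1 GiB of memory and 2 GiB of memory+swap. Assuming the scheduler template above maps onto the usual Docker memory flags, a plain docker run that reproduces the same limits would look roughly like this (the image name is only a placeholder):

$ docker run -d --name pg-oom-test \
      --memory=1g --memory-swap=2g \
      postgres:9.4    # placeholder image; any 9.4.x build behaves the same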



------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 5079992
max total shared memory (kbytes) = 5079992
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 32000
max semaphores per array = 32000
max semaphores system wide = 1024000000
max ops per semop call = 500
semaphore max value = 32767

------ Messages Limits --------
max queues system wide = 32000
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384


postgres@d90b66757bb1:~$ ipcs -a

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 0          postgres   600        56         5
0x6b8b4567 32769      postgres   600        2028       5

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x0052e2c1 0          postgres   600        17
0x0052e2c2 32769      postgres   600        17
0x0052e2c3 65538      postgres   600        17
0x0052e2c4 98307      postgres   600        17
0x0052e2c5 131076     postgres   600        17
0x0052e2c6 163845     postgres   600        17
0x0052e2c7 196614     postgres   600        17

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages


The Postgres installation is:
 
PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit


From postgresql.conf :

shared_buffers = 500MB                    # min 128kB
temp_buffers = 80MB                     # min 800kB
work_mem = 80MB                         # min 64kB
maintenance_work_mem = 80MB              # min 1MB
autovacuum_work_mem = 300MB             # min 1MB, or -1 to use maintenance_work_mem
effective_cache_size = 500MB
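
For completeness, the memory-related knobs (plus the statistics target that turns out to matter later in this thread) can be pulled in one query; this is just a convenience sketch, not part of the original report:

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'temp_buffers', 'work_mem',
               'maintenance_work_mem', 'autovacuum_work_mem',
               'effective_cache_size', 'default_statistics_target')
ORDER BY name;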





While running 

 /usr/bin/pgbench --initialize --fillfactor=100 --no-vacuum --scale=1000 pgbench
.........
100000000 of 100000000 tuples (100%) done (elapsed 189.29 s, remaining 0.00 s).
set primary keys...
done.

2018-01-25 10:19:19.652 PST rhost=[local] app=psql:user=postgres:db=postgres:LOG:  disconnection: session time: 0:00:14.824 user=postgres database=postgres host=[local]
2018-01-25 10:19:33.371 PST rhost=[local] app=[unknown]:user=[unknown]:db=[unknown]:LOG:  connection received: host=[local]
2018-01-25 10:19:33.371 PST rhost=[local] app=[unknown]:user=postgres:db=pgbench:LOG:  connection authorized: user=postgres database=pgbench
2018-01-25 10:24:55.640 PST rhost=[local] app=pgbench:user=postgres:db=pgbench:LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp812.0", size 1073741824
2018-01-25 10:24:55.640 PST rhost=[local] app=pgbench:user=postgres:db=pgbench:STATEMENT:  alter table pgbench_accounts add primary key (aid)
2018-01-25 10:24:55.677 PST rhost=[local] app=pgbench:user=postgres:db=pgbench:LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp812.1", size 928219136
2018-01-25 10:24:55.677 PST rhost=[local] app=pgbench:user=postgres:db=pgbench:STATEMENT:  alter table pgbench_accounts add primary key (aid)
2018-01-25 10:24:55.752 PST rhost=[local] app=pgbench:user=postgres:db=pgbench:LOG:  disconnection: session time: 0:05:22.381 user=postgres database=pgbench host=[local]
2018-01-25 10:26:29.937 PST rhost= app=:user=:db=:LOG:  server process (PID 2709) was terminated by signal 9: Killed
2018-01-25 10:26:29.937 PST rhost= app=:user=:db=:DETAIL:  Failed process was running: autovacuum: ANALYZE public.pgbench_accounts
2018-01-25 10:26:29.937 PST rhost= app=:user=:db=:LOG:  terminating any other active server processes
2018-01-25 10:26:29.937 PST rhost= app=:user=:db=:WARNING:  terminating connection because of crash of another server process
2018-01-25 10:26:29.937 PST rhost= app=:user=:db=:DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-01-25 10:26:29.937 PST rhost= app=:user=:db=:HINT:  In a moment you should be able to reconnect to the database and repeat your command.


pgbench=# select count(*) from pgbench_accounts ;
   count
-----------
 100000000
(1 row)

pgbench=# SELECT pg_size_pretty(pg_total_relation_size('pgbench_accounts'));
 pg_size_pretty
----------------
 15 GB
(1 row)


The kern.log shows: 

Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818867] watch invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=0-1, order=0, oom_score_adj=0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818868] watch cpuset=d90b66757bb18658502690ae68d4ea365f922ed40855c15b5eb5d074be38c223 mems_allowed=0-1
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818873] CPU: 29 PID: 1911120 Comm: watch Tainted: G           OE   4.10.0-42-generic #46~16.04.1-Ubuntu
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818874] Hardware name: Supermicro SYS-F628R3-RTB+-SS30C/X10DRFR, BIOS 2.0c 04/13/2017
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818875] Call Trace:
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818881]  dump_stack+0x63/0x90
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818886]  dump_header+0x7b/0x1fd
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818891]  ? mem_cgroup_scan_tasks+0xc4/0xf0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818895]  oom_kill_process+0x219/0x3e0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818897]  out_of_memory+0x120/0x4b0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818898]  mem_cgroup_out_of_memory+0x4b/0x80
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818900]  mem_cgroup_oom_synchronize+0x325/0x340
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818901]  ? memory_high_write+0xe0/0xe0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818902]  pagefault_out_of_memory+0x36/0x80
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818907]  mm_fault_error+0x8f/0x190
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818909]  __do_page_fault+0x4b2/0x4e0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818910]  do_page_fault+0x22/0x30
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818914]  page_fault+0x28/0x30
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818916] RIP: 0033:0x7fedbf439830
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818916] RSP: 002b:00007ffeb543c288 EFLAGS: 00010202
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818917] RAX: 00007fedbf7c1de0 RBX: 00007fedbf7c1de0 RCX: 00007fedbf4bef20
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818918] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00007fedbf57c209
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818918] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818919] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000001
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818919] R13: ffffffffffffffff R14: 00000000ffffffff R15: 00000000ffffffff
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818920] Task in /docker/d90b66757bb18658502690ae68d4ea365f922ed40855c15b5eb5d074be38c223 killed as a result of limit of /docker/d90b66757bb18658502690ae68d4ea365f922ed40855c15b5eb5d074be38c223
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818924] memory: usage 1048576kB, limit 1048576kB, failcnt 7864663
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818924] memory+swap: usage 1048576kB, limit 2097152kB, failcnt 0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818925] kmem: usage 42116kB, limit 9007199254740988kB, failcnt 0
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818925] Memory cgroup stats for /docker/d90b66757bb18658502690ae68d4ea365f922ed40855c15b5eb5d074be38c223: cache:2344KB rss:1004116KB rss_huge:0KB mapped_file:256KB dirty:4KB writeback:0KB swap:0KB inactive_anon:518092KB active_anon:486012KB inactive_file:1436KB active_file:732KB unevictable:0KB
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.818940] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819358] [1877175]     0 1877175     4899       72      13       3        0             0 docker-entrypoi
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819361] [1877277]     0 1877277    12050       93      27       3        0             0 su
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819362] [1877279]    26 1877279   164354      335      51       4        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819363] [1877284]    26 1877284   164598      473      51       4        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819364] [1877285]    26 1877285   164417      335      49       4        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819366] [1877286]    26 1877286   164417      335      49       4        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819367] [1877287]    26 1877287   164517      388      50       4        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819368] [1877288]    26 1877288    26624      335      49       4        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819369] [1877289]    26 1877289    26659      351      50       4        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819385] [1878251]     0 1878251     4947      126      13       3        0             0 bash
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819388] [1911120]     0 1911120     3519      459      13       3        0             0 watch
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819404] [1943707]     0 1943707     4946      125      15       3        0             0 bash
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819411] [1946711]     0 1946711    12050       96      29       3        0             0 su
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819412] [1946712]    26 1946712     4545      122      14       3        0             0 bash
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819435] [2015235]    26 2015235   422880   249393     540       5        0             0 postgres
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.819443] Memory cgroup out of memory: Kill process 2015235 (postgres) score 953 or sacrifice child
Jan 25 10:26:29 lon-r10-u17 kernel: [1737264.828834] Killed process 2015235 (postgres) total-vm:1691520kB, anon-rss:997572kB, file-rss:0kB, shmem-rss:0kB
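
The "memory: usage 1048576kB, limit 1048576kB" line above is the container's cgroup hitting its hard cap. A quick way to confirm those limits from inside the container (cgroup v1 paths, as used by this kernel; the memsw file only exists when swap accounting is enabled):

$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes        # hard limit (1 GiB here)
$ cat /sys/fs/cgroup/memory/memory.memsw.limit_in_bytes  # memory+swap limit (2 GiB here)
$ cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes    # high-water mark since the cgroup was created
$ cat /sys/fs/cgroup/memory/memory.failcnt               # how many times the limit was hit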

I replayed the scenario again, this time monitoring the memory used inside the container, and it really shows a jump when I fire the ANALYZE:

pgbench=# analyze verbose pgbench_accounts;
INFO:  analyzing "public.pgbench_accounts"
INFO:  "pgbench_accounts": scanned 1639345 of 1639345 pages, containing 100000000 live rows and 0 dead rows; 3000000 rows in sample, 100000000 estimated total rows
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Meanwhile:
$ while true; do cat /sys/fs/cgroup/memory/memory.usage_in_bytes; sleep 1; done

82620416
82821120
82780160
83017728
82903040
82993152
83054592
82894848
82964480
83017728
675348480  ------>>ANALYZE 
1073606656
1073700864
1073684480
1073479680
1073704960
1073598464
1073713152
1073590272
1073733632
1073610752
1073467392
1073729536
1073717248
1073582080
1073594368
1073598464
1073741824
1073700864
1073623040
1073700864
1073610752
1073487872
1073496064
1073565696
1073717248
1073238016
1073717248
1073717248
1073647616
1073676288
1073524736
1073692672
1073717248
1073631232
1073696768
1073496064
1073627136
1073627136
1073635328
1073561600
1073545216
710078464 ------> KILL
85487616
85274624
85237760

It's clear that the whole container memory was in fact filled up by something happening during the ANALYZE, but if work_mem is set, why is it breaking out of its memory limits? Has anyone seen this before?

The actual case is happening in my prod clusters, with large tables.
I replicated the issue with pgbench to have a simpler scenario.

Any thoughts?



Re: OOM Killing on Docker while ANALYZE running

From: Alvaro Herrera
Date:
Jorge Daniel wrote:
> Hi guys , I'm dealing with OOM killing on Postgresql 9.4.8 running on docker ,

What is your statistics target?  analyze is supposed to acquire samples
of the data, not the whole table ...

> pgbench=# analyze verbose pgbench_accounts;
> INFO:  analyzing "public.pgbench_accounts"
> INFO:  "pgbench_accounts": scanned 1639345 of 1639345 pages, containing 100000000 live rows and 0 dead rows; 3000000 rows in sample, 100000000 estimated total rows
 

Here it seems to be saying that it reads all 1.6 million pages ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: OOM Killing on Docker while ANALYZE running

From: Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Jorge Daniel wrote:
>> Hi guys , I'm dealing with OOM killing on Postgresql 9.4.8 running on docker ,

> What is your statistics target?  analyze is supposed to acquire samples
> of the data, not the whole table ...

>> pgbench=# analyze verbose pgbench_accounts;
>> INFO:  analyzing "public.pgbench_accounts"
>> INFO:  "pgbench_accounts": scanned 1639345 of 1639345 pages, containing 100000000 live rows and 0 dead rows; 3000000 rows in sample, 100000000 estimated total rows

The standard setting of default_statistics_target would only result in
trying to acquire a 30000-row sample, so the problem here is having
cranked that up beyond what the machine can sustain.  AFAIR we do not
consider maintenance_work_mem as imposing a limit on the sample size.

            regards, tom lane
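
To put numbers on that: ANALYZE collects 300 rows per unit of statistics target, so a target of 10000 means a 3,000,000-row sample held in memory at once (exactly the "3000000 rows in sample" reported above). With pgbench_accounts rows around 100 bytes on disk, plus the per-tuple copies and bookkeeping ANALYZE keeps for the sample, a back-of-the-envelope estimate already lands in the same ballpark as the 1 GiB container limit. The ~250 bytes/row figure below is an assumption for illustration, not a measurement:

SELECT 300 * 10000                                  AS sample_rows,
       pg_size_pretty((300 * 10000)::bigint * 250)  AS rough_sample_memory;  -- about 715 MB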


Re: OOM Killing on Docker while ANALYZE running

From: Jorge Daniel
Date:

Hi Alvaro:


These are our values:


default_statistics_target
---------------------------
 10000

 
 relpages | reltuples
----------+-----------
  1639345 |     1e+08


                      Table "public.pgbench_accounts"
  Column  |     Type      | Modifiers | Storage  | Stats target | Description
----------+---------------+-----------+----------+--------------+-------------
 aid      | integer       | not null  | plain    |              |
 bid      | integer       |           | plain    |              |
 abalance | integer       |           | plain    |              |
 filler   | character(84) |           | extended |              |
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Options: fillfactor=100
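
The empty "Stats target" column in the \d+ output above means no per-column override is set, so every column follows default_statistics_target. The same information can be read straight from the catalog (a quick sketch; attstattarget = -1 means "use the default"):

SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'pgbench_accounts'::regclass
  AND attnum > 0
  AND NOT attisdropped;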



I guess you hit the nail on the head:


pgbench=# set default_statistics_target =10;
SET
pgbench=# analyze verbose pgbench_accounts ;
INFO:  analyzing "public.pgbench_accounts"
INFO:  "pgbench_accounts": scanned 3000 of 1639345 pages, containing 183000 live rows and 0 dead rows; 3000 rows in sample, 100000000 estimated total rows
ANALYZE
pgbench=# set default_statistics_target =100;
SET
pgbench=# analyze verbose pgbench_accounts ;
INFO:  analyzing "public.pgbench_accounts"
INFO:  "pgbench_accounts": scanned 30000 of 1639345 pages, containing 1830000 live rows and 0 dead rows; 30000 rows in sample, 100000001 estimated total rows
ANALYZE
pgbench=# set default_statistics_target =1000;
SET
pgbench=# analyze verbose pgbench_accounts ;
INFO:  analyzing "public.pgbench_accounts"
INFO:  "pgbench_accounts": scanned 300000 of 1639345 pages, containing 18300000 live rows and 0 dead rows; 300000 rows in sample, 100000008 estimated total rows
ANALYZE
pgbench=# set default_statistics_target =10000;
SET
pgbench=# analyze verbose pgbench_accounts ;
INFO:  analyzing "public.pgbench_accounts"
INFO:  "pgbench_accounts": scanned 1639345 of 1639345 pages, containing 100000000 live rows and 0 dead rows; 3000000 rows in sample, 100000000 estimated total rows
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Kind regards 


Jorge Daniel Fernandez 



From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Sent: Thursday, January 25, 2018 4:47 PM
To: Jorge Daniel
Cc: pgsql-admin@lists.postgresql.org; pgsql-admin@postgresql.org; dyuryeva@medallia.com
Subject: Re: OOM Killing on Docker while ANALYZE running
 
Jorge Daniel wrote:
> Hi guys , I'm dealing with OOM killing on Postgresql 9.4.8 running on docker ,

What is your statistics target?  analyze is supposed to acquire samples
of the data, not the whole table ...

> pgbench=# analyze verbose pgbench_accounts;
> INFO:  analyzing "public.pgbench_accounts"
> INFO:  "pgbench_accounts": scanned 1639345 of 1639345 pages, containing 100000000 live rows and 0 dead rows; 3000000 rows in sample, 100000000 estimated total rows

Here it seems to be saying that it reads all 1.6 million pages ...

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: OOM Killing on Docker while ANALYZE running

From: Ray Stell
Date:
On 1/25/18 3:48 PM, Tom Lane wrote:

> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> Jorge Daniel wrote:
>>> Hi guys , I'm dealing with OOM killing on Postgresql 9.4.8 running on docker

Mike Coleman's talk at DockerCon17 is here:
http://training.play-with-docker.com/ops-stage1/
At about the 38-minute mark, on the topic of transitioning enterprise apps to Docker:
"define your guardrails, leave the db out in the VM." Mike is a Docker evangelist.



Re: OOM Killing on Docker while ANALYZE running

From: Mark Kirkwood
Date:
On 26/01/18 15:51, Ray Stell wrote:

> On 1/25/18 3:48 PM, Tom Lane wrote:
>
>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>>> Jorge Daniel wrote:
>>>> Hi guys , I'm dealing with OOM killing on Postgresql 9.4.8 running 
>>>> on docker
>
> Mike Coleman's talk at DockerCon17 is here:
> http://training.play-with-docker.com/ops-stage1/
> At about the 38 minute mark, on the topic of transitioning enterprise 
> apps to docker:
> "define your guardrails,  leave the db out in the VM."  Mike is a 
> docker evangelist.
>
>

Hmm - I don't think Docker per se is the problem here - that default_statistics_target of 10000 is the more likely culprit. A minimal allowance of resources (e.g. memory in this case) combined with hugely over-spec'ed memory use (the default_statistics_target setting) is going to trigger OOM in any environment.

regards

Mark
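
A sketch of the obvious mitigation, assuming the global target of 10000 is not really needed on every column: drop the cluster-wide default back to something modest and raise it only where the planner genuinely needs finer statistics (the column below is just the pgbench example from this thread):

-- cluster-wide: back to the shipped default
ALTER SYSTEM SET default_statistics_target = 100;
SELECT pg_reload_conf();

-- per column, only where it is actually needed
ALTER TABLE pgbench_accounts ALTER COLUMN aid SET STATISTICS 1000;
ANALYZE VERBOSE pgbench_accounts;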