Thread: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine

ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine

From

Montana Low

Date:

October 22, 2014, 01:25:36

I'm running postgres-9.3 on a 30GB ec2 xen instance w/ linux kernel 3.16.3. I receive numerous Error: out of memory messages in the log, which are aborting client requests, even though there appears to be 23GB available in the OS cache.

There is no swap on the box. Postgres sits behind pgbouncer, which shields it from the 200 real clients by limiting connections to 32; there are rarely more than 20 active connections. Postgres max_connections is still set very high for historic reasons. There is also a 4GB java process running on the box.

relevant postgresql.conf:

max_connections = 1000 # (change requires restart)
shared_buffers = 7GB # min 128kB
work_mem = 40MB # min 64kB
maintenance_work_mem = 1GB # min 1MB
effective_cache_size = 20GB

sysctl.conf:

vm.swappiness = 0
vm.overcommit_memory = 2
kernel.shmmax=34359738368
kernel.shmall=8388608

log example:

ERROR: out of memory
DETAIL: Failed on request of size 67108864.
STATEMENT: SELECT "package_texts".* FROM "package_texts" WHERE "package_texts"."id" = $1 LIMIT 1

example pg_top, showing 23GB available in cache:

last pid: 6607; load avg: 3.59, 2.32, 2.61; up 16+09:17:29 20:49:51
18 processes: 1 running, 17 sleeping
CPU states: 22.5% user, 0.0% nice, 4.9% system, 63.2% idle, 9.4% iowait
Memory: 29G used, 186M free, 7648K buffers, 23G cached
DB activity: 2479 tps, 1 rollbs/s, 217 buffer r/s, 99 hit%, 11994 row r/s, 3820 row w/s
DB I/O: 0 reads/s, 0 KB/s, 0 writes/s, 0 KB/s
DB disk: 149.8 GB total, 46.7 GB free (68% used)
Swap:

example top showing the only other significant 4GB process on the box:

top - 21:05:09 up 16 days, 9:32, 2 users, load average: 2.73, 2.91, 2.88
Tasks: 147 total, 3 running, 244 sleeping, 0 stopped, 0 zombie
%Cpu(s): 22.1 us, 4.1 sy, 0.0 ni, 62.9 id, 9.8 wa, 0.0 hi, 0.7 si, 0.3 st
KiB Mem: 30827220 total, 30642584 used, 184636 free, 7292 buffers
KiB Swap: 0 total, 0 used, 0 free. 23449636 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7407 postgres 20 0 7604928 10172 7932 S 29.6 0.0 2:51.27 postgres
10469 postgres 20 0 7617716 176032 160328 R 11.6 0.6 0:01.48 postgres
10211 postgres 20 0 7630352 237736 208704 S 10.6 0.8 0:03.64 postgres
18202 elastic+ 20 0 8726984 4.223g 4248 S 9.6 14.4 883:06.79 java
9711 postgres 20 0 7619500 354188 335856 S 7.0 1.1 0:08.03 postgres
3638 postgres 20 0 7634552 1.162g 1.127g S 6.6 4.0 0:50.42 postgres

Re: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine

From

Tom Lane

Date:

October 22, 2014, 01:35:15

Montana Low <montanalow@gmail.com> writes:
> I'm running postgres-9.3 on a 30GB ec2 xen instance w/ linux kernel 3.16.3.
> I receive numerous Error: out of memory messages in the log, which are
> aborting client requests, even though there appears to be 23GB available in
> the OS cache.

Perhaps the postmaster is being started with a ulimit setting that
restricts process size?

            regards, tom lane

Re: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine

From

Tomas Vondra

Date:

October 22, 2014, 01:46:11

On 22 October 2014, 0:25, Montana Low wrote:
> I'm running postgres-9.3 on a 30GB ec2 xen instance w/ linux kernel 3.16.3.
> I receive numerous Error: out of memory messages in the log, which are
> aborting client requests, even though there appears to be 23GB available in
> the OS cache.
>
> There is no swap on the box. Postgres is behind pgbouncer to protect from
> the 200 real clients, which limits connections to 32, although there are
> rarely more than 20 active connections, even though postgres
> max_connections is set very high for historic reasons. There is also a 4GB
> java process running on the box.
>
>
>
>
> relevant postgresql.conf:
>
> max_connections = 1000                  # (change requires restart)
> shared_buffers = 7GB                    # min 128kB
> work_mem = 40MB                         # min 64kB
> maintenance_work_mem = 1GB              # min 1MB
> effective_cache_size = 20GB
>
>
>
> sysctl.conf:
>
> vm.swappiness = 0
> vm.overcommit_memory = 2

This means you have 'no overcommit', so the amount of memory that can be
committed is limited to swap plus overcommit_ratio percent of RAM. The
default value for overcommit_ratio is 50, and as you have no swap that
effectively means only 50% of the RAM is available to the system.

If you want to verify this, check /proc/meminfo - see the lines
CommitLimit (the current limit) and Committed_AS (committed address space).
Once Committed_AS reaches the limit, it's game over.
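
As a rough sketch of the arithmetic behind CommitLimit with vm.overcommit_memory = 2: the kernel takes overcommit_ratio percent of RAM (minus any huge pages) and adds swap. The function name and the 4 kB page size below are assumptions made for illustration; the MemTotal figure is the one reported elsewhere in this thread.

```python
PAGE_KB = 4  # assumed page size in kB (the usual x86-64 default)


def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio, hugetlb_kb=0):
    """Approximate CommitLimit (in kB) for vm.overcommit_memory = 2.

    The calculation is done in pages with integer division so that the
    rounding resembles the kernel's. It ignores vm.overcommit_kbytes,
    which takes precedence over the ratio when it is non-zero.
    """
    ram_pages = (mem_total_kb - hugetlb_kb) // PAGE_KB
    swap_pages = swap_total_kb // PAGE_KB
    return (ram_pages * overcommit_ratio // 100 + swap_pages) * PAGE_KB


if __name__ == "__main__":
    # MemTotal from this thread, no swap, three candidate ratios.
    for ratio in (50, 90, 95):
        print(ratio, commit_limit_kb(30827220, 0, ratio), "kB")
```

With a ratio of 50 and no swap this yields 15413608 kB, which agrees with the CommitLimit line in the /proc/meminfo output posted in this thread.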

There are different ways to fix this, or at least improve that:

(1) increasing the overcommit_ratio (clearly, 50% is way too low -
something like 90% might be more appropriate on 30GB RAM without swap)

(2) adding swap (say a small ephemeral drive, with swappiness=10 or
something like that)
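
To get a feel for how options (1) and (2) trade off, here is a small sketch that works backwards from a desired commit limit to the overcommit_ratio that would roughly provide it. The helper name and the 24 GB target are hypothetical, chosen only for illustration; page-level rounding in the kernel can shift the real limit by a few kB.

```python
def ratio_for_target(mem_total_kb, swap_total_kb, target_commit_kb):
    """Smallest integer overcommit_ratio whose limit roughly covers the target.

    Swap counts fully towards the limit, so only the remainder has to
    come from the RAM percentage.
    """
    needed_from_ram_kb = max(target_commit_kb - swap_total_kb, 0)
    # Ceiling division using integers only.
    return -(-needed_from_ram_kb * 100 // mem_total_kb)


if __name__ == "__main__":
    mem_total_kb = 30827220          # MemTotal from this thread
    target_kb = 24 * 1024 * 1024     # hypothetical 24 GB commit target
    print("no swap:  ", ratio_for_target(mem_total_kb, 0, target_kb))
    print("2 GB swap:", ratio_for_target(mem_total_kb, 2 * 1024 * 1024, target_kb))
```

A result above 100 would mean the target exceeds physical RAM plus swap, in which case adding swap or lowering the target is the safer route.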

Tomas

Re: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine

From

Montana Low

Date:

October 22, 2014, 01:55:47

I didn't realize that about overcommit_ratio. It was at 50, I've changed it to 95. I'll see if that clears up the problem moving forward.

# cat /proc/meminfo
MemTotal: 30827220 kB
MemFree: 153524 kB
MemAvailable: 17941864 kB
Buffers: 6188 kB
Cached: 24560208 kB
SwapCached: 0 kB
Active: 20971256 kB
Inactive: 8538660 kB
Active(anon): 12460680 kB
Inactive(anon): 36612 kB
Active(file): 8510576 kB
Inactive(file): 8502048 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 50088 kB
Writeback: 160 kB
AnonPages: 4943740 kB
Mapped: 7571496 kB
Shmem: 7553176 kB
Slab: 886428 kB
SReclaimable: 858936 kB
SUnreclaim: 27492 kB
KernelStack: 4208 kB
PageTables: 188352 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 15413608 kB
Committed_AS: 14690544 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 59012 kB
VmallocChunk: 34359642367 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 31465472 kB
DirectMap2M: 0 kB
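
The two lines that matter in the output above are CommitLimit and Committed_AS. A minimal sketch of how the remaining headroom could be read out of such a dump follows; the function names are made up for this example, and the sample text is a four-line excerpt of the values above.

```python
SAMPLE = """\
MemTotal: 30827220 kB
SwapTotal: 0 kB
CommitLimit: 15413608 kB
Committed_AS: 14690544 kB
"""


def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])  # first field is the number
    return values


def commit_headroom_kb(info):
    """kB that can still be committed before allocations start to fail."""
    return info["CommitLimit"] - info["Committed_AS"]


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(commit_headroom_kb(info), "kB of commit headroom")
```

For this snapshot the headroom is 723064 kB, roughly 706 MB. On a live box the same functions can be fed the contents of /proc/meminfo instead of the sample string. The failed request in the log was 67108864 bytes (64 MB), but it was logged at a different moment, so the two numbers cannot be compared directly.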

# sysctl -a:

vm.admin_reserve_kbytes = 8192
vm.block_dump = 0
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.drop_caches = 0
vm.extfrag_threshold = 500
vm.hugepages_treat_as_movable = 0
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256 256 32
vm.max_map_count = 65530
vm.min_free_kbytes = 22207
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 4096
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
vm.nr_pdflush_threads = 0
vm.numa_zonelist_order = default
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 2
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.scan_unevictable_pages = 0
vm.stat_interval = 1
vm.swappiness = 0
vm.user_reserve_kbytes = 131072
vm.vfs_cache_pressure = 100
vm.zone_reclaim_mode = 0

Re: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine

From

Montana Low

Date:

October 22, 2014, 09:24:23

Increasing overcommit_ratio to 95 solved the problem; the box is now using its memory as expected without needing to resort to swap.
