Re: Properly handle OOM death?

From	Adrian Klaver
Subject	Re: Properly handle OOM death?
Date	March 13, 2023, 17:28:12
Msg-id	17d66a4c-4b05-106e-b2c7-e2babbd31de7@aklaver.com
In reply to	Properly handle OOM death? (Israel Brewster <ijbrewster@alaska.edu>)
Responses	Re: Properly handle OOM death?
List	pgsql-general

On 3/13/23 10:21 AM, Israel Brewster wrote:
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit 
> more memory constrained than I would like, such that every week or so 
> the various processes running on the machine will align badly and the 
> OOM killer will kick in, killing off postgresql, as per the following 
> journalctl output:
> 
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process of this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d 17h 48min 24.509s CPU time.
> 
> And the service is no longer running.
> 
> When this happens, I go in and restart the postgresql service, and 
> everything is happy again for the next week or two.
> 
> Obviously this is not a good situation. Which leads to two questions:
> 
> 1) is there some tweaking I can do in the postgresql config itself to 
> prevent the situation from occurring in the first place?
> 2) My first thought was to simply have systemd restart postgresql 
> whenever it is killed like this, which is easy enough. Then I looked at 
> the default unit file, and found these lines:
> 
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here. Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure
> 
> Which seems to imply that the OOM killer should only be killing off 
> individual backends, not the entire cluster to begin with - which should 
> be fine. And also that adding the Restart=on-failure option is probably 
> not the greatest idea. Which makes me wonder what is really going on?

You might want to read:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
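
For context, here is a minimal sketch of the kernel setting that section discusses, assuming you are able to change sysctl values on the VM. The documentation suggests selecting strict overcommit mode; the file name below is hypothetical, and whether the setting suits this machine depends on what else is running on it.

```
# /etc/sysctl.d/99-postgresql-overcommit.conf  (hypothetical file name)
# Strict overcommit mode, as suggested in the linked documentation section.
vm.overcommit_memory = 2
```

The same section notes that this lowers the chance of the OOM killer being invoked rather than preventing it altogether, and it mentions vm.overcommit_ratio as a related setting you may also want to adjust.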

> 
> Thanks.
> 
> ---
> Israel Brewster
> Software Engineer
> Alaska Volcano Observatory
> Geophysical Institute - UAF
> 2156 Koyukuk Drive
> Fairbanks AK 99775-7320
> Work: 907-474-5172
> cell:  907-328-9145
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com
