Re: Properly handle OOM death?
From | Adrian Klaver |
---|---|
Subject | Re: Properly handle OOM death? |
Date | |
Msg-id | 17d66a4c-4b05-106e-b2c7-e2babbd31de7@aklaver.com |
In reply to | Properly handle OOM death? (Israel Brewster <ijbrewster@alaska.edu>) |
Responses | Re: Properly handle OOM death? (Israel Brewster <ijbrewster@alaska.edu>) |
List | pgsql-general |
On 3/13/23 10:21 AM, Israel Brewster wrote:
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit
> more memory constrained than I would like, such that every week or so
> the various processes running on the machine will align badly and the
> OOM killer will kick in, killing off postgresql, as per the following
> journalctl output:
>
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process of this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d 17h 48min 24.509s CPU time.
>
> And the service is no longer running.
>
> When this happens, I go in and restart the postgresql service, and
> everything is happy again for the next week or two.
>
> Obviously this is not a good situation. Which leads to two questions:
>
> 1) is there some tweaking I can do in the postgresql config itself to
> prevent the situation from occurring in the first place?
> 2) My first thought was to simply have systemd restart postgresql
> whenever it is killed like this, which is easy enough. Then I looked at
> the default unit file, and found these lines:
>
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here. Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure
>
> Which seems to imply that the OOM killer should only be killing off
> individual backends, not the entire cluster to begin with - which should
> be fine. And also that adding the restart=on-failure option is probably
> not the greatest idea. Which makes me wonder what is really going on?
You might want to read:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

>
> Thanks.
>
> ---
> Israel Brewster
> Software Engineer
> Alaska Volcano Observatory
> Geophysical Institute - UAF
> 2156 Koyukuk Drive
> Fairbanks AK 99775-7320
> Work: 907-474-5172
> cell: 907-328-9145
>

--
Adrian Klaver
adrian.klaver@aklaver.com
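
Editorial sketch, not part of the original message: the documentation section linked above discusses the Linux kernel's `vm.overcommit_memory` setting and describes switching it to `2` so that the kernel does not overcommit memory. As a first diagnostic step, the current mode on the database host could be checked with something like the following. The helper names `read_overcommit_mode` and `describe_overcommit` are hypothetical, and the `/proc` path assumes a Linux host.

```python
from pathlib import Path

# Meanings of vm.overcommit_memory, per the Linux kernel's
# overcommit accounting documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting (no overcommit)",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the integer overcommit mode stored at `path`.

    The default path is the standard procfs location on Linux; the
    parameter exists so the function can be pointed at another file.
    """
    return int(Path(path).read_text().strip())


def describe_overcommit(mode):
    """Map a mode number to a short description (hypothetical helper)."""
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")
```

Calling `describe_overcommit(read_overcommit_mode())` on the VM reports which mode is in effect. Whether changing it is appropriate depends on what else runs on the machine, since the setting applies to every process, not only to PostgreSQL.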