SIGDANGER and oomd

From Thomas Munro
Subject SIGDANGER and oomd
Date
Msg-id CAEepm=2GXoCL3qGw=3dhNLSFghmXOAwzyPx9fDbjF7taACS9nA@mail.gmail.com
List pgsql-hackers
Hello hackers,

On AIX, which I think might have been a pioneer in overcommitting
memory (?), they have the wonderfully named signal SIGDANGER.  It's
delivered to every process on the system as a final courtesy some time
before the OOM killer starts delivering SIGKILL.  The theory is that
you might be able to give back memory to avoid the carnage.

Most (all?) popular operating systems overcommit memory, and from time
to time someone proposes that they should have SIGDANGER too.  I've
often wondered if it makes sense or not as a concept: although you
might manage to avoid getting killed for a while, is your access
pattern sustainable?  On the other hand, maybe if you're less afraid
of the OOM killer you could use resources better.  I'm not sure, but
perhaps it's just too blunt an instrument.  Certainly it would be
possible for PostgreSQL to respond to SIGDANGER by dropping caches and
perhaps even old idle backends via a CHECK_FOR_INTERRUPTS() handler.
Combined with log monitoring, so that you know you have a serious
problem and need to make changes so it doesn't happen again, I think
overall it's probably a good thing to support, because your system
might actually survive the storm.  The question has been somewhat
hypothetical, because Linux doesn't have SIGDANGER, and I don't expect
we have too many AIX users.  FreeBSD doesn't have it either, though
I've been tempted to propose it there.
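
To make the idea concrete, here is a rough sketch (in Python rather than
backend C, purely for brevity) of the pattern I have in mind: the signal
handler only records that the signal arrived, and the cache is dropped
later at a safe point, much as CHECK_FOR_INTERRUPTS() works.  The Worker
class, its cache and check_for_interrupts() are made up for illustration
and are not PostgreSQL code.  SIGDANGER only exists on AIX, so the sketch
falls back to SIGUSR1 wherever the signal module doesn't expose it.

```python
import os
import signal

# Assumption: SIGDANGER is AIX-only and may not be exposed by Python at all,
# so fall back to SIGUSR1 to keep the sketch runnable elsewhere.
DANGER_SIGNAL = getattr(signal, "SIGDANGER", signal.SIGUSR1)


class Worker:
    """Hypothetical stand-in for a backend holding a droppable cache."""

    def __init__(self):
        self.danger_pending = False
        self.cache = {}

    def handle_danger(self, signum, frame):
        # Do as little as possible in the handler: just record the event.
        self.danger_pending = True

    def check_for_interrupts(self):
        """Called at safe points; returns how many cache entries were dropped."""
        if not self.danger_pending:
            return 0
        self.danger_pending = False
        released = len(self.cache)
        self.cache.clear()
        return released


def demo():
    worker = Worker()
    previous = signal.signal(DANGER_SIGNAL, worker.handle_danger)
    try:
        # Fill the cache, then simulate the kernel's warning signal.
        worker.cache.update({i: "x" * 1024 for i in range(100)})
        os.kill(os.getpid(), DANGER_SIGNAL)
        released = worker.check_for_interrupts()
        return released, len(worker.cache)
    finally:
        # Put back whatever handler was installed before the demo.
        signal.signal(DANGER_SIGNAL, previous)
```

Calling demo() fills the cache, delivers the signal to the current
process and then reaches a safe point where the cache is released.
Whether giving memory back at that moment is enough to avoid the OOM
killer is exactly the open question above.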

Recently Facebook published oomd[1][2] for Linux.  I haven't tried it
(for one thing you currently need a patched kernel, which is a barrier
to casual investigation) but basically it's a better OOM mousetrap
that runs in userspace.  It could be interesting for PostgreSQL
installations, because you can configure it to be smarter about what
to kill, and you can teach it to give you feedback and warnings about
pressure.  It may be that no modification to PostgreSQL is required at
all to make good use of this, or it may be that you'd want a new
signalling mechanism that we'd have to provide, amounting to a
home-made equivalent of SIGDANGER (or something richer).
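
As an illustration of what "feedback and warnings about pressure" could
look like from the application side, here is a small sketch that parses
memory pressure figures and applies a threshold.  It assumes the
information arrives in the text format used by the Linux pressure stall
information (PSI) files, e.g. /proc/pressure/memory; I haven't checked
how oomd itself exposes this.  The function names, the sample numbers
and the 10% threshold are all made up.

```python
def parse_pressure(text):
    """Parse PSI-style text into {"some": {"avg10": ...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, value = field.partition("=")
            # "total" is a microsecond counter; the avg* fields are percentages.
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def under_pressure(text, threshold=10.0):
    """Hypothetical policy: warn when the 10-second 'some' average is high."""
    some = parse_pressure(text).get("some", {})
    return some.get("avg10", 0.0) >= threshold


# Invented sample in the PSI layout, so the sketch runs without /proc access.
SAMPLE = (
    "some avg10=12.50 avg60=3.10 avg300=0.80 total=123456\n"
    "full avg10=1.00 avg60=0.20 avg300=0.05 total=2345\n"
)
```

A process could poll something like this and shed caches when
under_pressure() returns true, which would amount to the home-made
SIGDANGER mentioned above without needing a new signal.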

I don't have a specific proposal, but thought this was newsworthy and
might give someone some ideas.

[1] https://github.com/facebookincubator/oomd
[2] https://code.fb.com/production-engineering/open-sourcing-oomd-a-new-approach-to-handling-ooms/

--
Thomas Munro
http://www.enterprisedb.com

