Re: BUG #17793: Query with large number of joins crashes PostgreSQL

From: Stephen Frost
Subject: Re: BUG #17793: Query with large number of joins crashes PostgreSQL
Date:
Msg-id: Y+w8wDqJaaAZrB3E@tamriel.snowman.net
In reply to: Re: BUG #17793: Query with large number of joins crashes PostgreSQL  (Joe Conway <mail@joeconway.com>)
List: pgsql-bugs
Greetings,

* Joe Conway (mail@joeconway.com) wrote:
> On 2/14/23 09:47, Tom Lane wrote:
> > Francisco Olarte <folarte@peoplecall.com> writes:
> > > On Tue, 14 Feb 2023 at 11:29, PG Bug reporting form
> > > <noreply@postgresql.org> wrote:
> > > > It looks like the OS is killing the process due to running OOM, which is not
> > > > very surprising when looking at the query. Is this expected, or should PG
> > > > have guards in place to prevent this from happening?
> >
> > > When you run postgres in an environment where someone (the OOM killer)
> > > can kill -9 it, protection is hard. IIRC the OOM killer can kill you
> > > because ANOTHER process touches memory, among other things.
> >
> > Yeah.  We have no visibility into what the OOM killer will choose to
> > allow or not allow at any instant.
> >
> > > (I do run DBs on machines with overcommit disabled, which prevents this
> > > from happening, but it is not PG that prevents it.)
> >
> > If overcommit-disabled doesn't seem practical, another idea that's
> > been recommended is to start the postmaster under a "ulimit -v"
> > setting chosen to cause plain ENOMEM failures before the OOM
> > killer kicks in.

Unfortunately, ulimit -v ends up applying on a per-process basis and
therefore isn't as useful as we'd like it to be (ideally, we'd let all
of the backends allocate until we hit some global "only use X amount of
memory" limit instead).

> Given:
> > It looks like the OS is killing the process due to running OOM ...
> > I could reproduce it using a fresh docker container of the image
> > "postgres:13.10".
>
> I gather the OP is running Postgres in a container. If so, and if they have
> a memory.limit set (also assuming cgroup v1, otherwise memory.max for cgroup
> v2), disabling overcommit at the host level will not prevent an OOM kill
> when the cgroup exceeds memory.limit.
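
(As an aside, a process inside the container can at least discover what
cgroup limit it is running under.  A minimal sketch, assuming cgroup v2
with the unified hierarchy mounted at /sys/fs/cgroup, as is typical in
containers; cgroup v1 exposes memory.limit_in_bytes instead.)

/*
 * Read the cgroup v2 memory limit that applies to this process.  The
 * file contains either a byte count or the literal string "max",
 * meaning no limit is set.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/sys/fs/cgroup/memory.max", "r");
    char buf[64];

    if (f == NULL)
    {
        perror("fopen");
        return 1;
    }

    if (fgets(buf, sizeof(buf), f) != NULL)
    {
        if (strncmp(buf, "max", 3) == 0)
            printf("no cgroup memory limit set\n");
        else
            printf("cgroup memory limit (bytes): %s", buf);
    }

    fclose(f);
    return 0;
}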

Indeed, we need a way to let users specify a limit which PG keeps track
of and enforces, so that PG itself never hits it.  There's work being
done both to let users see how much memory each backend is using and to
track an overall allocation amount in shared memory across all
processes, so that we can decide to fail an allocation in a given
backend instead of having the kernel decide to run the OOM killer.
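
To make that concrete, here is a purely conceptual sketch (not the
actual patch, and not PostgreSQL's real allocator API) of charging every
allocation against a single budget kept in shared memory and failing the
request in the backend when the budget is exceeded:

/*
 * Conceptual sketch only.  The struct is assumed to live in shared
 * memory, visible to all backends; allocations are charged against one
 * instance-wide budget, and a request that would exceed it fails in the
 * backend (which can then raise a normal "out of memory" error) instead
 * of letting the kernel pick a victim.
 */
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    _Atomic size_t total_reserved;  /* bytes currently charged */
    size_t         global_limit;    /* configured instance-wide cap */
} MemoryBudget;

static void *
budgeted_alloc(MemoryBudget *budget, size_t nbytes)
{
    size_t prev = atomic_fetch_add(&budget->total_reserved, nbytes);

    if (prev + nbytes > budget->global_limit)
    {
        /* Over budget: undo the charge and let the caller error out. */
        atomic_fetch_sub(&budget->total_reserved, nbytes);
        return NULL;
    }

    void *ptr = malloc(nbytes);
    if (ptr == NULL)                /* the kernel still refused */
        atomic_fetch_sub(&budget->total_reserved, nbytes);
    return ptr;
}

static void
budgeted_free(MemoryBudget *budget, void *ptr, size_t nbytes)
{
    free(ptr);
    atomic_fetch_sub(&budget->total_reserved, nbytes);
}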

Further, once we're managing this, we can also make choices such as
"keep X amount reserved for system processes like autovacuum" and
"don't fail allocations for system processes like autovacuum", or the
like (just an example; I'm not saying we necessarily want to do exactly
that, but the point is that it's definitely nicer when we have control,
instead of getting hit with a memory allocation failure or, worse, the
OOM killer deciding to kill -9 PG).
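
Purely as an illustration of that kind of policy (the names and the
policy itself are made up here, not a proposal), the check might look
something like:

#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    size_t global_limit;    /* instance-wide cap, in bytes */
    size_t system_reserve;  /* slice held back for system processes */
} MemoryPolicy;

/*
 * Toy policy: system processes (autovacuum, etc.) may dip into the
 * reserved slice and are never refused; ordinary backends must leave
 * the reserve untouched.
 */
static bool
allocation_permitted(const MemoryPolicy *policy,
                     size_t already_reserved,
                     size_t request,
                     bool is_system_process)
{
    if (is_system_process)
        return true;

    return already_reserved + request
        <= policy->global_limit - policy->system_reserve;
}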

Thanks,

Stephen
