Re: OOM with many sorts

From Thomas Munro
Subject Re: OOM with many sorts
Date
Msg-id CA+hUKGJ2yNYpH5X_AFSR9721FYJ23oW7UC0qZ=HjK9xW+Oqq0w@mail.gmail.com
In response to OOM with many sorts  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-general
On Tue, Jul 9, 2019 at 4:44 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>                              [...many partial GroupAggregate/Sort repeated here for ~200 child tables...]
>
> This isn't the first time I've had to reduce work_mem on a parallel query to
> avoid OOM, but it seems unreasonable to be concerned with 50MB sorts.

This is a fundamental problem with our model of memory budgets.  To
oversimplify massively, we can use about work_mem * ntables * the
number of parallel workers.  Core counts are going up, and now we have
a convenient way to get large values of ntables.  One of many -hackers
threads to discuss the problem:

https://www.postgresql.org/message-id/flat/CAH2-WzmNwV%3DLfDRXPsmCqgmm91mp%3D2b4FvXNF%3DcCvMrb8YFLfQ%40mail.gmail.com
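
To put rough numbers on that multiplication, here is a minimal sketch
in Python.  The ~50MB sorts and ~200 child tables come from the report
above; the worker count is an assumption for illustration only.

# Back-of-the-envelope sketch of the per-node budget model described above.
def worst_case_bytes(work_mem_mb, n_partition_sorts, parallel_workers):
    """Roughly work_mem * (number of Sort nodes) * (leader + workers):
    every process may run its own copy of each partial Sort, and each
    copy is allowed up to work_mem before it has to spill to disk."""
    processes = parallel_workers + 1          # workers plus the leader
    return work_mem_mb * 1024 * 1024 * n_partition_sorts * processes

# ~200 child tables, 50MB work_mem, 2 workers (assumed): ~29 GiB allowed
print(worst_case_bytes(50, 200, 2) / 1024**3, "GiB")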

> It looks like the results of each Sort node stay in RAM, during processing of
> additional sort nodes (is that required?)

That's a very interesting question linked to whole-query memory
budgets.  If you don't have a whole-query memory budget, then you have
nothing to lose by keeping hash and sort results in memory, and you
gain the ability to do cheap rescans (if the node happens to be in a
nested loop).  I discussed this in the context of hash joins over here:

https://www.postgresql.org/message-id/CAEepm%3D0N6DODN7nx6Zb93YOW-y%3DRftNNFZJRaLyG6jbJHJVjsA%40mail.gmail.com

The TL;DR is that some other databases throw out, for example, hash
tables as soon as possible, and consider changing the shape of their
join nesting to minimise the total number of hash tables in memory at
once, so that they come in under a certain budget for memory in use at
any one point in time.  That is, they don't just divide query_work_mem
up over all the nodes; they understand things about when nodes will
allocate and free memory.
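
To make the contrast concrete, here is a toy sketch, not PostgreSQL
code; the node names, memory sizes and lifetimes are invented purely to
illustrate the two accounting strategies:

# Each entry is (node, mem_mb, start_step, end_step): the interval during
# which that node's hash table or sort result is held in memory.
nodes = [
    ("hash A", 50, 0, 1),   # built and probed, then discarded early
    ("hash B", 50, 1, 2),
    ("sort C", 50, 2, 3),
]

# Strategy 1: divide query_work_mem evenly over all nodes, ignoring timing.
query_work_mem = 100
per_node = query_work_mem / len(nodes)   # ~33 MB each, so every node spills

# Strategy 2: know when memory is allocated and freed, and budget for the
# peak actually held at any one point in time.
peak = max(
    sum(mem for _, mem, s, e in nodes if s <= step < e)
    for step in range(max(e for *_, e in nodes))
)
print(per_node, peak)   # 33.3 vs 50: the peak held at any moment is only 50

In this made-up schedule a peak-aware budget of 100 lets every node use
its full 50, while naive division starves all of them.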

There are some less clever things we could do, though, before we tackle
the big problems involved.  We could probably still opportunistically
give back memory sooner, when we know there is no chance of rescan,
and other things along those lines.
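
A purely illustrative sketch of that idea, as a toy model rather than
actual executor code (the class and flag names here are invented):

class ToySort:
    def __init__(self, rows, rescan_possible):
        self.rescan_possible = rescan_possible   # e.g. inner side of a nested loop
        self.buffer = sorted(rows)               # stands in for the in-memory sort
        self.pos = 0

    def next(self):
        if self.pos >= len(self.buffer):
            return None
        row = self.buffer[self.pos]
        self.pos += 1
        if self.pos >= len(self.buffer) and not self.rescan_possible:
            # Fully consumed and no rescan coming: give the memory back now
            # instead of holding it until the whole query finishes.
            self.buffer = []
        return row

s = ToySort([3, 1, 2], rescan_possible=False)
while (r := s.next()) is not None:
    print(r)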

> Thanks in advance for any advice.

I think it's impossible to choose a single value for work_mem if you
have a mixture of types of queries that hit wildly different numbers
of partitions and workers.  I think this is an ongoing topic for
-hackers.

-- 
Thomas Munro
https://enterprisedb.com


