Re: Postgresql OOM

From: Radu Radutiu
Subject: Re: Postgresql OOM
Date:
Msg-id: CAG4TxrjjjR3BfRWsV3z0c2BL9XZueFPG0uGxeHrhCC+92g05Gg@mail.gmail.com
In reply to: Re: Postgresql OOM (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
Hi,


> That's helpful, thanks!
>
> One thing to note is that postgres' work_mem is confusing - it applies to
> individual query execution nodes, not the whole query. Additionally, when you
> use parallel workers, each of the parallel workers can use the "full"
> work_mem, rather than work_mem being evenly distributed across workers.
>
> Within that, the memory usage in the EXPLAIN ANALYZE isn't entirely unexpected
> (I'd say it's unreasonable if you're not a postgres dev, but that's a
> different issue).
>
> We can see each of the Hash nodes use ~1GB, which is due to
> (1 leader + 4 workers) * work_mem = 5 * 200MB = 1GB.
>
> In this specific query we probably could free the memory in the "lower" hash
> join nodes once the node directly above has finished building, but we don't
> have the logic for that today.
 
I would understand 1 GB, or even 2 GB (considering hash_mem_multiplier), but the server ran out of memory with more than 64 GB available.
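As a quick sanity check of the per-node arithmetic quoted above, the relevant settings can be inspected directly; the values in the comments are the ones discussed in this thread, not universal defaults:

```sql
-- Each executor node may use up to work_mem, and each parallel worker
-- gets its own full allowance, so a single Hash node can reach roughly:
--   (1 leader + 4 workers) * work_mem = 5 * 200MB ≈ 1GB
-- Hash-based nodes are additionally allowed
--   work_mem * hash_mem_multiplier  (default 2.0 since PostgreSQL 15),
-- which is where a ~2GB per-node upper bound would come from.
SHOW work_mem;                          -- 200MB in this thread
SHOW hash_mem_multiplier;               -- 2 by default on PG 15+
SHOW max_parallel_workers_per_gather;   -- 4 workers in the reported plan
```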

> Of course that doesn't explain why the memory usage / temp file creation is so
> extreme with a lower limit / fewer workers. There aren't any bad statistics
> afaics, nor can I really see a potential for significant skew; it looks to
> me like the hashtables are all built on a single text field and that nearly
> all the input rows are distinct.
>
> Could you post the table definition (\d+) and the database definition
> (\l)? One random guess I have is that you ended up with a "non-deterministic"
> collation for that column and that we end up with a bad hash due to that.

I was able to eliminate the columns not involved in the query while still retaining the problematic behavior (hence the new table names):
postgres=# \d inreq
                                  Table "public.inreq"
         Column          |            Type             | Collation | Nullable | Default
-------------------------+-----------------------------+-----------+----------+---------
 input_sequence          | bigint                      |           | not null |
 msg_type                | character varying(8)        |           |          |
 originalrequest_id      | bigint                      |           |          |
 receive_time            | timestamp without time zone |           |          |
 related_output_sequence | bigint                      |           |          |
 msg_status              | character varying(15)       |           |          |
Indexes:
    "inreq_pkey" PRIMARY KEY, btree (input_sequence)
    "inreq_originalrequest_id_idx" btree (originalrequest_id)

postgres=# \d outreq
                      Table "public.outreq"
         Column         |  Type  | Collation | Nullable | Default
------------------------+--------+-----------+----------+---------
 output_sequence        | bigint |           | not null |
 input_sequence         | bigint |           |          |
 reply_input_sequence   | bigint |           |          |
 related_input_sequence | bigint |           |          |
Indexes:
    "outreq_pkey" PRIMARY KEY, btree (output_sequence)
    "outreq_input_sequence_idx" btree (input_sequence)
    "outreq_reply_input_sequence_idx" btree (reply_input_sequence)
postgres=# SELECT datname, datcollate FROM pg_database WHERE datname = current_database();
 datname  | datcollate  
----------+-------------
 postgres | en_US.UTF-8
(1 row)
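In the reduced schema the join keys are all bigint (not collatable), but for the original text column the "non-deterministic collation" guess could be checked per column with something like the following sketch (table name inreq as above):

```sql
-- Collation actually attached to each user column; collname is NULL
-- for non-collatable types such as bigint.  collisdeterministic = false
-- would indicate a non-deterministic collation on that column.
SELECT a.attname, c.collname, c.collisdeterministic
FROM pg_attribute a
LEFT JOIN pg_collation c ON c.oid = a.attcollation
WHERE a.attrelid = 'inreq'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped;
```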

A dump of the two tables above can be found at https://drive.google.com/file/d/1ep1MYjNzlFaICL3GlPZaMdpOxRG6WCGz/view?usp=sharing (compressed size: 1 GB; database size after import: 14 GB).

# prepare my_query (timestamp, bigint) AS
  SELECT t.input_sequence, rec_tro.output_sequence, r.input_sequence,
         rpl_rec_tro.output_sequence, rpl_snd_tro.output_sequence,
         snd_tro.output_sequence, t.msg_type
  FROM inreq t
  LEFT JOIN outreq rec_tro     ON rec_tro.input_sequence = t.input_sequence
  LEFT JOIN inreq r            ON r.originalrequest_id = t.input_sequence
  LEFT JOIN outreq rpl_rec_tro ON rpl_rec_tro.input_sequence = r.input_sequence
  LEFT JOIN outreq rpl_snd_tro ON rpl_snd_tro.reply_input_sequence = r.input_sequence
  LEFT JOIN outreq snd_tro     ON snd_tro.reply_input_sequence = t.input_sequence
  WHERE t.receive_time < $1 AND t.input_sequence < $2
    AND t.msg_status IN ('COMPLETED', 'REJECTED')
  ORDER BY t.msg_status DESC, t.input_sequence;
# EXPLAIN (ANALYZE,BUFFERS)  EXECUTE my_query('2024-05-18 00:00:00', 202406020168279904);
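For anyone reproducing the lower-limit behavior from the dump, the session settings can be tightened before re-running the EXPLAIN; the values below are illustrative, and the exact settings used elsewhere in the thread may differ:

```sql
SET work_mem = '100MB';
SET max_parallel_workers_per_gather = 0;  -- force a serial plan
EXPLAIN (ANALYZE, BUFFERS)
EXECUTE my_query('2024-05-18 00:00:00', 202406020168279904);
```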

Best Regards, 
Radu
