Hello all,
I have a query that triggers an out-of-memory error, at which point the OS OOM killer terminates the PostgreSQL process.
The query plan (captured immediately after a VACUUM ANALYZE) is at
https://explain.depesz.com/s/ITQI#html.
PostgreSQL 16.3, running on RHEL 8.9 (16 vCPU, 64 GB RAM, 32 GB swap), with the following settings:
shared_buffers=8G
effective_cache_size=24G
maintenance_work_mem=2G
work_mem=104857kB
default_statistics_target = 100
max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_workers = 16
max_parallel_maintenance_workers = 4
jit=off
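(For cross-checking, the effective values above can be read back from pg_settings; a minimal query, assuming no per-role or per-database overrides:)

    -- Show the effective value and origin of each setting listed above.
    SELECT name, setting, unit, source
    FROM pg_settings
    WHERE name IN ('shared_buffers', 'effective_cache_size',
                   'maintenance_work_mem', 'work_mem',
                   'default_statistics_target', 'max_worker_processes',
                   'max_parallel_workers_per_gather', 'max_parallel_workers',
                   'max_parallel_maintenance_workers', 'jit');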
The excessive memory allocation appears to be reported under HashSpillContext. I've attached the memory context dumps for the 5 processes involved (the leader backend plus 4 parallel workers), taken some time after the query started. I also see a huge number of temporary files being created. As a workaround I've set enable_parallel_hash = 'off' for the time being, and the problem went away.
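In case it helps with reproducing the data collection: the per-process context dumps above can be produced from a second session with pg_log_backend_memory_contexts(). A minimal sketch, assuming a role with the privilege to call that function and that this query (plus its workers) is the only active work at the time:

    -- Log the memory context usage of every active client backend and
    -- parallel worker to the server log; the HashSpillContext lines in
    -- that output are where the growth shows up in my case.
    SELECT pid, backend_type, pg_log_backend_memory_contexts(pid)
    FROM pg_stat_activity
    WHERE state = 'active'
      AND backend_type IN ('client backend', 'parallel worker');

    -- Session-level workaround applied here:
    SET enable_parallel_hash = off;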
I've seen a potentially similar problem reported in
https://www.postgresql.org/message-id/flat/20230516200052.sbg6z4ghcmsas3wv%40liskov#f6059259c7c9251fb8c17f5793a2d427 .
Any ideas on how to identify the root cause? I can reproduce it on demand. Should I report it to pgsql-bugs?
Best regards,
Radu