I set the ulimit for data to 4194304k:
core file size (blocks, -c) 0
data seg size (kbytes, -d) 4194304
file size (blocks, -f) unlimited
pending signals (-i) 1024
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 32255
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
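For what it's worth, on Linux "ulimit -d" may not constrain the backend at all: it caps only brk()-based allocations, and glibc's malloc serves large requests via mmap(), which the data-segment limit does not cover. Capping total address space (-v) is what reliably turns runaway growth into an ENOMEM. A sketch of a start wrapper (the data directory path and pg_ctl invocation are placeholders, adjust to your installation):

```shell
# Set the limit in the shell that starts the postmaster, so the
# backends inherit it.  "ulimit -d" caps only brk()-based memory;
# glibc malloc uses mmap() for large requests, bypassing that cap,
# so limit the whole address space instead.
ulimit -v 4194304                     # ~4 GB address-space soft limit
ulimit -v                             # verify: prints 4194304
# pg_ctl -D /var/lib/pgsql/data start   # hypothetical start command
```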
but after running for a while the process still grew past 10 GB and
caused a swap storm (as reported by top):
6266 postgres 15 0 11.2g 3.6g 732 D 9 91.7 21:29.01 postgres: user2 owl 199.107.233.201(45564) EXPLAIN
... and I had to send it SIGINT (kill -INT) to keep the system responsive.
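For reference, the diagnostic steps quoted below would look like this in a psql session (a sketch only: the database name "owl" is taken from the process title above, and the problem query itself is elided in the thread, so a placeholder stands in for it):

```sql
-- in psql, connected to the owl database
SET enable_hashjoin = off;
EXPLAIN ANALYSE <the problem query>;   -- rerun with hash joins disabled
SET enable_hashjoin = on;
SET enable_hashagg = off;
EXPLAIN ANALYSE <the problem query>;   -- then with hash aggregation disabled
```

The SET commands only affect the current session, so no restart is needed between runs.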
Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
>> Let's see if that hash-join is really the culprit. Can you run EXPLAIN
>> and then EXPLAIN ANALYSE on the query, but first issue:
>> SET enable_hashjoin=off;
>> If that makes little difference, try the same with enable_hashagg.
>
> It seems like it must be the hashagg step --- hashjoin spills to disk in
> an orderly fashion when it reaches work_mem, but hashagg doesn't (yet).
> However, if we know that there're only going to be 60K hashagg entries,
> how could the memory get blown out by that? Perhaps there's a memory
> leak here somewhere.
>
> Please restart your postmaster under a reasonable ulimit setting, so
> that it will get ENOMEM rather than going into swap hell, and then try
> the query again. When it runs up against the ulimit it will give an
> "out of memory" error and dump some per-context memory usage info into
> the postmaster log. That info is what we need to see.
>
> regards, tom lane