On 17.5.2014 19:55, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> ... then of course the usual 'terminating connection because of crash of
>> another server process' warning. Apparently, it's getting killed by the
>> OOM killer, because it exhausts all the memory assigned to that VM (2GB).
>
> Can you fix things so it runs into its process ulimit before the OOM killer
> triggers? Then we'd get a memory map dumped to stderr, which would be
> helpful in localizing the problem.
I did this in /etc/security/limits.d/80-pgbuild.conf:
pgbuild hard as 1835008
so the user the buildfarm runs under gets a hard address-space limit of
~1.75GB (the limits.conf "as" value is in KB), out of the 2GB available
to the container.
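As a sanity check on the arithmetic (the value is KB, so 1835008 KB is 1792 MB, i.e. 1.75 GB), something like this can be run after re-logging in as that user:

```shell
# limits.conf "as" values are in KB; verify the intended size.
limit_kb=1835008
echo "$((limit_kb / 1024)) MB"   # 1792 MB = 1.75 GB

# The effective hard limit on virtual address space for the
# current shell can then be confirmed with the bash builtin:
ulimit -H -v
```

(The `ulimit -H -v` line only reflects the new limit in a fresh login session, since PAM applies limits.conf at login time.)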
>
>> ... So this seems like a
>> memory leak somewhere in the cache invalidation code.
>
> Smells that way to me too, but let's get some more evidence.
The tests are already running, and there are a few postgres processes:
  PID  VIRT   RES   %CPU   TIME+     COMMAND
11478  449m   240m  100.0  112:53.57 postgres: pgbuild regression [local] CREATE VIEW
11423  219m   19m     0.0    0:00.17 postgres: checkpointer process
11424  219m   2880    0.0    0:00.05 postgres: writer process
11425  219m   5920    0.0    0:00.12 postgres: wal writer process
11426  219m   2708    0.0    0:00.05 postgres: autovacuum launcher process
11427  79544  1836    0.0    0:00.17 postgres: stats collector process
11479  1198m  1.0g    0.0   91:09.99 postgres: pgbuild regression [local] CREATE INDEX waiting
Attached is 'pmap -x' output for the two interesting processes (11478,
11479).
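For anyone following along, this is roughly how such a snapshot can be collected (a sketch; the PID is the CREATE INDEX backend from the top output above, and `pmap` comes from procps):

```shell
# Dump the full per-mapping memory map of a running backend.
pid=11479
pmap -x "$pid"

# The last line is the summary; watching just that line over time
# shows whether the resident set keeps growing:
pmap -x "$pid" | tail -n 1
```

On Linux, `grep VmRSS /proc/$pid/status` gives the same resident-set figure without the per-mapping detail.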
Tomas