Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion

Поиск
Список
Период
Сортировка
От Valentine Gogichashvili
Тема Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Дата
Msg-id CAP93muUfwRvg6EpdNZh=5viaZUUXfzm1do3EkkKBTJiC-9Jeaw@mail.gmail.com
обсуждение исходный текст
Ответ на array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion  (Frank van Vugt <ftm.van.vugt@foxi.nl>)
Ответы Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion  (Tomas Vondra <tv@fuzzy.cz>)
Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion  (Frank van Vugt <ftm.van.vugt@foxi.nl>)
Список pgsql-bugs
Hi Frank,

this is a little bit not relevant to the question itself. But to prevent
OOM killer from currupting your database please consider this for your
production environments:
http://www.postgresql.org/docs/9.1/static/kernel-resources.html#LINUX-MEMOR=
Y-OVERCOMMIT

=E1=83=95=E1=83=90=E1=83=9A=E1=83=94=E1=83=9C=E1=83=A2=E1=83=98=E1=83=9C =
=E1=83=92=E1=83=9D=E1=83=92=E1=83=98=E1=83=A9=E1=83=90=E1=83=A8=E1=83=95=E1=
=83=98=E1=83=9A=E1=83=98
Valentine Gogichashvili


On Sat, Oct 19, 2013 at 3:38 PM, Frank van Vugt <ftm.van.vugt@foxi.nl>wrote=
:

> L.S.
>
> Something seems wrong here.... when applying arrag_agg() on a large
> recordset,
> above a certain size 'things fall over' and memory usage races off until
> the
> system runs out of it:
>
>
> # select version();
>                                       version
>
> -------------------------------------------------------------------------=
----------
>  PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.7.=
1,
> 64-bit
>
>
> # \! free -g
>              total       used       free     shared    buffers     cached
> Mem:            31          1         30          0          0          0
> -/+ buffers/cache:          0         30
> Swap:           31          0         31
>
>
>
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> Create concatenate function and aggregate:
>
> CREATE OR REPLACE FUNCTION comma_cat(text, text)
>         RETURNS text
>         LANGUAGE 'plpgsql'
>         IMMUTABLE
>         STRICT
>         SECURITY INVOKER
>         AS '
>                 BEGIN
>                         IF $1 =3D '''' THEN
>                                 RETURN $2;
>                         ELSIF $2 =3D '''' THEN
>                                 RETURN $1;
>                         ELSE
>                                 RETURN $1 || '', '' || $2;
>                         END IF;
>                 END;';
> CREATE AGGREGATE comcat(text) ( SFUNC =3D comma_cat, STYPE =3D text );
>
>
> Activate timing:
>
> \timing
> Timing is on.
>
>
> Create demo data:
>
> create temp table f as
>         select id, random() as value
>         from generate_series(1, 1e7::int) as f(id);
>
> Time: 7036,917 ms
>
>
>
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> Don't mind the 'usefulness' of the exact query below, I ran into this iss=
ue
> when experimenting a bit using random().
>
>
>
>
> On my system, using the comcat() aggregate is no problem regardless of th=
e
> size of the recordset:
>
> with g as (select * from f limit 1e5)
>         select comcat(id::text), min(value)
>         from g
>         group by g.value
>         having count(1) > 1;
> Time: 189,835 ms
>
>
> with g as (select * from f limit 1e6)
>         select comcat(id::text), min(value)
>         from g
>         group by g.value
>         having count(1) > 1;
> Time: 1815,756 ms
>
>
> with g as (select * from f)
>         select comcat(id::text), min(value)
>         from g
>         group by g.value
>         having count(1) > 1;
> Time: 18660,326 ms
>
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> However, using the array_agg() this breaks (on my system ) on the largest
> set:
>
> with g as (select * from f limit 1e5)
>         select array_agg(id), min(value)
>         from g
>         group by g.value
>         having count(1) > 1;
> Time: 361,242 ms
>
>
> with g as (select * from f limit 1e6)
>         select array_agg(id), min(value)
>         from g
>         group by g.value
>         having count(1) > 1;
> Time: 3310,347 ms
>
>
> with g as (select * from f)
>         select array_agg(id), min(value)
>         from g
>         group by g.value
>         having count(1) > 1;
> Time: <none, fails>
>
>
> =3D> the last query eats up all 32GB main memory in seconds, then starts =
on
> the
> 32GB swap (which obviously takes a bit longer to digest) until eventually
> the
> child process gets killed by the oom-killer and postgresql restarts....
>
>
>
>
>
>
> --
>
> Best,
>
>
>
>
> Frank.
>
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Frank van Vugt
Дата:
Сообщение: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion