array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion

Поиск
Список
Период
Сортировка
От Frank van Vugt
Тема array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Дата
Msg-id 3839201.Nfa2RvcheX@techfox.foxi
обсуждение исходный текст
Ответы Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion  (Valentine Gogichashvili <valgog@gmail.com>)
Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion  (Tomas Vondra <tv@fuzzy.cz>)
Список pgsql-bugs
L.S.

Something seems wrong here.... when applying arrag_agg() on a large recordset,
above a certain size 'things fall over' and memory usage races off until the
system runs out of it:


# select version();
                                      version
-----------------------------------------------------------------------------------
 PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.7.1,
64-bit


# \! free -g
             total       used       free     shared    buffers     cached
Mem:            31          1         30          0          0          0
-/+ buffers/cache:          0         30
Swap:           31          0         31



====================
Create concatenate function and aggregate:

CREATE OR REPLACE FUNCTION comma_cat(text, text)
        RETURNS text
        LANGUAGE 'plpgsql'
        IMMUTABLE
        STRICT
        SECURITY INVOKER
        AS '
                BEGIN
                        IF $1 = '''' THEN
                                RETURN $2;
                        ELSIF $2 = '''' THEN
                                RETURN $1;
                        ELSE
                                RETURN $1 || '', '' || $2;
                        END IF;
                END;';
CREATE AGGREGATE comcat(text) ( SFUNC = comma_cat, STYPE = text );


Activate timing:

\timing
Timing is on.


Create demo data:

create temp table f as
    select id, random() as value
    from generate_series(1, 1e7::int) as f(id);

Time: 7036,917 ms



====================
Don't mind the 'usefulness' of the exact query below, I ran into this issue
when experimenting a bit using random().




On my system, using the comcat() aggregate is no problem regardless of the
size of the recordset:

with g as (select * from f limit 1e5)
    select comcat(id::text), min(value)
    from g
    group by g.value
    having count(1) > 1;
Time: 189,835 ms


with g as (select * from f limit 1e6)
    select comcat(id::text), min(value)
    from g
    group by g.value
    having count(1) > 1;
Time: 1815,756 ms


with g as (select * from f)
    select comcat(id::text), min(value)
    from g
    group by g.value
    having count(1) > 1;
Time: 18660,326 ms

====================
However, using the array_agg() this breaks (on my system ) on the largest set:

with g as (select * from f limit 1e5)
    select array_agg(id), min(value)
    from g
    group by g.value
    having count(1) > 1;
Time: 361,242 ms


with g as (select * from f limit 1e6)
    select array_agg(id), min(value)
    from g
    group by g.value
    having count(1) > 1;
Time: 3310,347 ms


with g as (select * from f)
    select array_agg(id), min(value)
    from g
    group by g.value
    having count(1) > 1;
Time: <none, fails>


=> the last query eats up all 32GB main memory in seconds, then starts on the
32GB swap (which obviously takes a bit longer to digest) until eventually the
child process gets killed by the oom-killer and postgresql restarts....






--

Best,




Frank.

В списке pgsql-bugs по дате отправления:

Предыдущее
От: digoal
Дата:
Сообщение: Re: BUG #8531: systemtap probe mark(checkpoint__done) error when i read the parameters
Следующее
От: Valentine Gogichashvili
Дата:
Сообщение: Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion