Re: postmaster consuming /lots/ of memory with hash aggregate. why?

From: Pavel Stehule
Subject: Re: postmaster consuming /lots/ of memory with hash aggregate. why?
Date:
Msg-id: AANLkTik9RA2YqQszZJZP90Wb0NbkhoMWJAz0gJoknPLi@mail.gmail.com
In response to: Re: postmaster consuming /lots/ of memory with hash aggregate. why?  (Jon Nelson <jnelson+pgsql@jamponi.net>)
Responses: Re: postmaster consuming /lots/ of memory with hash aggregate. why?  (Jon Nelson <jnelson+pgsql@jamponi.net>)
List: pgsql-performance
Hello

Look at the output of the EXPLAIN ANALYZE command. Your statistics are
probably out of date, and then the planner can be confused. EXPLAIN ANALYZE
will show it.
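
A minimal sketch of that check, reusing the table and column names from the
quoted report below. Running EXPLAIN ANALYZE on just the SELECT part avoids
re-running the INSERT; note that it still executes the query, so with the
memory problem described below it may be safer to first run
"set enable_hashagg = false" so the GroupAggregate plan is used:

  -- compare the planner's row estimates with the actual counts
  EXPLAIN ANALYZE
  select FOO.i, FOO.n, sum(FOO.cc) as cc, sum(FOO.oc) as oc
  from (
    select * from d_2010_09_12_sum
    union all
    select * from d_2010_09_13
  ) AS FOO group by i, n;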

Regards

Pavel Stehule

2010/11/12 Jon Nelson <jnelson+pgsql@jamponi.net>:
> On Fri, Nov 5, 2010 at 7:26 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:
>> I have a postgres 8.4.5 instance on CentOS 5 (x86_64) which appears to
>> go crazy with the amount of memory it consumes.
>> When I run the query below, in a matter of a few seconds memory
>> balloons to 5.3G (virtual), 4.6G (resident) and 1840 (shared), and
>> eventually the oom killer is invoked, killing the entire process.
>>
>> Physical memory is 8GB but other processes on the box consume
>> approximately 4GB of that.
>>
>> The settings changed from their defaults:
>>
>> effective_cache_size = 4GB
>> work_mem = 16MB
>> maintenance_work_mem = 128MB
>> wal_buffers = 16MB
>> checkpoint_segments = 16
>> shared_buffers = 384MB
>> checkpoint_segments = 64
>>
>> and
>>
>> default_statistics_target = 100
>>
>> The query is this:
>>
>> insert into d_2010_09_13_sum
>>        select FOO.i, FOO.n, sum(FOO.cc) as cc, sum(FOO.oc) as oc
>>        from (
>>          select * from d_2010_09_12_sum
>>          union all
>>          select * from d_2010_09_13
>>        ) AS FOO group by i, n;
>>
>> here is the explain:
>>
>>  Subquery Scan "*SELECT*"  (cost=1200132.06..1201332.06 rows=40000 width=80)
>>   ->  HashAggregate  (cost=1200132.06..1200732.06 rows=40000 width=41)
>>         ->  Append  (cost=0.00..786531.53 rows=41360053 width=41)
>>               ->  Seq Scan on d_2010_09_12_sum  (cost=0.00..520066.48 rows=27272648 width=42)
>>               ->  Seq Scan on d_2010_09_13  (cost=0.00..266465.05 rows=14087405 width=40)
>>
>> Both source tables freshly vacuum analyze'd.
>> The row estimates are correct for both source tables.
>>
>> If I use "set enable_hashagg = false" I get this plan:
>>
>>  Subquery Scan "*SELECT*"  (cost=8563632.73..9081838.25 rows=40000 width=80)
>>   ->  GroupAggregate  (cost=8563632.73..9081238.25 rows=40000 width=41)
>>         ->  Sort  (cost=8563632.73..8667033.84 rows=41360441 width=41)
>>               Sort Key: d_2010_09_12_sum.i, d_2010_09_12_sum.n
>>               ->  Result  (cost=0.00..786535.41 rows=41360441 width=41)
>>                     ->  Append  (cost=0.00..786535.41 rows=41360441 width=41)
>>                           ->  Seq Scan on d_2010_09_12_sum  (cost=0.00..520062.04 rows=27272204 width=42)
>>                           ->  Seq Scan on d_2010_09_13  (cost=0.00..266473.37 rows=14088237 width=40)
>>
>> and postmaster's memory never exceeds (roughly) 548M (virtual), 27M
>> (resident), 5M (shared).
>>
>> I even set default_statistics_target to 1000 and re-ran "vacuum
>> analyze verbose" on both tables - no change.
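
(A finer-grained variant of that statistics experiment, as a sketch: raise the
per-column target for just the grouping columns instead of changing
default_statistics_target globally, then re-analyze.)

  ALTER TABLE d_2010_09_12_sum ALTER COLUMN i SET STATISTICS 1000;
  ALTER TABLE d_2010_09_12_sum ALTER COLUMN n SET STATISTICS 1000;
  ALTER TABLE d_2010_09_13 ALTER COLUMN i SET STATISTICS 1000;
  ALTER TABLE d_2010_09_13 ALTER COLUMN n SET STATISTICS 1000;
  ANALYZE d_2010_09_12_sum;
  ANALYZE d_2010_09_13;
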
>> If I set work_mem to 1MB (from 16MB) then the GroupAggregate variation
>> is chosen instead.
>> Experimentally, HashAggregate is chosen when work_mem is 16MB, 8MB,
>> 6MB, 5MB but not 4MB and on down.
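
(A sketch of that threshold experiment in a single session; the values are the
ones quoted above and the cutoff may differ on other installations:)

  SET work_mem = '4MB';   -- below the quoted cutoff: GroupAggregate expected
  EXPLAIN
  select FOO.i, FOO.n, sum(FOO.cc) as cc, sum(FOO.oc) as oc
  from (select * from d_2010_09_12_sum
        union all
        select * from d_2010_09_13) AS FOO
  group by i, n;
  -- repeat with SET work_mem = '5MB' (or higher) to see the plan flip
  -- back to HashAggregate, then restore the server default:
  RESET work_mem;
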
>>
>> Two things I don't understand:
>>
>> 1. Why, when hash aggregation is allowed, does memory absolutely
>> explode (eventually invoking the wrath of the OOM killer)? 16MB for
>> work_mem does not seem outrageously high. For that matter, neither
>> does 5MB.
>>
>> 2. Why do both the HashAggregate and GroupAggregate plans put the row
>> estimate at 40000?
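
(One way to see what the planner believes about the grouping columns is to
query the per-column statistics directly; a sketch, assuming the columns i and
n from the query above:)

  SELECT tablename, attname, n_distinct, null_frac
  FROM pg_stats
  WHERE tablename IN ('d_2010_09_12_sum', 'd_2010_09_13')
    AND attname IN ('i', 'n');
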
>
> Unfortunately, I've found that as my database size grows, I've
> generally had to disable hash aggregates for fear of even simple-seeming
> queries running out of memory, even with work_mem = 1MB.
>
> In some cases I saw memory usage (with hashagg) grow to well over 5GB,
> while with group aggregate it barely moved.  Am *I* doing something
> wrong? Some of these queries are on partitioned tables (typically
> querying the parent) and the resulting UNION or UNION ALL really
> starts to hurt, and when the server runs out of memory and kills off
> the postmaster process a few minutes or even hours into the query, it
> doesn't make anybody very happy.
>
> Is there some setting I can turn on to see when memory is being
> allocated (and, apparently, not deallocated)?
>
> The latest query has a HashAggregate that looks like this:
> HashAggregate  (cost=19950525.30..19951025.30 rows=40000 width=37)
> but there are, in reality, approximately 200 million rows (when I run
> the query with GroupAggregate, that's what I get).
>
> Why does it keep estimating 40,000 rows?
>
> I suppose I could use the newly-learned ALTER USER trick to disable
> hash aggregation for the primary user, because disabling hash
> aggregation system-wide sounds fairly drastic. However, if I *don't*
> disable it, the query quickly balloons memory usage to the point where
> the process is killed off.
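
(A sketch of that per-role approach; "app_user" is a placeholder, since the
primary user is not named here:)

  -- disable hash aggregation only for one role instead of server-wide
  ALTER USER app_user SET enable_hashagg = off;
  -- and to undo it later:
  ALTER USER app_user RESET enable_hashagg;
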
>
> --
> Jon
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
