Re: postmaster consuming /lots/ of memory with hash aggregate. why?
От | Pierre C |
---|---|
Тема | Re: postmaster consuming /lots/ of memory with hash aggregate. why? |
Дата | |
Msg-id | op.vlqq6vxteorkce@apollo13 обсуждение исходный текст |
Ответ на | postmaster consuming /lots/ of memory with hash aggregate. why? (Jon Nelson <jnelson+pgsql@jamponi.net>) |
Ответы |
Re: postmaster consuming /lots/ of memory with hash
aggregate. why?
|
Список | pgsql-performance |
> 2. Why do both HashAggregate and GroupAggregate say the cost estimate > is 40000 rows? I've reproduced this : CREATE TABLE popo AS SELECT (x%1000) AS a,(x%1001) AS b FROM generate_series( 1,1000000 ) AS x; VACUUM ANALYZE popo; EXPLAIN ANALYZE SELECT a,b,count(*) FROM (SELECT * FROM popo UNION ALL SELECT * FROM popo) AS foo GROUP BY a,b; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=43850.00..44350.00 rows=40000 width=8) (actual time=1893.441..2341.780 rows=1000000 loops=1) -> Append (cost=0.00..28850.00 rows=2000000 width=8) (actual time=0.025..520.581 rows=2000000 loops=1) -> Seq Scan on popo (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.025..142.639 rows=1000000 loops=1) -> Seq Scan on popo (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.003..114.257 rows=1000000 loops=1) Total runtime: 2438.741 ms (5 lignes) Temps : 2439,247 ms I guess the row count depends on the correlation of a and b, which pg has no idea about. In the first example, there is no correlation, now with full correlation : UPDATE popo SET a=b; VACUUM FULL popo; VACUUM FULL popo; ANALYZE popo; EXPLAIN ANALYZE SELECT a,b,count(*) FROM (SELECT * FROM popo UNION ALL SELECT * FROM popo) AS foo GROUP BY a,b; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=43850.00..44350.00 rows=40000 width=8) (actual time=1226.201..1226.535 rows=1001 loops=1) -> Append (cost=0.00..28850.00 rows=2000000 width=8) (actual time=0.008..518.068 rows=2000000 loops=1) -> Seq Scan on popo (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.007..128.609 rows=1000000 loops=1) -> Seq Scan on popo (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.005..128.502 rows=1000000 loops=1) Total runtime: 1226.797 ms
В списке pgsql-performance по дате отправления: