Re: Speeding up aggregates
От | Joe Conway |
---|---|
Тема | Re: Speeding up aggregates |
Дата | |
Msg-id | 3DF50FA5.3090000@joeconway.com обсуждение исходный текст |
Ответ на | Re: Speeding up aggregates (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Speeding up aggregates
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-performance |
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>Just to follow up on my last post, I did indeed find that bumping up sort_mem >>caused a switch back to HashAggregate, and a big improvement: > > >>parts=# explain analyze select i.part_id, sum(w.qty_oh) as total_oh from inv >>i, iwhs w where i.part_id = w.part_id group by i.part_id having sum(w.qty_oh) > 0; >> QUERY PLAN >>---------------------------------------------------------------------------------------------------------------------------- >> HashAggregate (cost=5254.46..5432.10 rows=35528 width=36) (actual >>time=1286.89..1399.36 rows=4189 loops=1) >> Filter: (sum(qty_oh) > 0::double precision) >> -> Hash Join (cost=1319.10..4710.31 rows=72553 width=36) (actual >>time=163.36..947.54 rows=72548 loops=1) > > > How many rows out if you drop the HAVING clause? parts=# set sort_mem to 8000; SET parts=# explain analyze select i.part_id, sum(w.qty_oh) as total_oh from inv i, iwhs w where i.part_id = w.part_id group by i.part_id; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=5617.22..5706.04 rows=35528 width=36) (actual time=1525.93..1627.41 rows=34575 loops=1) -> Hash Join (cost=1319.10..5254.45 rows=72553 width=36) (actual time=156.86..1248.73 rows=72548 loops=1) Hash Cond: ("outer".part_id = "inner".part_id) -> Seq Scan on iwhs w (cost=0.00..2121.53 rows=72553 width=22) (actual time=0.01..274.00 rows=72553 loops=1) -> Hash (cost=1230.28..1230.28 rows=35528 width=14) (actual time=156.65..156.65 rows=0 loops=1) -> Seq Scan on inv i (cost=0.00..1230.28 rows=35528 width=14) (actual time=0.03..86.86 rows=35528 loops=1) Total runtime: 1680.86 msec (7 rows) > The planner's choice of which to use is dependent on its estimate of the > required hashtable size, which is proportional to its guess about how > many distinct groups there will be. The above output doesn't tell us > that however, only how many groups passed the HAVING clause. I'm > curious about the quality of this estimate, since the code to try to > generate not-completely-bogus group count estimates is all new ... If I'm reading it correctly, it looks like the estimate in this case is pretty good. Joe
В списке pgsql-performance по дате отправления: