Re: Performance: Sum()
От | Stephan Szabo |
---|---|
Тема | Re: Performance: Sum() |
Дата | |
Msg-id | 20011108142539.R54468-100000@megazone23.bigpanda.com обсуждение исходный текст |
Ответ на | Performance: Sum() (David Link <dlink@soundscan.com>) |
Список | pgsql-general |
> Query 2: for 10 weeks: > > select > s.channel, > sum(s.units) > from > summary s > where > s.media = 'M' and > s.week > 200100 and > s.week <= 200111 > group by > s.channel > ; > > Aggregate (cost=17398.11..17765.44 rows=7347 width=24) > -> Group (cost=17398.11..17581.77 rows=73466 width=24) > -> Sort (cost=17398.11..17398.11 rows=73466 width=24) > -> Seq Scan on summary s (cost=0.00..10608.64 > rows=73466 width=24) > > Now it does not want to use any of my fine indexes (see below) to > return 73,466 rows (about 1/5th the table). Also note, the Sort stage > has a considerable cost: 6,790 or ~40% total query cost. > > Questions: > > 1. Couldn't one of the indexes speed up the fetching of the rows? Probably not right now. The tuples still need to be read in order to get the mvcc information, so after a certain point the index scan becomes more expensive as you have to read the index and the heap rows -- if you're reading almost all of the heap pages, sequence scan is most likely faster. One question is, is 73466 a reasonable estimate of the number of rows? > 2. What is the sort stage doing? Ordering the input for the group step. IIRC, the group by implementation expects its inputs in sorted order. > 3. How to optimize this querey 2? I guess a starting point would be to see how it runs with set enable_seqscan=off What does explain show and does it run faster?
В списке pgsql-general по дате отправления: