Re: Slow query with a lot of data
От | Moritz Onken |
---|---|
Тема | Re: Slow query with a lot of data |
Дата | |
Msg-id | 8EF0BB61-D8FE-47E8-97DD-1009F7EB5431@houseofdesign.de обсуждение исходный текст |
Ответ на | Slow query with a lot of data (Moritz Onken <onken@houseofdesign.de>) |
Ответы |
Re: Slow query with a lot of data
|
Список | pgsql-performance |
Am 18.08.2008 um 16:30 schrieb Matthew Wakeling: > On Mon, 18 Aug 2008, Moritz Onken wrote: >> I have indexes on result.domain, domain_categories.domain, >> result.user, domain_categories.category. Clustered result on user >> and domain_categories on domain. > >> " -> Materialize (cost=2118752.28..2270064.64 >> rows=12104989 width=8) (actual time=46460.599..82336.116 >> rows=12123161 loops=1)" >> " -> Sort (cost=2118752.28..2149014.75 rows=12104989 >> width=8) (actual time=46460.592..59595.851 rows=12104989 loops=1)" >> " Sort Key: b.domain" >> " Sort Method: external sort Disk: 283992kB" >> " -> Seq Scan on domain_categories b >> (cost=0.00..198151.89 rows=12104989 width=8) (actual >> time=14.352..22572.869 rows=12104989 loops=1)" > > This is weird, given you say you have clustered domain_categories on > domain. Have you analysed? You should be able to run: > > EXPLAIN SELECT * from domain_categories ORDER BY domain > > and have it say "Index scan" instead of "Seq Scan followed by disc > sort)". > > Matthew > Thanks, the index was created but I forgot to run analyze again on that table. I had a little mistake in my previous sql query. The corrected version is this: explain analyze select a."user", b.category, count(1) from result a, domain_categories b where a."domain" = b."domain" and a."user" = 1337 group by a."user", b.category; (notice the additional group by column). explain analyze: "HashAggregate (cost=817397.78..817428.92 rows=2491 width=8) (actual time=42874.339..42878.419 rows=3361 loops=1)" " -> Merge Join (cost=748.47..674365.50 rows=19070970 width=8) (actual time=15702.449..42829.388 rows=36308 loops=1)" " Merge Cond: (b.domain = a.domain)" " -> Index Scan using domain_categories_domain on domain_categories b (cost=0.00..391453.79 rows=12105014 width=8) (actual time=39.018..30166.349 rows=12104989 loops=1)" " -> Sort (cost=395.52..405.49 rows=3985 width=8) (actual time=0.188..32.345 rows=36309 loops=1)" " Sort Key: a.domain" " Sort Method: quicksort Memory: 27kB" " -> Index Scan using result_user_idx on result a (cost=0.00..157.21 rows=3985 width=8) (actual time=0.021..0.101 rows=61 loops=1)" " Index Cond: ("user" = 1337)" "Total runtime: 42881.382 ms" This is still very slow...
В списке pgsql-performance по дате отправления: