Re: Slow query with a lot of data

Поиск
Список
Период
Сортировка
От Moritz Onken
Тема Re: Slow query with a lot of data
Дата
Msg-id 8EF0BB61-D8FE-47E8-97DD-1009F7EB5431@houseofdesign.de
обсуждение исходный текст
Ответ на Slow query with a lot of data  (Moritz Onken <onken@houseofdesign.de>)
Ответы Re: Slow query with a lot of data  (Matthew Wakeling <matthew@flymine.org>)
Список pgsql-performance
Am 18.08.2008 um 16:30 schrieb Matthew Wakeling:

> On Mon, 18 Aug 2008, Moritz Onken wrote:
>> I have indexes on result.domain, domain_categories.domain,
>> result.user, domain_categories.category. Clustered result on user
>> and domain_categories on domain.
>
>> "        ->  Materialize  (cost=2118752.28..2270064.64
>> rows=12104989 width=8) (actual time=46460.599..82336.116
>> rows=12123161 loops=1)"
>> "              ->  Sort  (cost=2118752.28..2149014.75 rows=12104989
>> width=8) (actual time=46460.592..59595.851 rows=12104989 loops=1)"
>> "                    Sort Key: b.domain"
>> "                    Sort Method:  external sort  Disk: 283992kB"
>> "                    ->  Seq Scan on domain_categories b
>> (cost=0.00..198151.89 rows=12104989 width=8) (actual
>> time=14.352..22572.869 rows=12104989 loops=1)"
>
> This is weird, given you say you have clustered domain_categories on
> domain. Have you analysed? You should be able to run:
>
> EXPLAIN SELECT * from domain_categories ORDER BY domain
>
> and have it say "Index scan" instead of "Seq Scan followed by disc
> sort)".
>
> Matthew
>

Thanks, the index was created but I forgot to run analyze again on
that table.

I had a little mistake in my previous sql query. The corrected version
is this:
explain analyze select a."user", b.category, count(1) from result a,
domain_categories b where a."domain" = b."domain" and a."user" = 1337
group by a."user", b.category;

(notice the additional group by column).

explain analyze:


"HashAggregate  (cost=817397.78..817428.92 rows=2491 width=8) (actual
time=42874.339..42878.419 rows=3361 loops=1)"
"  ->  Merge Join  (cost=748.47..674365.50 rows=19070970 width=8)
(actual time=15702.449..42829.388 rows=36308 loops=1)"
"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on
domain_categories b  (cost=0.00..391453.79 rows=12105014 width=8)
(actual time=39.018..30166.349 rows=12104989 loops=1)"
"        ->  Sort  (cost=395.52..405.49 rows=3985 width=8) (actual
time=0.188..32.345 rows=36309 loops=1)"
"              Sort Key: a.domain"
"              Sort Method:  quicksort  Memory: 27kB"
"              ->  Index Scan using result_user_idx on result a
(cost=0.00..157.21 rows=3985 width=8) (actual time=0.021..0.101
rows=61 loops=1)"
"                    Index Cond: ("user" = 1337)"
"Total runtime: 42881.382 ms"

This is still very slow...



В списке pgsql-performance по дате отправления:

Предыдущее
От: Matthew Wakeling
Дата:
Сообщение: Re: Slow query with a lot of data
Следующее
От: Matthew Wakeling
Дата:
Сообщение: Re: Slow query with a lot of data