select count(idn) is slow (Seq Scan) instead of Bitmap Heap.. why?

Поиск
Список
Период
Сортировка
От Adam PAPAI
Тема select count(idn) is slow (Seq Scan) instead of Bitmap Heap.. why?
Дата
Msg-id 4A75BEB5.2070002@wooh.hu
обсуждение исходный текст
Ответы Re: select count(idn) is slow (Seq Scan) instead of Bitmap Heap.. why?  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-performance
Hello,


I have a problem with an inner join + count().

my query is:

explain analyze select
k.idn,k.kerdes_subject,k.kerdes_text,u.vezeteknev,u.keresztnev,u.idn as
user_id, kg.kategoria_neve, count(v.idn)

FROM kategoriak as kg

INNER JOIN kerdesek as k on kg.idn = k.kategoria_id
INNER JOIN users as u ON k.user_id = u.idn
INNER JOIN valaszok as v ON k.idn = v.kerdes_id

where kg.idn=15 group by k.idn, k.kerdes_subject,k.kerdes_text,
u.idn,u.vezeteknev,u.keresztnev,kg.kategoria_neve

The problem is with the count(v.idn).

This column has a relation with: v.kerdes_id = k.idn => k.kategoria_id =
  kg.idn

and the WHERE says: kg.idn = 15.

Why does it run through all lines in v?

the explain sais:

  GroupAggregate  (cost=103238.59..103602.66 rows=10402 width=1382)
(actual time=8531.405..8536.633 rows=73 loops=1)
    ->  Sort  (cost=103238.59..103264.59 rows=10402 width=1382) (actual
time=8531.339..8533.199 rows=1203 loops=1)
          Sort Key: k.idn, k.kerdes_subject, k.kerdes_text, u.idn,
u.vezeteknev, u.keresztnev, kg.kategoria_neve
          ->  Hash Join  (cost=3827.79..89951.54 rows=10402 width=1382)
(actual time=1778.590..8523.015 rows=1203 loops=1)
                Hash Cond: (v.kerdes_id = k.idn)
                ->  Seq Scan on valaszok v  (cost=0.00..78215.98
rows=2080998 width=8) (actual time=59.714..5009.171 rows=2080998 loops=1)
                ->  Hash  (cost=3823.42..3823.42 rows=350 width=1378)
(actual time=12.553..12.553 rows=74 loops=1)
                      ->  Nested Loop  (cost=14.98..3823.42 rows=350
width=1378) (actual time=0.714..12.253 rows=74 loops=1)
                            ->  Nested Loop  (cost=14.98..1056.38
rows=350 width=830) (actual time=0.498..5.952 rows=117 loops=1)
                                  ->  Seq Scan on kategoriak kg
(cost=0.00..1.30 rows=1 width=278) (actual time=0.066..0.076 rows=1 loops=1)
                                        Filter: (idn = 15)
                                  ->  Bitmap Heap Scan on kerdesek k
(cost=14.98..1051.58 rows=350 width=560) (actual time=0.374..5.430
rows=117 loops=1)
                                        Recheck Cond: (15 = kategoria_id)
                                        ->  Bitmap Index Scan on
kategoria_id_id_idx  (cost=0.00..14.89 rows=350 width=0) (actual
time=0.212..0.212 rows=117 loops=1)
                                              Index Cond: (15 =
kategoria_id)
                            ->  Index Scan using users_pkey on users u
(cost=0.00..7.89 rows=1 width=552) (actual time=0.047..0.048 rows=1
loops=117)
                                  Index Cond: (k.user_id = u.idn)
  Total runtime: 8536.936 ms



So it run through more than 2 mill lines... but why? It should only
count those lines which has the category_id = 15...

What am I doing wrong?



--
Adam PAPAI

В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Performance 8.4.0
Следующее
От: Suvankar Roy
Дата:
Сообщение: Greenplum MapReduce