Apparently useless bitmap scans

Поиск
Список
Период
Сортировка
От Peter Eisentraut
Тема Apparently useless bitmap scans
Дата
Msg-id 200705091410.57312.peter_e@gmx.net
обсуждение исходный текст
Ответы Re: Apparently useless bitmap scans  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: Apparently useless bitmap scans  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
There's another odd thing about this plan from yesterday.

Query:

        SELECT
                eh_subj.header_body AS subject,
                count(distinct eh_from.header_body)
        FROM
                email JOIN mime_part USING (email_id)
                JOIN email_header eh_subj USING (email_id, mime_part_id)
                JOIN email_header eh_from USING (email_id, mime_part_id)
        WHERE
                eh_subj.header_name = 'subject'
                AND eh_from.header_name = 'from'
                AND mime_part_id = 0
                AND (time >= timestamp '2007-05-05 17:01:59' AND time < timestamp '2007-05-05 17:01:59' + interval '60
min')
        GROUP BY
                eh_subj.header_body;

Plan:

                                                                                        QUERY PLAN
                                                                   

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1920309.81..1920534.21 rows=11220 width=184) (actual time=5349.493..5587.536 rows=35000 loops=1)
   ->  Sort  (cost=1920309.81..1920337.86 rows=11220 width=184) (actual time=5349.427..5392.110 rows=35000 loops=1)
         Sort Key: eh_subj.header_body
         ->  Nested Loop  (cost=15576.58..1919555.05 rows=11220 width=184) (actual time=537.938..5094.377 rows=35000
loops=1)
               ->  Nested Loop  (cost=15576.58..475387.23 rows=11020 width=120) (actual time=537.858..4404.330
rows=35000loops=1) 
                     ->  Nested Loop  (cost=15576.58..430265.44 rows=11092 width=112) (actual time=537.768..4024.184
rows=35000loops=1) 
                           ->  Bitmap Heap Scan on email_header eh_from  (cost=15576.58..16041.55 rows=107156
width=104)(actual time=537.621..1801.032 rows=280990 loops=1) 
                                 Recheck Cond: ((mime_part_id = 0) AND (header_name = 'from'::text))
                                 ->  BitmapAnd  (cost=15576.58..15576.58 rows=160 width=0) (actual
time=500.006..500.006rows=0 loops=1) 
                                       ->  Bitmap Index Scan on dummy_index  (cost=0.00..3724.22 rows=107156 width=0)
(actualtime=85.025..85.025 rows=280990 loops=1) 
                                       ->  Bitmap Index Scan on idx__email_header__from_local  (cost=0.00..5779.24
rows=107156width=0) (actual time=173.006..173.006 rows=280990 loops=1) 
                                       ->  Bitmap Index Scan on dummy2_index  (cost=0.00..5992.25 rows=107156 width=0)
(actualtime=174.463..174.463 rows=280990 loops=1) 
                           ->  Index Scan using email_pkey on email  (cost=0.00..3.85 rows=1 width=8) (actual
time=0.005..0.005rows=0 loops=280990) 
                                 Index Cond: (email.email_id = eh_from.email_id)
                                 Filter: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" <
'2007-05-0518:01:59'::timestamp without time zone)) 
                     ->  Index Scan using mime_part_pkey on mime_part  (cost=0.00..4.06 rows=1 width=12) (actual
time=0.005..0.006rows=1 loops=35000) 
                           Index Cond: ((email.email_id = mime_part.email_id) AND (mime_part.mime_part_id = 0))
               ->  Index Scan using idx__email_header__email_id__mime_part_id on email_header eh_subj
(cost=0.00..130.89rows=13 width=104) (actual time=0.009..0.015 rows=1 loops=35000) 
                     Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = eh_subj.mime_part_id))
                     Filter: (header_name = 'subject'::text)
 Total runtime: 5625.024 ms


I'm wondering what it wants to achieve with these three index scans:

    ->  Bitmap Index Scan on dummy_index  (cost=0.00..3724.22 rows=107156 width=0) (actual time=85.025..85.025
rows=280990loops=1) 
    ->  Bitmap Index Scan on idx__email_header__from_local  (cost=0.00..5779.24 rows=107156 width=0) (actual
time=173.006..173.006rows=280990 loops=1) 
    ->  Bitmap Index Scan on dummy2_index  (cost=0.00..5992.25 rows=107156 width=0) (actual time=174.463..174.463
rows=280990loops=1) 

The indexes in question are:

CREATE INDEX dummy_index ON email_header ((555)) WHERE mime_part_id = 0 AND header_name = 'from';
CREATE INDEX dummy2_index ON email_header (substr(header_body,5)) WHERE mime_part_id = 0 AND header_name = 'from';
CREATE INDEX idx__email_header__from_local ON email_header (get_localpart(header_body)) WHERE mime_part_id = 0 AND
header_name= 'from'; 

It appears to want to use these indexes to get the restriction

                AND eh_from.header_name = 'from'
                AND mime_part_id = 0

from the query, but why does it need three of them to do it, when all
of them have the same predicate and none of them has an indexed
expression that appears in the query?

There are more partial indexes with the same predicate, but it appears
to always use three.  (The two "dummy" indexes are just leftovers from
these experiments.)

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

В списке pgsql-performance по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Nested loops overpriced
Следующее
От: "Valentine Gogichashvili"
Дата:
Сообщение: Cannot make GIN intarray index be used by the planner