Encouraging multi-table join order

Поиск
Список
Период
Сортировка
От Dan Harris
Тема Encouraging multi-table join order
Дата
Msg-id 443AAC26.4070802@drivefaster.net
обсуждение исходный текст
Ответы Re: Encouraging multi-table join order
Список pgsql-performance
I have a query that is intended to select from multiple "small tables"
to get a limited subset of "incidentid" and then join with a "very
large" table.  One of the operations will require a sequential scan, but
the planner is doing the scan on the very large table before joining the
small ones, resulting in a huge amount of disk I/O.  How would I make
this query join the large table only after narrowing down the possible
selections from the smaller tables?  This is running on version 8.0.3.

Thanks for any ideas.

-Dan


QUERY
########################################
 explain analyze
        select distinct
                 eventmain.incidentid,
                eventmain.entrydate,
                eventgeo.eventlocation,
                 recordtext as retdata
         from
                 eventactivity
         join (
                 select
                         incidentid
                  from
                         k_h
                 where
                         id = 33396 and
                         k_h.entrydate >= '2006-1-1 00:00' and
                         k_h.entrydate < '2006-4-8 00:00'
                 ) id_keywords using ( incidentid ) ,

                 eventmain,
                 eventgeo
      where
                 eventmain.incidentid = eventactivity.incidentid and
                 eventmain.incidentid = eventgeo.incidentid and
                 ( ' ' || recordtext || ' ' like '%HAL%' ) and
                 eventactivity.entrydate >= '2006-1-1 00:00' and
                 eventactivity.entrydate < '2006-4-8 00:00'
         order by
                 eventmain.entrydate limit 10000;



EXPLAIN ANALYZE OUTPUT
########################################
 Limit  (cost=2521191.65..2521191.90 rows=6 width=187) (actual
time=1360935.787..1361072.277 rows=1400 loops=1)
   ->  Unique  (cost=2521191.65..2521191.90 rows=6 width=187) (actual
time=1360935.779..1361067.853 rows=1400 loops=1)
         ->  Sort  (cost=2521191.65..2521191.66 rows=6 width=187)
(actual time=1360935.765..1360958.258 rows=16211 loops=1)
               Sort Key: eventmain.entrydate, eventmain.incidentid,
eventactivity.recordtext, eventgeo.eventlocation
               ->  Nested Loop  (cost=219.39..2521191.57 rows=6
width=187) (actual time=1123.115..1360579.798 rows=16211 loops=1)
                     ->  Nested Loop  (cost=219.39..2521173.23 rows=6
width=154) (actual time=1105.773..1325907.716 rows=16211 loops=1)
                           ->  Hash Join  (cost=219.39..2521153.37
rows=6 width=66) (actual time=1069.476..1289608.261 rows=16211 loops=1)
                                 Hash Cond: (("outer".incidentid)::text
= ("inner".incidentid)::text)
                                 ->  Seq Scan on eventactivity
(cost=0.00..2518092.06 rows=1532 width=52) (actual
time=57.205..1288514.530 rows=2621 loops=1)
                                       Filter: ((((' '::text ||
(recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND (entrydate >=
'2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate <
'2006-04-08 00:00:00'::timestamp without time zone))
                                 ->  Hash  (cost=217.53..217.53 rows=741
width=14) (actual time=899.128..899.128 rows=0 loops=1)
                                       ->  Index Scan using k_h_id_idx
on k_h  (cost=0.00..217.53 rows=741 width=14) (actual
time=55.097..893.883 rows=1162 loops=1)
                                             Index Cond: (id = 33396)
                                             Filter: ((entrydate >=
'2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate <
'2006-04-08 00:00:00'::timestamp without time zone))
                           ->  Index Scan using eventmain_incidentid_idx
on eventmain  (cost=0.00..3.30 rows=1 width=88) (actual
time=1.866..2.227 rows=1 loops=16211)
                                 Index Cond:
((eventmain.incidentid)::text = ("outer".incidentid)::text)
                     ->  Index Scan using eventgeo_incidentid_idx on
eventgeo  (cost=0.00..3.04 rows=1 width=75) (actual time=1.770..2.126
rows=1 loops=16211)
                           Index Cond: ((eventgeo.incidentid)::text =
("outer".incidentid)::text)
 Total runtime: 1361080.787 ms
(19 rows)


В списке pgsql-performance по дате отправления:

Предыдущее
От: Vivek Khera
Дата:
Сообщение: Re: Restore performance?
Следующее
От: Brad Nicholson
Дата:
Сообщение: Re: pg 8.1.3, AIX, huge box, painfully slow.