query faster using LEFT OUTER join?

Поиск
Список
Период
Сортировка
От Drew
Тема query faster using LEFT OUTER join?
Дата
Msg-id B8825A6D-229D-4130-9091-7386F2B383DD@fastmail.fm
обсуждение исходный текст
Ответы Re: query faster using LEFT OUTER join?
Список pgsql-sql
I have a planner question about a multi-join query, where if I  
rewrite the query excluding the table that is left joined to the  
other tables my query speed increases by 75%.

I've tested this query on both 8.0.9 and 8.2b3, w/ similar results  
(except 8.2b3 is 50% faster!)

I'm wondering why the first query is planned a faster way and if  
there is anything I could do to my 2nd query faster (which should be  
faster because it doesn't have to scan the first table).


Here's my initial query that runs in 22.961ms:
SELECT    train_family_id, object_type,  COUNT(*) AS frequency
FROM    translation_pair tp    LEFT OUTER JOIN instance i USING(translation_pair_id)    JOIN context c
USING(context_id)   JOIN loc_submission ls USING(loc_submission_id)    JOIN loc_submission_train_info lsti
USING(loc_submission_id)
WHERE    translation_pair_id IN (640352, 6144, 1023028, 18155, 240244,  
50157)
GROUP BY    train_family_id,    object_type;


If I just remove the translation_pair table and join directly against  
the instance table, the query speed drops 50% to 36.6ms
SELECT    train_family_id, object_type, COUNT(*) AS frequency
FROM    instance i    JOIN context c USING(context_id)    JOIN loc_submission ls USING(loc_submission_id)    JOIN
loc_submission_train_infolsti USING(loc_submission_id)
 
WHERE    translation_pair_id IN (640352, 6144, 1023028, 18155, 240244,  
50157)
GROUP BY    train_family_id,    object_type;


Looking at the query plans for the 2 of them, it looks like the inner  
"Bitmap Index Scan on instance_translation_pair_id" is much slower  
when using "IN (x,y,z)" instead of comparing against table  
(tp.translation_pair_id = i.translation_pair_id.

Is there anything else I should notice? Can I force the planner to  
use the translation_pair table first?


Here's the plan for the first query (using 8.2b3)
------------------------------------------------------------------------ 
------------------------------------------------------------------------ 
-----------------------------------
HashAggregate  (cost=3223.17..3223.41 rows=19 width=18) (actual  
time=34.913..34.978 rows=122 loops=1)   ->  Nested Loop Left Join  (cost=29.53..3223.03 rows=19 width=18)  
(actual time=0.606..33.994 rows=770 loops=1)         ->  Nested Loop Left Join  (cost=29.53..3108.80 rows=19  
width=22) (actual time=0.552..26.918 rows=770 loops=1)               ->  Nested Loop Left Join  (cost=29.53..2995.26  
rows=19 width=18) (actual time=0.497..19.924 rows=768 loops=1)                     ->  Nested Loop Left Join
(cost=29.53..2879.59 
 
rows=19 width=8) (actual time=0.431..6.395 rows=768 loops=1)                           ->  Bitmap Heap Scan on
translation_pair 
 
tp  (cost=24.27..48.10 rows=6 width=4) (actual time=0.304..0.376  
rows=6 loops=1)                                 Recheck Cond: (translation_pair_id  
= ANY ('{640352,6144,1023028,18155,240244,50157}'::oid[]))                                 ->  Bitmap Index Scan on  
translation_pair_pkey  (cost=0.00..24.27 rows=6 width=0) (actual  
time=0.254..0.254 rows=6 loops=1)                                       Index Cond:  
(translation_pair_id = ANY  
('{640352,6144,1023028,18155,240244,50157}'::oid[]))                           ->  Bitmap Heap Scan on instance i   
(cost=5.26..470.38 rows=123 width=12) (actual time=0.119..0.922  
rows=128 loops=6)                                 Recheck Cond:  
(tp.translation_pair_id = i.translation_pair_id)                                 ->  Bitmap Index Scan on  
instance_translation_pair_id  (cost=0.00..5.26 rows=123 width=0)  
(actual time=0.098..0.098 rows=128 loops=6)                                       Index Cond:  
(tp.translation_pair_id = i.translation_pair_id)                     ->  Index Scan using context_pkey on context c   
(cost=0.00..6.07 rows=1 width=18) (actual time=0.015..0.016 rows=1  
loops=768)                           Index Cond: (i.context_id = c.context_id)               ->  Index Scan using
loc_submission_train_info_pkey 
 
on loc_submission_train_info lsti  (cost=0.00..5.96 rows=1 width=8)  
(actual time=0.007..0.008 rows=1 loops=768)                     Index Cond: (i.loc_submission_id =  
lsti.loc_submission_id)         ->  Index Scan using loc_submission_pkey on loc_submission  
ls  (cost=0.00..6.00 rows=1 width=4) (actual time=0.007..0.008 rows=1  
loops=770)               Index Cond: (i.loc_submission_id = ls.loc_submission_id)


And here's the plan for the second query without the first LEFT OUTER  
join (using 8.2b3)
------------------------------------------------------------------------ 
------------------------------------------------------------------------ 
------------------------
HashAggregate  (cost=8927.87..8936.89 rows=721 width=18) (actual  
time=39.484..39.551 rows=122 loops=1)   ->  Nested Loop Left Join  (cost=3706.71..8922.47 rows=721  
width=18) (actual time=25.246..38.654 rows=770 loops=1)         ->  Hash Left Join  (cost=3706.71..4000.93 rows=721  
width=8) (actual time=25.215..31.169 rows=770 loops=1)               Hash Cond: (i.loc_submission_id =
ls.loc_submission_id)              ->  Merge Left Join  (cost=3420.14..3481.22 rows=721  
 
width=12) (actual time=15.088..20.018 rows=770 loops=1)                     Merge Cond: (i.loc_submission_id =  
lsti.loc_submission_id)                     ->  Sort  (cost=2576.82..2578.62 rows=720  
width=8) (actual time=2.287..2.438 rows=768 loops=1)                           Sort Key: i.loc_submission_id
              ->  Bitmap Heap Scan on instance i   
 
(cost=25.24..2542.65 rows=720 width=8) (actual time=0.381..1.623  
rows=768 loops=1)                                 Recheck Cond: (translation_pair_id  
= ANY ('{640352,6144,1023028,18155,240244,50157}'::oid[]))                                 ->  Bitmap Index Scan on  
instance_translation_pair_id  (cost=0.00..25.24 rows=720 width=0)  
(actual time=0.293..0.293 rows=768 loops=1)                                       Index Cond:  
(translation_pair_id = ANY  
('{640352,6144,1023028,18155,240244,50157}'::oid[]))                     ->  Sort  (cost=843.32..868.47 rows=10059  
width=8) (actual time=12.782..14.312 rows=10530 loops=1)                           Sort Key: lsti.loc_submission_id
                     ->  Seq Scan on loc_submission_train_info  
 
lsti  (cost=0.00..174.59 rows=10059 width=8) (actual  
time=0.012..3.708 rows=10059 loops=1)               ->  Hash  (cost=261.46..261.46 rows=10046 width=4)  
(actual time=10.061..10.061 rows=10046 loops=1)                     ->  Seq Scan on loc_submission ls   
(cost=0.00..261.46 rows=10046 width=4) (actual time=0.013..5.235  
rows=10046 loops=1)         ->  Index Scan using context_pkey on context c   
(cost=0.00..6.81 rows=1 width=18) (actual time=0.008..0.009 rows=1  
loops=770)               Index Cond: (i.context_id = c.context_id)


Thanks for your help,

Drew


В списке pgsql-sql по дате отправления:

Предыдущее
От: "lms"
Дата:
Сообщение: How convert UNICODE
Следующее
От: Drew
Дата:
Сообщение: Re: How convert UNICODE