Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?

Поиск
Список
Период
Сортировка
От Bryce Nesbitt
Тема Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Дата
Msg-id 4B765E36.20002@obviously.com
обсуждение исходный текст
Ответ на Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
So how about the removal of the "AND" clause? On a test server, this
drops the query from 201204 to 438 ms.
Is this just random, or is it a real solution that might apply to any
arbitrary combination of words?

Attached are three test runs:
Total runtime: 201204.972 ms
Total runtime: 437.766 ms
Total runtime: 341.727 ms

preproduction-20091214=# ALTER TABLE article_words ALTER COLUMN word_key SET STATISTICS 50;
ALTER TABLE
preproduction-20091214=# ANALYZE VERBOSE article_words;
INFO:  analyzing "public.article_words"
INFO:  "article_words": scanned 75000 of 1342374 pages, containing 16124750 live rows and 825250 dead rows; 75000 rows
insample, 288605935 estimated total rows 
ANALYZE


preproduction-20091214=# EXPLAIN ANALYZE
SELECT contexts.context_key FROM contexts
JOIN articles ON (articles.context_key=contexts.context_key)
JOIN matview_82034 ON (contexts.context_key=matview_82034.context_key)
WHERE contexts.context_key IN
    (SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider'
     INTERSECT
     SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'trading')
AND contexts.context_key IN
         (SELECT a.context_key FROM virtual_ancestors a JOIN bp_categories ON (a.ancestor_key =
bp_categories.context_key)
         WHERE lower(bp_categories.category) = 'law') AND articles.indexed;
                                                                                              QUERY PLAN
                                               

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=12665.41..61953.71 rows=2 width=4) (actual time=140.799..201204.025 rows=546 loops=1)
   ->  Nested Loop IN Join  (cost=12665.41..61951.82 rows=1 width=16) (actual time=140.786..201196.526 rows=546
loops=1)
         Join Filter: (a.context_key = articles.context_key)
         ->  Nested Loop  (cost=12665.41..12678.98 rows=2 width=12) (actual time=55.453..116.790 rows=1306 loops=1)
               ->  Nested Loop  (cost=12665.41..12675.18 rows=2 width=8) (actual time=55.439..98.132 rows=1306 loops=1)
                     ->  Subquery Scan "IN_subquery"  (cost=12665.41..12665.68 rows=5 width=4) (actual
time=55.415..76.025rows=1473 loops=1) 
                           ->  SetOp Intersect  (cost=12665.41..12665.64 rows=5 width=4) (actual time=55.413..73.885
rows=1473loops=1) 
                                 ->  Sort  (cost=12665.41..12665.52 rows=46 width=4) (actual time=55.406..62.222
rows=17892loops=1) 
                                       Sort Key: "*SELECT* 1".context_key
                                       Sort Method:  quicksort  Memory: 1607kB
                                       ->  Append  (cost=0.00..12664.14 rows=46 width=4) (actual time=0.060..42.065
rows=17892loops=1) 
                                             ->  Subquery Scan "*SELECT* 1"  (cost=0.00..6332.07 rows=23 width=4)
(actualtime=0.059..6.962 rows=3583 loops=1) 
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual
time=0.058..5.148rows=3583 loops=1) 
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22
rows=1width=4) (actual time=0.018..0.019 rows=1 loops=1) 
                                                               Index Cond: ((word)::text = 'insider'::text)
                                                         ->  Index Scan using article_words_wc on article_words
(cost=0.00..6227.18rows=8195 width=8) (actual time=0.036..3.275 rows=3583 loops=1) 
                                                               Index Cond: (public.article_words.word_key =
public.words.word_key)
                                             ->  Subquery Scan "*SELECT* 2"  (cost=0.00..6332.07 rows=23 width=4)
(actualtime=0.037..27.136 rows=14309 loops=1) 
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual
time=0.035..19.912rows=14309 loops=1) 
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22
rows=1width=4) (actual time=0.015..0.017 rows=1 loops=1) 
                                                               Index Cond: ((word)::text = 'trading'::text)
                                                         ->  Index Scan using article_words_wc on article_words
(cost=0.00..6227.18rows=8195 width=8) (actual time=0.018..12.464 rows=14309 loops=1) 
                                                               Index Cond: (public.article_words.word_key =
public.words.word_key)
                     ->  Index Scan using article_key_idx on articles  (cost=0.00..1.89 rows=1 width=4) (actual
time=0.011..0.012rows=1 loops=1473) 
                           Index Cond: (articles.context_key = "IN_subquery".context_key)
                           Filter: articles.indexed
               ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4) (actual
time=0.010..0.011rows=1 loops=1306) 
                     Index Cond: (contexts.context_key = articles.context_key)
         ->  Nested Loop  (cost=0.00..59848.83 rows=1124834 width=4) (actual time=0.012..142.771 rows=39193 loops=1306)
               ->  Seq Scan on bp_categories  (cost=0.00..1231.49 rows=16669 width=4) (actual time=0.006..19.719
rows=14552loops=1306) 
                     Filter: (lower(category) = 'law'::text)
               ->  Index Scan using virtual_ancestor_key_idx on virtual_ancestors a  (cost=0.00..2.57 rows=76 width=8)
(actualtime=0.004..0.006 rows=3 loops=19004304) 
                     Index Cond: (a.ancestor_key = bp_categories.context_key)
   ->  Index Scan using matview_82034_ck on matview_82034  (cost=0.00..1.88 rows=1 width=4) (actual time=0.010..0.011
rows=1loops=546) 
         Index Cond: (matview_82034.context_key = articles.context_key)
 Total runtime: 201204.972 ms
(36 rows)





preproduction-20091214=# EXPLAIN ANALYZE
SELECT contexts.context_key FROM contexts
JOIN articles ON (articles.context_key=contexts.context_key)
JOIN matview_82034 ON (contexts.context_key=matview_82034.context_key)
WHERE contexts.context_key IN
    (SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider'
     INTERSECT
     SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'trading')
AND contexts.context_key IN
         (SELECT a.context_key FROM virtual_ancestors a JOIN bp_categories ON (a.ancestor_key =
bp_categories.context_key)
         WHERE lower(bp_categories.category) = 'law');

                                                                                              QUERY PLAN
                                               

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=90968.66..95382.31 rows=5 width=4) (actual time=284.789..437.241 rows=648 loops=1)
   ->  Nested Loop  (cost=90968.66..95378.53 rows=2 width=16) (actual time=284.778..338.030 rows=649 loops=1)
         ->  Hash IN Join  (cost=90968.66..95374.73 rows=2 width=12) (actual time=284.764..333.145 rows=649 loops=1)
               Hash Cond: (contexts.context_key = a.context_key)
               ->  Nested Loop  (cost=12665.41..12675.20 rows=5 width=8) (actual time=55.557..81.206 rows=1473 loops=1)
                     ->  Subquery Scan "IN_subquery"  (cost=12665.41..12665.68 rows=5 width=4) (actual
time=55.537..69.892rows=1473 loops=1) 
                           ->  SetOp Intersect  (cost=12665.41..12665.64 rows=5 width=4) (actual time=55.535..69.075
rows=1473loops=1) 
                                 ->  Sort  (cost=12665.41..12665.52 rows=46 width=4) (actual time=55.528..60.439
rows=17892loops=1) 
                                       Sort Key: "*SELECT* 1".context_key
                                       Sort Method:  quicksort  Memory: 1607kB
                                       ->  Append  (cost=0.00..12664.14 rows=46 width=4) (actual time=0.040..42.014
rows=17892loops=1) 
                                             ->  Subquery Scan "*SELECT* 1"  (cost=0.00..6332.07 rows=23 width=4)
(actualtime=0.038..6.953 rows=3583 loops=1) 
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual
time=0.038..5.129rows=3583 loops=1) 
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22
rows=1width=4) (actual time=0.019..0.020 rows=1 loops=1) 
                                                               Index Cond: ((word)::text = 'insider'::text)
                                                         ->  Index Scan using article_words_wc on article_words
(cost=0.00..6227.18rows=8195 width=8) (actual time=0.015..3.292 rows=3583 loops=1) 
                                                               Index Cond: (public.article_words.word_key =
public.words.word_key)
                                             ->  Subquery Scan "*SELECT* 2"  (cost=0.00..6332.07 rows=23 width=4)
(actualtime=0.037..27.131 rows=14309 loops=1) 
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual
time=0.036..19.717rows=14309 loops=1) 
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22
rows=1width=4) (actual time=0.016..0.018 rows=1 loops=1) 
                                                               Index Cond: ((word)::text = 'trading'::text)
                                                         ->  Index Scan using article_words_wc on article_words
(cost=0.00..6227.18rows=8195 width=8) (actual time=0.017..12.360 rows=14309 loops=1) 
                                                               Index Cond: (public.article_words.word_key =
public.words.word_key)
                     ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4) (actual
time=0.006..0.006rows=1 loops=1473) 
                           Index Cond: (contexts.context_key = "IN_subquery".context_key)
               ->  Hash  (cost=59848.83..59848.83 rows=1124834 width=4) (actual time=220.171..220.171 rows=48207
loops=1)
                     ->  Nested Loop  (cost=0.00..59848.83 rows=1124834 width=4) (actual time=0.029..175.278 rows=48207
loops=1)
                           ->  Seq Scan on bp_categories  (cost=0.00..1231.49 rows=16669 width=4) (actual
time=0.015..23.290rows=16669 loops=1) 
                                 Filter: (lower(category) = 'law'::text)
                           ->  Index Scan using virtual_ancestor_key_idx on virtual_ancestors a  (cost=0.00..2.57
rows=76width=8) (actual time=0.004..0.007 rows=3 loops=16669) 
                                 Index Cond: (a.ancestor_key = bp_categories.context_key)
         ->  Index Scan using article_key_idx on articles  (cost=0.00..1.89 rows=1 width=4) (actual time=0.006..0.006
rows=1loops=649) 
               Index Cond: (articles.context_key = contexts.context_key)
   ->  Index Scan using matview_82034_ck on matview_82034  (cost=0.00..1.88 rows=1 width=4) (actual time=0.151..0.152
rows=1loops=649) 
         Index Cond: (matview_82034.context_key = articles.context_key)
 Total runtime: 437.766 ms
(36 rows)


preproduction-20091214=# EXPLAIN ANALYZE
SELECT contexts.context_key FROM articles JOIN contexts using (context_key)
WHERE articles.context_key IN
(
SELECT contexts.context_key FROM contexts
JOIN matview_82034 ON (contexts.context_key=matview_82034.context_key)
WHERE contexts.context_key IN
    (SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider'
     INTERSECT
     SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'trading')
AND contexts.context_key IN
    (SELECT a.context_key FROM virtual_ancestors a JOIN bp_categories ON (a.ancestor_key = bp_categories.context_key)
     WHERE lower(bp_categories.category) = 'law')
)
AND articles.indexed;
                                                                                                    QUERY PLAN
                                               

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=95378.52..95391.87 rows=2 width=4) (actual time=332.545..341.194 rows=546 loops=1)
   ->  Nested Loop  (cost=95378.52..95388.07 rows=2 width=20) (actual time=332.531..337.526 rows=546 loops=1)
         ->  HashAggregate  (cost=95378.52..95378.57 rows=5 width=16) (actual time=332.514..332.830 rows=648 loops=1)
               ->  Nested Loop  (cost=90968.66..95378.51 rows=5 width=16) (actual time=285.869..331.943 rows=648
loops=1)
                     ->  Hash IN Join  (cost=90968.66..95374.73 rows=2 width=12) (actual time=285.858..327.625 rows=649
loops=1)
                           Hash Cond: (public.contexts.context_key = a.context_key)
                           ->  Nested Loop  (cost=12665.41..12675.20 rows=5 width=8) (actual time=55.625..79.734
rows=1473loops=1) 
                                 ->  Subquery Scan "IN_subquery"  (cost=12665.41..12665.68 rows=5 width=4) (actual
time=55.602..69.942rows=1473 loops=1) 
                                       ->  SetOp Intersect  (cost=12665.41..12665.64 rows=5 width=4) (actual
time=55.601..69.115rows=1473 loops=1) 
                                             ->  Sort  (cost=12665.41..12665.52 rows=46 width=4) (actual
time=55.593..60.422rows=17892 loops=1) 
                                                   Sort Key: "*SELECT* 1".context_key
                                                   Sort Method:  quicksort  Memory: 1607kB
                                                   ->  Append  (cost=0.00..12664.14 rows=46 width=4) (actual
time=0.039..41.913rows=17892 loops=1) 
                                                         ->  Subquery Scan "*SELECT* 1"  (cost=0.00..6332.07 rows=23
width=4)(actual time=0.039..6.911 rows=3583 loops=1) 
                                                               ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4)
(actualtime=0.037..5.043 rows=3583 loops=1) 
                                                                     ->  Index Scan using words_word on words
(cost=0.00..2.22rows=1 width=4) (actual time=0.018..0.019 rows=1 loops=1) 
                                                                           Index Cond: ((word)::text = 'insider'::text)
                                                                     ->  Index Scan using article_words_wc on
article_words (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.015..3.202 rows=3583 loops=1) 
                                                                           Index Cond: (public.article_words.word_key =
public.words.word_key)
                                                         ->  Subquery Scan "*SELECT* 2"  (cost=0.00..6332.07 rows=23
width=4)(actual time=0.035..27.132 rows=14309 loops=1) 
                                                               ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4)
(actualtime=0.034..20.008 rows=14309 loops=1) 
                                                                     ->  Index Scan using words_word on words
(cost=0.00..2.22rows=1 width=4) (actual time=0.014..0.015 rows=1 loops=1) 
                                                                           Index Cond: ((word)::text = 'trading'::text)
                                                                     ->  Index Scan using article_words_wc on
article_words (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.018..12.502 rows=14309 loops=1) 
                                                                           Index Cond: (public.article_words.word_key =
public.words.word_key)
                                 ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4)
(actualtime=0.005..0.005 rows=1 loops=1473) 
                                       Index Cond: (public.contexts.context_key = "IN_subquery".context_key)
                           ->  Hash  (cost=59848.83..59848.83 rows=1124834 width=4) (actual time=221.465..221.465
rows=48207loops=1) 
                                 ->  Nested Loop  (cost=0.00..59848.83 rows=1124834 width=4) (actual
time=0.030..174.941rows=48207 loops=1) 
                                       ->  Seq Scan on bp_categories  (cost=0.00..1231.49 rows=16669 width=4) (actual
time=0.015..23.377rows=16669 loops=1) 
                                             Filter: (lower(category) = 'law'::text)
                                       ->  Index Scan using virtual_ancestor_key_idx on virtual_ancestors a
(cost=0.00..2.57rows=76 width=8) (actual time=0.004..0.007 rows=3 loops=16669) 
                                             Index Cond: (a.ancestor_key = bp_categories.context_key)
                     ->  Index Scan using matview_82034_ck on matview_82034  (cost=0.00..1.88 rows=1 width=4) (actual
time=0.005..0.006rows=1 loops=649) 
                           Index Cond: (matview_82034.context_key = public.contexts.context_key)
         ->  Index Scan using article_key_idx on articles  (cost=0.00..1.89 rows=1 width=4) (actual time=0.006..0.006
rows=1loops=648) 
               Index Cond: (articles.context_key = public.contexts.context_key)
               Filter: articles.indexed
   ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4) (actual time=0.005..0.006 rows=1
loops=546)
         Index Cond: (public.contexts.context_key = articles.context_key)
 Total runtime: 341.727 ms
(41 rows)




preproduction-20091214=# select count(*),indexed from articles group by indexed;
 count  | indexed
--------+---------
 499730 | f
 341930 | t

В списке pgsql-performance по дате отправления:

Предыдущее
От: Bryce Nesbitt
Дата:
Сообщение: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Следующее
От: Scott Marlowe
Дата:
Сообщение: Re: Dell PERC H700/H800