extending ORDER BY make query much slower

Поиск
Список
Период
Сортировка
От Dan Langille
Тема extending ORDER BY make query much slower
Дата
Msg-id 3E6F45D8.15284.5994716@localhost
обсуждение исходный текст
Ответы Re: extending ORDER BY make query much slower  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Re: extending ORDER BY make query much slower  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-sql
NOTE: I've also put the analyse output at 
http://www.freshports.org/tmp/faster.txt in case it's easier to read 
than the format below.

I've found that adding another field to the ORDER BY clause, times go 
through the roof.

This query is pretty fast.

explain analyse SELECT element.id,        commit_log.id,        commit_log.message_id,        commit_log.committer,
  commit_log.commit_date,        commit_log.description,        commit_log.encoding_losses,
commit_log.message_subject,       element.name,        commit_log_elements.revision_name,
element_pathname(element.id)AS element_pathname   FROM commit_log, commit_log_elements, element  WHERE
commit_log_elements.commit_log_id= commit_log.id    AND commit_log_elements.element_id    = element.id
 
ORDER BY commit_log.commit_date DESC
LIMIT 100;

Limit  (cost=0.00..468.79 rows=100 width=261) (actual 
time=1.80..78.10 rows=100 loops=1)  ->  Nested Loop  (cost=0.00..1248045.68 rows=266228 width=261) 
(actual time=1.78..77.19 rows=101 loops=1)        ->  Nested Loop  (cost=0.00..307745.31 rows=266228 
width=244) (actual time=0.36..5.41 rows=101 loops=1)              ->  Index Scan Backward using commit_log_commit_date

on commit_log  (cost=0.00..34305.03 rows=70543 width=227) (actual 
time=0.30..1.17 rows=32 loops=1)              ->  Index Scan using commit_log_elements_clid on 
commit_log_elements  (cost=0.00..3.46 rows=33 width=17) (actual 
time=0.03..0.07 rows=3 loops=32)                    Index Cond: (commit_log_elements.commit_log_id = 
"outer".id)        ->  Index Scan using element_pkey on element  
(cost=0.00..3.52 rows=1 width=17) (actual time=0.08..0.09 rows=1 
loops=101)              Index Cond: ("outer".element_id = element.id)Total runtime: 79.52 msec

If I add another item to the ORDER BY, the times sky rocket.  I 
haven't let the query run to completion, but here's the analyse 
output:

explain analyse SELECT element.id,        commit_log.id AS my_commit_log_id,        commit_log.message_id,
to_char(commit_log.commit_date- SystemTimeAdjust(), 'DD Mon 
 
YYYY')  AS commit_date,        to_char(commit_log.commit_date - SystemTimeAdjust(), 
'HH24:MI')      AS commit_time,        commit_log.committer,        commit_log.description,
commit_log.encoding_losses,       commit_log.message_subject,        element.name,
commit_log_elements.revision_name,       element_pathname(element.id) AS element_pathname   FROM commit_log,
commit_log_elements,element  WHERE commit_log_elements.commit_log_id = commit_log.id    AND
commit_log_elements.element_id   = element.id
 
ORDER BY commit_log.commit_date DESC,         my_commit_log_id
LIMIT 100;


explain  SELECT element.id,        commit_log.id AS my_commit_log_id,        commit_log.message_id,
commit_log.committer,       commit_log.description,        commit_log.encoding_losses,
commit_log.message_subject,       element.name,        commit_log_elements.revision_name,
element_pathname(element.id)AS element_pathname   FROM commit_log, commit_log_elements, element  WHERE
commit_log_elements.commit_log_id= commit_log.id    AND commit_log_elements.element_id    = element.id
 
ORDER BY commit_log.commit_date DESC,         my_commit_log_id
LIMIT 100;

                                            QUERY PLANLimit  (cost=196855.28..196855.53 rows=100 width=261)  ->  Sort
(cost=196855.28..197520.85rows=266228 width=261)        Sort Key: commit_log.commit_date, commit_log.id        ->  Hash
Join (cost=9478.25..38465.07 rows=266228 
 
width=261)              Hash Cond: ("outer".commit_log_id = "inner".id)              ->  Hash Join
(cost=3503.91..19098.46rows=266228 
 
width=34)                    Hash Cond: ("outer".element_id = "inner".id)                    ->  Seq Scan on
commit_log_elements 
 
(cost=0.00..4703.28 rows=266228 width=17)                    ->  Hash  (cost=2463.73..2463.73 rows=132073 
width=17)                          ->  Seq Scan on element  
(cost=0.00..2463.73 rows=132073 width=17)              ->  Hash  (cost=3133.43..3133.43 rows=70543 width=227)
        ->  Seq Scan on commit_log  (cost=0.00..3133.43 
 
rows=70543 width=227)
(12 rows)

If I restructure the query, I can get times pretty close to the 
original query:

explain analyse SELECT * FROM ( SELECT element.id,        commit_log.id AS my_commit_log_id,
commit_log.message_id,       commit_log.commit_date,        commit_log.committer,        commit_log.description,
commit_log.encoding_losses,       commit_log.message_subject,        element.name,
commit_log_elements.revision_name,       element_pathname(element.id) AS element_pathname   FROM commit_log,
commit_log_elements,element  WHERE commit_log_elements.commit_log_id = commit_log.id    AND
commit_log_elements.element_id   = element.id
 
ORDER BY commit_log.commit_date DESC 
LIMIT 100) AS TEMP
order by commit_date, my_commit_log_id;

Sort  (cost=472.11..472.36 rows=100 width=261) (actual 
time=80.01..80.40 rows=100 loops=1)  Sort Key: commit_date, my_commit_log_id  ->  Subquery Scan "temp"
(cost=0.00..468.79rows=100 width=261) 
 
(actual time=1.66..78.64 rows=100 loops=1)        ->  Limit  (cost=0.00..468.79 rows=100 width=261) (actual 
time=1.63..76.75 rows=100 loops=1)              ->  Nested Loop  (cost=0.00..1248045.68 rows=266228 
width=261) (actual time=1.62..75.85 rows=101 loops=1)                    ->  Nested Loop  (cost=0.00..307745.31 
rows=266228 width=244) (actual time=0.34..5.38 rows=101 loops=1)                          ->  Index Scan Backward using

commit_log_commit_date on commit_log  (cost=0.00..34305.03 rows=70543 
width=227) (actual time=0.27..1.17 rows=32 loops=1)                          ->  Index Scan using 
commit_log_elements_clid on commit_log_elements  (cost=0.00..3.46 
rows=33 width=17) (actual time=0.03..0.07 rows=3 loops=32)                                Index Cond: 
(commit_log_elements.commit_log_id = "outer".id)                    ->  Index Scan using element_pkey on element  
(cost=0.00..3.52 rows=1 width=17) (actual time=0.08..0.09 rows=1 
loops=101)                          Index Cond: ("outer".element_id = 
element.id)Total runtime: 81.04 msec
(12 rows)

Any comments?
-- 
Dan Langille : http://www.langille.org/



В списке pgsql-sql по дате отправления:

Предыдущее
От: "Chad Thompson"
Дата:
Сообщение: Re: filtering out doubles
Следующее
От: mila
Дата:
Сообщение: Re: LEFT JOIN and missing values