Help interpreting the output of EXPLAIN

Поиск
Список
Период
Сортировка
От Mayers, Philip J
Тема Help interpreting the output of EXPLAIN
Дата
Msg-id A0F836836670D41183A800508BAF190B35E231@icex1.cc.ic.ac.uk
обсуждение исходный текст
Ответы Re: Help interpreting the output of EXPLAIN  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
All,

I have an 4-5 table SQL database where I do queries of various expense. The
most expensive query involves a UNION of two inner joins. I'm having trouble
interpreting the output of the EXPLAIN in order to optimise it. Some
problems I'm having:

1) The costs seem to change radically without me taking any action - do I
have to run them on an unloaded machine?
2) VACUUM seemed to massively increase the cost
3) VACUUM ANALYZE decreased it - so, should I always do "VACUUM; VACUUM
ANALYZE;"
4) I'm using LIMIT after the outer join - should I LIMIT the two inner joins
to the same amount, since or will Postgres do this itself?

After a VACUUM ANALYZE and using the default indexing scheme (probably
what's slowing me down) EXPLAIN gives this:

 1 Unique  (cost=27403.90..27559.07 rows=1034 width=84)
 2   ->  Sort  (cost=27403.90..27403.90 rows=10345 width=84)
 3         ->  Append  (cost=716.18..26447.76 rows=10345 width=84)
 4                 ->  Hash Join  (cost=716.18..20480.67 rows=9209 width=72)
 5                       ->  Nested Loop  (cost=0.00..18931.96 rows=9209
width=60)
 6                             ->  Seq Scan on host  (cost=0.00..211.09
rows=9209 width=42)
 7                             ->  Index Scan using interface_pkey on
interface  (cost=0.00..2.02 rows=1 width=18)
 8                       ->  Hash  (cost=604.94..604.94 rows=17294 width=12)
 9                             ->  Seq Scan on machine  (cost=0.00..604.94
rows=17294 width=12)
10                 ->  Hash Join  (cost=4656.18..5967.08 rows=1136 width=84)
11                       ->  Seq Scan on machine  (cost=0.00..604.94
rows=17294 width=12)
12                       ->  Hash  (cost=4653.34..4653.34 rows=1136
width=72)
13                             ->  Nested Loop  (cost=0.00..4653.34
rows=1136 width=72)
14                                   ->  Nested Loop  (cost=0.00..2343.98
rows=1136 width=54)
15                                         ->  Seq Scan on alias
(cost=0.00..26.36 rows=1136 width=36)
16                                         ->  Index Scan using host_pkey on
host  (cost=0.00..2.03 rows=1 width=18)
17                                   ->  Index Scan using interface_pkey on
interface  (cost=0.00..2.02 rows=1 width=18)

Am I right in thinking that the loop on line 5 is costing most, and what's
costing most under that is the sequential scan on host? That corresponds to
a join of the form "host.mac = interface.mac", so I should hash index the
host.mac column? Unfortunately, mac is of type "macaddr" and this happens:

hdb=> create index host_mac on host using hash (mac);
ERROR:  fmgr_info: function 0: cache lookup failed
hdb=>

I'm more used to optimising C than SQL - how can you tell from an explain
which type of index will decrease the cost (or how to restructure your query
to lower the cost). Will postgres order WHERE clauses such that the least
expensive happens first?

Regards,
Phil

+----------------------------------+
| Phil Mayers, Network Support     |
| Centre for Computing Services    |
| Imperial College                 |
+----------------------------------+

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Problems with vacuum
Следующее
От: "Bryan \(Mailing Lists\)"
Дата:
Сообщение: How do I remove a foreign key constraint?