Single table forcing sequential scans on query plans

Поиск
Список
Период
Сортировка
От Cristian Gafton
Тема Single table forcing sequential scans on query plans
Дата
Msg-id Pine.LNX.4.64.0803152002420.22485@alienpad.rpath.com
обсуждение исходный текст
Ответы Re: Single table forcing sequential scans on query plans  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I have a weird query execution plan problem I am trying to debug on 
Postgresql 8.2.6. I have a query that joins against a temporary table that 
has very few rows. If the temporary table is analyzed before the query runs, 
all is well. If the temporary table is not analyzed before running the query, 
the execution plan chosen by Postgres is comprised entirely of sequential 
scans, and the whole query can take many hours to execute.

In the application exhibiting this issue there is an "analyze tmpInstanceId" 
coded right before executing this query. But sometimes PostgreSQL chooses to 
execute the query as if the analyze has not been run, or its results 
discarded, or I don't know why every once in a while this query goes off on 
the all sequential scan plan. (I have put some debugging code in the 
application, and when the query takes longer to execute than expected, I 
print the explain right after the query finished executing, and the plan it 
comes up with is the second one I'm including. When it's fast, it is the 
first one)

My questions are:
- what would make the analyze operation "fail" in the eyes of the planner?
- why joining to a single unanalyzed table disables any and all indexes from 
the other tables references in the query?

create temporary table tmpInstanceId(    idx         serial primary key,    instanceId  integer
);
create index tmpInstanceIdIdx on tmpInstanceId(instanceId);
insert into tmpInstanceId(instanceId) values (492121), (492125);


--analyze tmpInstanceId;

explain
--analyze         select tmpInstanceId.idx, Items.item, Versions.version,    Flavors.flavor, tt.flags, Nodes.timeStamps
       from tmpInstanceId         join TroveTroves as tt using(instanceId)         join Instances on tt.includedId =
Instances.instanceId        join Items on Instances.itemId = Items.itemId         join Versions on Instances.versionId
=Versions.versionId         join Flavors on Instances.flavorId = Flavors.flavorId         join Nodes on
Instances.itemId= Nodes.itemId and             Instances.versionId = Nodes.versionId ;
 

explain analyze when analyzing the temporary table first:
QUERYPLAN
 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Nested
Loop (cost=3258.50..28509.89 rows=217 width=333) (actual time=359.328..671.078 rows=10571 loops=1)  ->  Nested Loop
(cost=3258.50..27812.10rows=217 width=322) (actual time=331.048..557.858 rows=10571 loops=1)        ->  Hash Join
(cost=3258.50..27112.73rows=217 width=276) (actual time=330.947..503.239 rows=10571 loops=1)              Hash Cond:
(instances.flavorid= flavors.flavorid)              ->  Nested Loop  (cost=3222.89..27074.13 rows=217 width=45) (actual
time=329.451..489.734rows=10571 loops=1)                    ->  Hash Join  (cost=3222.89..14374.81 rows=3916 width=20)
(actualtime=329.295..412.439 rows=10571 loops=1)                          Hash Cond: (instances.instanceid =
tt.includedid)                         ->  Seq Scan on instances  (cost=0.00..9317.19 rows=478819 width=16) (actual
time=0.037..185.202rows=478819 loops=1)                          ->  Hash  (cost=3173.94..3173.94 rows=3916 width=12)
(actualtime=16.738..16.738 rows=10571 loops=1)                                ->  Nested Loop  (cost=0.00..3173.94
rows=3916width=12) (actual time=0.159..11.248 rows=10571 loops=1)                                      ->  Seq Scan on
tmpinstanceid (cost=0.00..1.02 rows=2 width=8) (actual time=0.005..0.007 rows=2 loops=1)
     ->  Index Scan using trovetrovesinstanceincluded_uq on trovetroves tt  (cost=0.00..1561.99 rows=1958 width=12)
(actualtime=0.102..3.400 rows=5286 loops=2)                                            Index Cond:
(tmpinstanceid.instanceid= tt.instanceid)                    ->  Index Scan using nodesversionid_fk on nodes
(cost=0.00..3.23rows=1 width=25) (actual time=0.005..0.006 rows=1 loops=10571)                          Index Cond:
((instances.versionid= nodes.versionid) AND (instances.itemid = nodes.itemid))              ->  Hash
(cost=28.05..28.05rows=605 width=239) (actual time=1.406..1.406 rows=605 loops=1)                    ->  Seq Scan on
flavors (cost=0.00..28.05 rows=605 width=239) (actual time=0.062..0.744 rows=605 loops=1)        ->  Index Scan using
versions_pkeyon versions  (cost=0.00..3.21 rows=1 width=58) (actual time=0.003..0.004 rows=1 loops=10571)
IndexCond: (instances.versionid = versions.versionid)  ->  Index Scan using items_pkey on items  (cost=0.00..3.20
rows=1width=23) (actual time=0.009..0.010 rows=1 loops=10571)        Index Cond: (instances.itemid = items.itemid)Total
runtime:673.460 ms
 


explain without analyzing the temporaray table first:
QUERYPLAN
 

-----------------------------------------------------------------------------------------------------------------------------------------Hash
Join (cost=2380399.89..2448745.27 rows=210188 width=333)  Hash Cond: (instances.itemid = items.itemid)  ->  Hash Join
(cost=2379801.92..2443155.34rows=210188 width=322)        Hash Cond: (instances.versionid = versions.versionid)
-> Hash Join  (cost=2378320.77..2437733.16 rows=210188 width=276)              Hash Cond: (instances.flavorid =
flavors.flavorid)             ->  Hash Join  (cost=2378285.15..2434807.46 rows=210188 width=45)                    Hash
Cond:(tmpinstanceid.instanceid = tt.instanceid)                    ->  Seq Scan on tmpinstanceid  (cost=0.00..29.40
rows=1940width=8)                    ->  Hash  (cost=2260473.14..2260473.14 rows=5291201 width=45)
   ->  Hash Join  (cost=95937.67..2260473.14 rows=5291201 width=45)                                Hash Cond:
(tt.includedid= instances.instanceid)                                ->  Seq Scan on trovetroves tt
(cost=0.00..1753045.32rows=95620632 width=12)                                ->  Hash  (cost=95606.47..95606.47
rows=26496width=41)                                      ->  Merge Join  (cost=89458.93..95606.47 rows=26496 width=41)
                                         Merge Cond: ((instances.versionid = nodes.versionid) AND (instances.itemid =
nodes.itemid))                                           ->  Sort  (cost=54491.66..55688.71 rows=478819 width=16)
                                          Sort Key: instances.versionid, instances.itemid
                  ->  Seq Scan on instances  (cost=0.00..9317.19 rows=478819 width=16)
         ->  Sort  (cost=34967.27..35731.14 rows=305546 width=25)                                                  Sort
Key:nodes.versionid, nodes.itemid                                                  ->  Seq Scan on nodes
(cost=0.00..7130.46rows=305546 width=25)              ->  Hash  (cost=28.05..28.05 rows=605 width=239)
 ->  Seq Scan on flavors  (cost=0.00..28.05 rows=605 width=239)        ->  Hash  (cost=946.07..946.07 rows=42807
width=58)             ->  Seq Scan on versions  (cost=0.00..946.07 rows=42807 width=58)  ->  Hash  (cost=354.65..354.65
rows=19465width=23)        ->  Seq Scan on items  (cost=0.00..354.65 rows=19465 width=23)
 
(28 rows)

Thanks,

Cristian
-- 
Cristian Gafton
rPath, Inc.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Small bug in new backend build method
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Commit fest?