Обсуждение: or kills performance
Does it make sense for a simple or in a where clause to kill performance? The statement with the OR takes 2500 ms and the statement without the OR takes 190 ms: select c.partid,c.pnid,c.leadstateid,e.stock from assembliesbatch a join partsassembly b on a.assemblyid=b.assemblyid join manufacturerpartpn c on c.partid=b.partid join assemblies d on d.assemblyid=a.assemblyid join stock e on e.pnid=c.pnid where a.batchid=382 and e.stock>0 and e.ownerid=1 and --THIS LINE CAUSING PROBLEM ((d.leadfree and leadstateid in (1,3)) or (not d.leadfree and leadstateid in (2,3,4))) order by partid,leadstateid In this particular query, the first condition does not return any rows. When I run it as and not d.leadfree and leadstateid in (2,3,4) then it takes only 190 ms and when I use only the first clause it takes less then that and returns no rows. Following are the 2 explain analyze results. (With Or first) "Sort (cost=253.83..253.83 rows=1 width=16) (actual time=8368.145..8368.352 rows=62 loops=1)" " Sort Key: c.partid, c.leadstateid" " -> Nested Loop (cost=9.61..253.82 rows=1 width=16) (actual time=14.358..8367.822 rows=62 loops=1)" " -> Nested Loop (cost=9.61..243.77 rows=1 width=24) (actual time=14.295..8353.241 rows=699 loops=1)" " -> Nested Loop (cost=9.61..94.60 rows=16 width=20) (actual time=0.100..3605.150 rows=291711 loops=1)" " Join Filter: (((NOT "inner".leadfree) AND (("outer".leadstateid = 2) OR ("outer".leadstateid = 3) OR ("outer".leadstateid = 4))) OR ("inner".leadfree AND (("outer".leadstateid = 1) OR ("outer".leadstateid = 3))))" " -> Nested Loop (cost=0.00..44.14 rows=3 width=16) (actual time=0.070..27.924 rows=793 loops=1)" " -> Index Scan using ownerids on stock e (cost=0.00..26.13 rows=3 width=8) (actual time=0.039..4.433 rows=793 loops=1)" " Index Cond: (ownerid = 1)" " Filter: (stock > 0)" " -> Index Scan using manufacturerpartpn_pkey on manufacturerpartpn c (cost=0.00..5.99 rows=1 width=12) (actual time=0.010..0.015 rows=1 loops=793)" " Index Cond: ("outer".pnid = c.pnid)" " -> Materialize (cost=9.61..13.80 rows=419 width=5) (actual time=0.003..1.444 rows=419 loops=793)" " -> Seq Scan on assemblies d (cost=0.00..9.19 rows=419 width=5) (actual time=0.008..1.931 rows=419 loops=1)" " -> Index Scan using ix_080c8ff0_5017_42a2_a174_28095b85106e_ on assembliesbatch a (cost=0.00..9.31 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=291711)" " Index Cond: ("outer".assemblyid = a.assemblyid)" " Filter: (batchid = 382)" " -> Index Scan using idx_u_assidpartid on partsassembly b (cost=0.00..10.02 rows=2 width=8) (actual time=0.012..0.012 rows=0 loops=699)" " Index Cond: (("outer".partid = b.partid) AND (b.assemblyid = "outer".assemblyid))" "Total runtime: 8368.708 ms" (Without OR) "Sort (cost=1251.95..1251.95 rows=1 width=16) (actual time=634.110..634.333 rows=62 loops=1)" " Sort Key: c.partid, c.leadstateid" " -> Nested Loop (cost=9.71..1251.94 rows=1 width=16) (actual time=3.455..633.817 rows=62 loops=1)" " -> Hash Join (cost=9.71..1168.03 rows=9 width=24) (actual time=3.428..370.329 rows=16405 loops=1)" " Hash Cond: ("outer".assemblyid = "inner".assemblyid)" " -> Nested Loop (cost=0.00..1154.62 rows=368 width=20) (actual time=0.080..235.833 rows=16472 loops=1)" " -> Nested Loop (cost=0.00..44.16 rows=1 width=16) (actual time=0.053..25.756 rows=699 loops=1)" " -> Index Scan using ownerids on stock e (cost=0.00..26.13 rows=3 width=8) (actual time=0.023..4.123 rows=793 loops=1)" " Index Cond: (ownerid = 1)" " Filter: (stock > 0)" " -> Index Scan using manufacturerpartpn_pkey on manufacturerpartpn c (cost=0.00..6.00 rows=1 width=12) (actual time=0.010..0.013 rows=1 loops=793)" " Index Cond: ("outer".pnid = c.pnid)" " Filter: ((leadstateid = 2) OR (leadstateid = 3) OR (leadstateid = 4))" " -> Index Scan using partidpa on partsassembly b (cost=0.00..1105.87 rows=368 width=8) (actual time=0.011..0.115 rows=24 loops=699)" " Index Cond: ("outer".partid = b.partid)" " -> Hash (cost=9.19..9.19 rows=210 width=4) (actual time=3.324..3.324 rows=0 loops=1)" " -> Seq Scan on assemblies d (cost=0.00..9.19 rows=210 width=4) (actual time=0.014..1.735 rows=417 loops=1)" " Filter: (NOT leadfree)" " -> Index Scan using ix_080c8ff0_5017_42a2_a174_28095b85106e_ on assembliesbatch a (cost=0.00..9.31 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=16405)" " Index Cond: ("outer".assemblyid = a.assemblyid)" " Filter: (batchid = 382)" "Total runtime: 634.672 ms" Thank You Sim
"Sim Zacks" <sim@compulab.co.il> writes: > Does it make sense for a simple or in a where clause to kill performance? Did you get the parenthesization correct? Remember that AND binds more tightly than OR. regards, tom lane
My goal is a specific batchid, stock>0, ownerid=1 and either leadfree with a state of 1 or 3 OR not leadfree with a stateof 2,3 or 4 The parenthesis are correct: where a.batchid=382 and e.stock>0 and e.ownerid=1 and ( (d.leadfree and leadstateid in (1,3) ) or (not d.leadfree and leadstateid in (2,3,4) ) ) order by partid,leadstateid I checked my indices and every field in both the where clause and the join is already indexed. I tried unioning the 2 queries and it was much faster then with the OR statement. (Took 200 ms as opposed to 2000 ms). Theunion will work, but it seems like overkill for a simple or clause. Is this the recommended way to do it? select c.partid,c.pnid,c.leadstateid,e.stock from assemblies d join assembliesbatch a on d.assemblyid=a.assemblyid join partsassembly b on b.assemblyid=d.assemblyid join manufacturerpartpn c on c.partid=b.partid join stock e on e.pnid=c.pnid where a.batchid=382 and e.stock>0 and e.ownerid=1 and not d.leadfree and leadstateid in (2,3,4) union select c.partid,c.pnid,c.leadstateid,e.stock from assemblies d join assembliesbatch a on d.assemblyid=a.assemblyid join partsassembly b on b.assemblyid=d.assemblyid join manufacturerpartpn c on c.partid=b.partid join stock e on e.pnid=c.pnid where a.batchid=382 and e.stock>0 and e.ownerid=1 and d.leadfree and leadstateid in (1,3) order by partid,leadstateid ________________________________________________________________________________ "Sim Zacks" <sim@compulab.co.il> writes: > Does it make sense for a simple or in a where clause to kill performance? Did you get the parenthesization correct? Remember that AND binds more tightly than OR. regards, tom lane
Sim Zacks <sim@compulab.co.il> writes: > The parenthesis are correct: > where a.batchid=382 and e.stock>0 and e.ownerid=1 and > ( > (d.leadfree and leadstateid in (1,3) ) > or > (not d.leadfree and leadstateid in (2,3,4) ) > ) [ goes back and looks more closely ] The row count estimates in your EXPLAINs are so far off that it's a wonder you got an OK plan for either query. Have you ANALYZEd these tables recently? The direct source of the problem seems to be that leadfree and leadstateid come from different tables, so you're taking what had been independent filter conditions for the two tables and converting them into a join condition that can't be applied until after the join. However it doesn't look to me like that is really a huge deal, because apparently these conditions are pretty unselective and you'd be reading most of each table anyway. What is really causing the problem is the choice to join partsassembly last in the slow query; in the faster query, that's joined before joining to assemblies and assembliesbatch, and apparently that reduces the number of joins to assembliesbatch very considerably. With the rowcount estimates so far off, though, it's really just luck that the planner makes a good join order choice in either case. And it doesn't look like the conditions are too hard for the planner to figure out ... I think it must be working with obsolete statistics. regards, tom lane
I am using autovacuum, and I assumed it was working correctly. Is there a way to tell if it is not doing what it is supposedto? This is the result of ps for the autovacuum PID. The docs say that it should automatically analyze. PID TTY STAT TIME COMMAND 10184 ? Ss 0:32 /usr/bin/pg_autovacuum -D -v 1000 -V 2 -s 300 -S 2 -L /var/lib/postgresql/data/pg_autovacuum.log My log files show nothing other then an error because I renamed a (different) database and it received a fatal error fora month, but that stopped last week (before I sent this message) when I rebooted the server There has been nothing inthe log file since that. This is running on gentoo and the pg_autovacuum is the default settings that starts automatically when the server boots. Is there a way to check when the last analyze was done? I don't want to run analyze now until I can check that to see wherethe problem is coming from. Thank you Sim ________________________________________________________________________________ Sim Zacks <sim@compulab.co.il> writes: > [ goes back and looks more closely ] The row count estimates in your > EXPLAINs are so far off that it's a wonder you got an OK plan for either > query. Have you ANALYZEd these tables recently? > > > > regards, tom lane