Обсуждение: Fwd: Help required on query performance
Hello, I have this query in my system which takes around 2.5 seconds to run. I have diagnosed that the problem is actually a hashjoin on perm and s_ast_role tables. Is there a way I can avoid that join? I just want to change the query and no environment change. SELECT ai.aid, SUM(ai.score) as search_score FROM sq_sch_idx ai, ( SELECT a.aid FROM t a INNER JOIN slink l ON l.minorid = a.aid INNER JOIN slink_tree t ON t.linkid = l.linkid WHERE (t.treeid LIKE '0005%') AND a.status >= '16' AND a.type_code IN ( SELECT type_code FROM t_typ_inhd WHERE inhd_type_code IN ('page') OR type_code IN ('file', 'page_rss_feed') ) AND a.aid IN ( SELECT p.aid FROM perm p LEFT JOIN s_ast_role r ON (p.userid = r.roleid) WHERE ( p.userid IN ('7') OR r.userid IN ('7') ) AND ( ( p.permission = '1' AND ( ( p.userid <> '7' OR (r.userid IS NULL OR r.userid <> '7') ) OR (p.userid = '7' AND granted = '1') OR (r.userid = '7' AND granted = '1') ) ) OR (p.permission > '1' AND p.granted = '1') ) AND a.aid=p.aid GROUP BY p.aid HAVING MIN(p.granted) <> '0' ) ) asset_check WHERE (ai.aid=asset_check.aid) AND (ai.value LIKE '%download%') GROUP BY ai.aid ; Thanks
On Sun, Jan 31, 2010 at 5:50 PM, Dave Clements <dclements89@gmail.com> wrote: > Hello, I have this query in my system which takes around 2.5 seconds > to run. I have diagnosed that the problem is actually a hashjoin on > perm and s_ast_role tables. Is there a way I can avoid that join? I > just want to change the > query and no environment change. What does explain analyze select ... (rest of your query) say?
Hi, following the output from explain analyze. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------HashAggregate (cost=38145.19..38145.20 rows=1 width=149) (actual time=2635.965..2636.086 rows=243 loops=1) -> Nested Loop (cost=15.00..38145.18 rows=1 width=149) (actual time=4.417..2635.086 rows=598 loops=1) -> Nested Loop (cost=4.13..37993.95 rows=8 width=153) (actual time=0.781..310.579 rows=975 loops=1) -> Nested Loop IN Join (cost=0.00..37260.18 rows=1 width=168) (actual time=0.747..298.686 rows=532 loops=1) -> Nested Loop (cost=0.00..37216.90 rows=2 width=236) (actual time=0.731..292.449 rows=563 loops=1) -> Seq Scan on sq_sch_idx ai (cost=0.00..20921.47 rows=10 width=149) (actual time=0.616..260.601 rows=677 loops=1) Filter: ((value)::text ~~ '%download%'::text) -> Index Scan using sq_ast_pkey on sq_ast a (cost=0.00..1629.53 rows=1 width=87) (actual time=0.045..0.046 rows=1 loops=677) Index Cond: (("outer".assetid)::text = (a.assetid)::text) Filter: ((status >= 16::smallint) AND (subplan)) SubPlan -> HashAggregate (cost=1623.50..1623.52 rows=1 width=150) (actual time=0.031..0.031 rows=1 loops=586) Filter: (min("granted") <> '0'::bpchar) -> Hash Left Join (cost=21.32..1619.40 rows=820 width=150) (actual time=0.023..0.028 rows=1 loops=586) Hash Cond: (("outer".userid)::text = ("inner".roleid)::text) Filter: (((("outer".userid)::text = '7'::text) OR (("inner".userid)::text = '7'::text)) AND ((("outer".permission = 1::smallint) AND ((("outer".userid)::text <> '7'::text) OR ("inner".userid IS NULL) OR (("inner".userid)::text <> '7'::text) OR ((("outer".userid)::text = '7'::text) AND ("outer"."granted" = '1'::bpchar)) OR ((("inner".userid)::text = '7'::text) AND ("outer"."granted" = '1'::bpchar)))) OR (("outer".permission > 1::smallint) AND ("outer"."granted" = '1'::bpchar)))) -> Bitmap Heap Scan on sq_ast_perm p (cost=7.87..1521.54 rows=820 width=297) (actual time=0.019..0.023 rows=2 loops=586) Recheck Cond: (($0)::text = (assetid)::text) -> Bitmap Index Scan on sq_ast_perm_assetid (cost=0.00..7.87 rows=820 width=0) (actual time=0.014..0.014 rows=2 loops=586) Index Cond: (($0)::text = (assetid)::text) -> Hash (cost=12.88..12.88 rows=229 width=164) (actual time=0.001..0.001 rows=0 loops=1) -> Seq Scan on sq_ast_role (cost=0.00..12.88 rows=229 width=164) (actual time=0.001..0.001 rows=0 loops=1) Filter: ((userid)::text <> '0'::text) -> Index Scan using sq_ast_typ_inhd_type_code on sq_ast_typ_inhd (cost=0.00..21.62 rows=1 width=68) (actual time=0.010..0.010 rows=1 loops=563) Index Cond: (("outer".type_code)::text = (sq_ast_typ_inhd.type_code)::text) Filter: (((inhd_type_code)::text = 'page'::text) OR ((type_code)::text = 'file'::text) OR ((type_code)::text = 'page_rss_feed'::text)) -> Bitmap Heap Scan on sq_ast_lnk l (cost=4.13..729.73 rows=324 width=23) (actual time=0.016..0.019 rows=2 loops=532) Recheck Cond: ((l.minorid)::text = ("outer".assetid)::text) -> Bitmap IndexScan on sq_ast_lnk_minorid (cost=0.00..4.13 rows=324 width=0) (actual time=0.012..0.012 rows=2 loops=532) Index Cond: ((l.minorid)::text = ("outer".assetid)::text) -> Bitmap Heap Scan on sq_ast_lnk_tree t (cost=10.87..18.88 rows=2 width=4) (actual time=2.382..2.382 rows=1 loops=975) Recheck Cond: (t.linkid = "outer".linkid) Filter: (treeid ~~ '0005%'::bytea) -> BitmapAnd (cost=10.87..10.87 rows=2 width=0) (actual time=2.379..2.379 rows=0 loops=975) -> Bitmap Index Scan on sq_ast_lnk_tree_linkid (cost=0.00..4.33 rows=381 width=0) (actual time=0.005..0.005 rows=1 loops=975) Index Cond: (t.linkid = "outer".linkid) -> Bitmap Index Scan on sq_ast_lnk_tree_pkey (cost=0.00..6.28 rows=381 width=0) (actual time=3.521..3.521 rows=16476 loops=657) Index Cond: ((treeid >= '0005'::bytea) AND (treeid < '0006'::bytea))Total runtime: 2636.294 ms thanks On Mon, Feb 1, 2010 at 11:54 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > On Sun, Jan 31, 2010 at 5:50 PM, Dave Clements <dclements89@gmail.com> wrote: >> Hello, I have this query in my system which takes around 2.5 seconds >> to run. I have diagnosed that the problem is actually a hashjoin on >> perm and s_ast_role tables. Is there a way I can avoid that join? I >> just want to change the >> query and no environment change. > > What does > > explain analyze select ... (rest of your query) > > say? >
On Sun, Jan 31, 2010 at 6:02 PM, Dave Clements <dclements89@gmail.com> wrote: > Hi, following the output from explain analyze. Without doing any heavy analysis, it looks like your row estimates are way off. Have you cranked up stats target and re-analyzed yet?
I did the re-analyze serveral times, using the command: ANALYZE tablename; Is there any other command as well or another way to do that? On Mon, Feb 1, 2010 at 12:04 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > On Sun, Jan 31, 2010 at 6:02 PM, Dave Clements <dclements89@gmail.com> wrote: >> Hi, following the output from explain analyze. > > Without doing any heavy analysis, it looks like your row estimates are > way off. Have you cranked up stats target and re-analyzed yet? >
On Sun, Jan 31, 2010 at 6:09 PM, Dave Clements <dclements89@gmail.com> wrote: > I did the re-analyze serveral times, using the command: > > ANALYZE tablename; > > Is there any other command as well or another way to do that? It's important that the stats target get increased as well, it looks like you're not getting enough buckets to get a good estimate of rows to be returned for various conditions. # show default_statistics_target ;default_statistics_target ---------------------------10 # alter database smarlowe set default_statistics_target=200; ALTER DATABASE # analyze; Then run the explain analyze again and see if your row estimates are closer and if the plan changes.
After doing an analyze on the database, it improved a lot :) On Mon, Feb 1, 2010 at 12:13 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > On Sun, Jan 31, 2010 at 6:09 PM, Dave Clements <dclements89@gmail.com> wrote: >> I did the re-analyze serveral times, using the command: >> >> ANALYZE tablename; >> >> Is there any other command as well or another way to do that? > > It's important that the stats target get increased as well, it looks > like you're not getting enough buckets to get a good estimate of rows > to be returned for various conditions. > > # show default_statistics_target ; > default_statistics_target > --------------------------- > 10 > > # alter database smarlowe set default_statistics_target=200; > ALTER DATABASE > # analyze; > > Then run the explain analyze again and see if your row estimates are > closer and if the plan changes. >
Dave Clements <dclements89@gmail.com> writes: > Hello, I have this query in my system which takes around 2.5 seconds > to run. I have diagnosed that the problem is actually a hashjoin on > perm and s_ast_role tables. Is there a way I can avoid that join? BTW, just for the record, that diagnosis was completely off. The upper level of your explain results is HashAggregate (cost=38145.19..38145.20 rows=1 width=149) (actual time=2635.965..2636.086 rows=243 loops=1) -> Nested Loop (cost=15.00..38145.18 rows=1 width=149) (actual time=4.417..2635.086 rows=598 loops=1) -> Nested Loop (cost=4.13..37993.95rows=8 width=153) (actual time=0.781..310.579 rows=975 loops=1) ... -> Bitmap HeapScan on sq_ast_lnk_tree t (cost=10.87..18.88 rows=2 width=4) (actual time=2.382..2.382 rows=1 loops=975) ... from which we can see that the main problem is doing the sq_ast_lnk_tree scan over again 975 times, once per row coming out of the other side of the join. That accounted for 975*2.382 = 2322.450 msec, or the vast majority of the runtime. The planner wouldn't have picked this plan except that it thought that only 8 rows would come out of the other side of the join; repeating the scan 8 times seemed better than the alternatives. After you improved the statistics, it most likely switched *to* a hash join (or possibly a merge join) for this step, rather than switching away from one. regards, tom lane
After the analyze I am getting the time 3.20 ms but there is not HashJoin there. Still all of them are NestLoops. But that is fine. Now the only problem is the sequence scan on sq_sch_idx table. I have a query like this: explain analyze select count(*) from sq_sch_idx where value = '%download%'; This query does a sequence scan on the table. Is there a way I can create an index for this? thanks On Mon, Feb 1, 2010 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dave Clements <dclements89@gmail.com> writes: >> Hello, I have this query in my system which takes around 2.5 seconds >> to run. I have diagnosed that the problem is actually a hashjoin on >> perm and s_ast_role tables. Is there a way I can avoid that join? > > BTW, just for the record, that diagnosis was completely off. The > upper level of your explain results is > > HashAggregate (cost=38145.19..38145.20 rows=1 width=149) (actual time=2635.965..2636.086 rows=243 loops=1) > -> Nested Loop (cost=15.00..38145.18 rows=1 width=149) (actual time=4.417..2635.086 rows=598 loops=1) > -> Nested Loop (cost=4.13..37993.95 rows=8 width=153) (actual time=0.781..310.579 rows=975 loops=1) > ... > -> Bitmap Heap Scan on sq_ast_lnk_tree t (cost=10.87..18.88 rows=2 width=4) (actual time=2.382..2.382 rows=1loops=975) > ... > > from which we can see that the main problem is doing the sq_ast_lnk_tree > scan over again 975 times, once per row coming out of the other side of > the join. That accounted for 975*2.382 = 2322.450 msec, or the vast > majority of the runtime. The planner wouldn't have picked this plan > except that it thought that only 8 rows would come out of the other side > of the join; repeating the scan 8 times seemed better than the > alternatives. After you improved the statistics, it most likely > switched *to* a hash join (or possibly a merge join) for this step, > rather than switching away from one. > > regards, tom lane >
On Sun, Jan 31, 2010 at 9:25 PM, Dave Clements <dclements89@gmail.com> wrote: > After the analyze I am getting the time 3.20 ms but there is not > HashJoin there. Still all of them are NestLoops. But that is fine. > > Now the only problem is the sequence scan on sq_sch_idx table. > I have a query like this: > > explain analyze select count(*) from sq_sch_idx where value = '%download%'; > > This query does a sequence scan on the table. Is there a way I can > create an index for this? If it's not left anchored ( value like 'download%') then not with regular old methods. If you need to do text searching you might need to look into the full text search indexing in pgsql.