Discussion: Fwd: Help required on query performance

Fwd: Help required on query performance

From: Dave Clements
Date:
Hello, I have this query in my system which takes around 2.5 seconds
to run. I have diagnosed that the problem is actually a hashjoin on
perm and s_ast_role tables. Is there a way I can avoid that join? I
just want to change the query, not the environment.

SELECT
 ai.aid,
 SUM(ai.score) as search_score
FROM
 sq_sch_idx ai,
 (
   SELECT
     a.aid
   FROM
     t a
     INNER JOIN slink l ON l.minorid = a.aid
     INNER JOIN slink_tree t ON t.linkid = l.linkid
   WHERE
     (t.treeid LIKE '0005%')
     AND a.status  >= '16'
     AND a.type_code IN
     (
       SELECT type_code FROM t_typ_inhd WHERE inhd_type_code IN
('page') OR type_code IN ('file', 'page_rss_feed')
     )
     AND a.aid IN
     (
       SELECT
         p.aid
       FROM
         perm p LEFT JOIN s_ast_role r ON (p.userid = r.roleid)
       WHERE
         (
           p.userid IN ('7') OR r.userid IN ('7')
         ) AND
         (
           (
             p.permission = '1'
             AND
             (
               (
                 p.userid <> '7' OR
                 (r.userid IS NULL OR r.userid <> '7')
               )
               OR
               (p.userid = '7' AND granted = '1')
               OR
               (r.userid = '7' AND granted = '1')
             )
           )
           OR
           (p.permission > '1' AND p.granted = '1')
         )
         AND a.aid=p.aid
       GROUP BY
         p.aid
       HAVING
         MIN(p.granted) <> '0'
     )
 ) asset_check
WHERE
 (ai.aid=asset_check.aid)
 AND (ai.value LIKE '%download%')
GROUP BY
 ai.aid
;


Thanks


Re: Fwd: Help required on query performance

From: Scott Marlowe
Date:
On Sun, Jan 31, 2010 at 5:50 PM, Dave Clements <dclements89@gmail.com> wrote:
> Hello, I have this query in my system which takes around 2.5 seconds
> to run. I have diagnosed that the problem is actually a hashjoin on
> perm and s_ast_role tables. Is there a way I can avoid that join? I
> just want to change the query, not the environment.

What does

explain analyze select ... (rest of your query)

say?


Re: Fwd: Help required on query performance

From: Dave Clements
Date:
Hi, following is the output from explain analyze.




HashAggregate  (cost=38145.19..38145.20 rows=1 width=149) (actual time=2635.965..2636.086 rows=243 loops=1)
  ->  Nested Loop  (cost=15.00..38145.18 rows=1 width=149) (actual time=4.417..2635.086 rows=598 loops=1)
        ->  Nested Loop  (cost=4.13..37993.95 rows=8 width=153) (actual time=0.781..310.579 rows=975 loops=1)
              ->  Nested Loop IN Join  (cost=0.00..37260.18 rows=1 width=168) (actual time=0.747..298.686 rows=532 loops=1)
                    ->  Nested Loop  (cost=0.00..37216.90 rows=2 width=236) (actual time=0.731..292.449 rows=563 loops=1)
                          ->  Seq Scan on sq_sch_idx ai  (cost=0.00..20921.47 rows=10 width=149) (actual time=0.616..260.601 rows=677 loops=1)
                                Filter: ((value)::text ~~ '%download%'::text)
                          ->  Index Scan using sq_ast_pkey on sq_ast a  (cost=0.00..1629.53 rows=1 width=87) (actual time=0.045..0.046 rows=1 loops=677)
                                Index Cond: (("outer".assetid)::text = (a.assetid)::text)
                                Filter: ((status >= 16::smallint) AND (subplan))
                                SubPlan
                                  ->  HashAggregate  (cost=1623.50..1623.52 rows=1 width=150) (actual time=0.031..0.031 rows=1 loops=586)
                                        Filter: (min("granted") <> '0'::bpchar)
                                        ->  Hash Left Join  (cost=21.32..1619.40 rows=820 width=150) (actual time=0.023..0.028 rows=1 loops=586)
                                              Hash Cond: (("outer".userid)::text = ("inner".roleid)::text)
                                              Filter: (((("outer".userid)::text = '7'::text) OR (("inner".userid)::text = '7'::text)) AND ((("outer".permission = 1::smallint) AND ((("outer".userid)::text <> '7'::text) OR ("inner".userid IS NULL) OR (("inner".userid)::text <> '7'::text) OR ((("outer".userid)::text = '7'::text) AND ("outer"."granted" = '1'::bpchar)) OR ((("inner".userid)::text = '7'::text) AND ("outer"."granted" = '1'::bpchar)))) OR (("outer".permission > 1::smallint) AND ("outer"."granted" = '1'::bpchar))))
                                              ->  Bitmap Heap Scan on sq_ast_perm p  (cost=7.87..1521.54 rows=820 width=297) (actual time=0.019..0.023 rows=2 loops=586)
                                                    Recheck Cond: (($0)::text = (assetid)::text)
                                                    ->  Bitmap Index Scan on sq_ast_perm_assetid  (cost=0.00..7.87 rows=820 width=0) (actual time=0.014..0.014 rows=2 loops=586)
                                                          Index Cond: (($0)::text = (assetid)::text)
                                              ->  Hash  (cost=12.88..12.88 rows=229 width=164) (actual time=0.001..0.001 rows=0 loops=1)
                                                    ->  Seq Scan on sq_ast_role  (cost=0.00..12.88 rows=229 width=164) (actual time=0.001..0.001 rows=0 loops=1)
                                                          Filter: ((userid)::text <> '0'::text)
                    ->  Index Scan using sq_ast_typ_inhd_type_code on sq_ast_typ_inhd  (cost=0.00..21.62 rows=1 width=68) (actual time=0.010..0.010 rows=1 loops=563)
                          Index Cond: (("outer".type_code)::text = (sq_ast_typ_inhd.type_code)::text)
                          Filter: (((inhd_type_code)::text = 'page'::text) OR ((type_code)::text = 'file'::text) OR ((type_code)::text = 'page_rss_feed'::text))
              ->  Bitmap Heap Scan on sq_ast_lnk l  (cost=4.13..729.73 rows=324 width=23) (actual time=0.016..0.019 rows=2 loops=532)
                    Recheck Cond: ((l.minorid)::text = ("outer".assetid)::text)
                    ->  Bitmap Index Scan on sq_ast_lnk_minorid  (cost=0.00..4.13 rows=324 width=0) (actual time=0.012..0.012 rows=2 loops=532)
                          Index Cond: ((l.minorid)::text = ("outer".assetid)::text)
        ->  Bitmap Heap Scan on sq_ast_lnk_tree t  (cost=10.87..18.88 rows=2 width=4) (actual time=2.382..2.382 rows=1 loops=975)
              Recheck Cond: (t.linkid = "outer".linkid)
              Filter: (treeid ~~ '0005%'::bytea)
              ->  BitmapAnd  (cost=10.87..10.87 rows=2 width=0) (actual time=2.379..2.379 rows=0 loops=975)
                    ->  Bitmap Index Scan on sq_ast_lnk_tree_linkid  (cost=0.00..4.33 rows=381 width=0) (actual time=0.005..0.005 rows=1 loops=975)
                          Index Cond: (t.linkid = "outer".linkid)
                    ->  Bitmap Index Scan on sq_ast_lnk_tree_pkey  (cost=0.00..6.28 rows=381 width=0) (actual time=3.521..3.521 rows=16476 loops=657)
                          Index Cond: ((treeid >= '0005'::bytea) AND (treeid < '0006'::bytea))
Total runtime: 2636.294 ms


thanks


On Mon, Feb 1, 2010 at 11:54 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sun, Jan 31, 2010 at 5:50 PM, Dave Clements <dclements89@gmail.com> wrote:
>> Hello, I have this query in my system which takes around 2.5 seconds
>> to run. I have diagnosed that the problem is actually a hashjoin on
>> perm and s_ast_role tables. Is there a way I can avoid that join? I
>> just want to change the query, not the environment.
>
> What does
>
> explain analyze select ... (rest of your query)
>
> say?
>


Re: Fwd: Help required on query performance

From: Scott Marlowe
Date:
On Sun, Jan 31, 2010 at 6:02 PM, Dave Clements <dclements89@gmail.com> wrote:
> Hi, following is the output from explain analyze.

Without doing any heavy analysis, it looks like your row estimates are
way off.  Have you cranked up stats target and re-analyzed yet?


Re: Fwd: Help required on query performance

From: Dave Clements
Date:
I did the re-analyze several times, using the command:

ANALYZE tablename;

Is there any other command as well or another way to do that?




On Mon, Feb 1, 2010 at 12:04 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sun, Jan 31, 2010 at 6:02 PM, Dave Clements <dclements89@gmail.com> wrote:
>> Hi, following is the output from explain analyze.
>
> Without doing any heavy analysis, it looks like your row estimates are
> way off.  Have you cranked up stats target and re-analyzed yet?
>


Re: Fwd: Help required on query performance

From: Scott Marlowe
Date:
On Sun, Jan 31, 2010 at 6:09 PM, Dave Clements <dclements89@gmail.com> wrote:
> I did the re-analyze several times, using the command:
>
> ANALYZE tablename;
>
> Is there any other command as well or another way to do that?

It's important that the stats target get increased as well, it looks
like you're not getting enough buckets to get a good estimate of rows
to be returned for various conditions.

# show default_statistics_target ;
 default_statistics_target
---------------------------
 10

# alter database smarlowe set default_statistics_target=200;
ALTER DATABASE
# analyze;

Then run the explain analyze again and see if your row estimates are
closer and if the plan changes.
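
A narrower variant (just a sketch, assuming the bad estimates are confined to
a particular column such as sq_sch_idx.value from the query above) is to raise
the statistics target for that one column and re-analyze only that table:

  -- Assumption: the misestimated column is sq_sch_idx.value
  ALTER TABLE sq_sch_idx ALTER COLUMN value SET STATISTICS 200;
  -- Re-gather statistics for just that table
  ANALYZE sq_sch_idx;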


Re: Fwd: Help required on query performance

From: Dave Clements
Date:
After doing an analyze on the database, it improved a lot :)



On Mon, Feb 1, 2010 at 12:13 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sun, Jan 31, 2010 at 6:09 PM, Dave Clements <dclements89@gmail.com> wrote:
>> I did the re-analyze several times, using the command:
>>
>> ANALYZE tablename;
>>
>> Is there any other command as well or another way to do that?
>
> It's important that the stats target get increased as well, it looks
> like you're not getting enough buckets to get a good estimate of rows
> to be returned for various conditions.
>
> # show default_statistics_target ;
>  default_statistics_target
> ---------------------------
>  10
>
> # alter database smarlowe set default_statistics_target=200;
> ALTER DATABASE
> # analyze;
>
> Then run the explain analyze again and see if your row estimates are
> closer and if the plan changes.
>


Re: Fwd: Help required on query performance

From: Tom Lane
Date:
Dave Clements <dclements89@gmail.com> writes:
> Hello, I have this query in my system which takes around 2.5 seconds
> to run. I have diagnosed that the problem is actually a hashjoin on
> perm and s_ast_role tables. Is there a way I can avoid that join?

BTW, just for the record, that diagnosis was completely off.  The
upper level of your explain results is
 HashAggregate  (cost=38145.19..38145.20 rows=1 width=149) (actual time=2635.965..2636.086 rows=243 loops=1)
   ->  Nested Loop  (cost=15.00..38145.18 rows=1 width=149) (actual time=4.417..2635.086 rows=598 loops=1)
         ->  Nested Loop  (cost=4.13..37993.95 rows=8 width=153) (actual time=0.781..310.579 rows=975 loops=1)
             ...
         ->  Bitmap Heap Scan on sq_ast_lnk_tree t  (cost=10.87..18.88 rows=2 width=4) (actual time=2.382..2.382 rows=1 loops=975)
             ...

from which we can see that the main problem is doing the sq_ast_lnk_tree
scan over again 975 times, once per row coming out of the other side of
the join.  That accounted for 975*2.382 = 2322.450 msec, or the vast
majority of the runtime.  The planner wouldn't have picked this plan
except that it thought that only 8 rows would come out of the other side
of the join; repeating the scan 8 times seemed better than the
alternatives.  After you improved the statistics, it most likely
switched *to* a hash join (or possibly a merge join) for this step,
rather than switching away from one.
        regards, tom lane


Re: Fwd: Help required on query performance

From: Dave Clements
Date:
After the analyze the query time is down to 3.20 ms, but there is no
HashJoin there; all of them are still NestLoops. But that is fine.

Now the only problem is the sequential scan on the sq_sch_idx table.
I have a query like this:

explain analyze select count(*) from sq_sch_idx where value like '%download%';

This query does a sequential scan on the table. Is there a way I can
create an index for this?


thanks

On Mon, Feb 1, 2010 at 2:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dave Clements <dclements89@gmail.com> writes:
>> Hello, I have this query in my system which takes around 2.5 seconds
>> to run. I have diagnosed that the problem is actually a hashjoin on
>> perm and s_ast_role tables. Is there a way I can avoid that join?
>
> BTW, just for the record, that diagnosis was completely off.  The
> upper level of your explain results is
>
>  HashAggregate  (cost=38145.19..38145.20 rows=1 width=149) (actual time=2635.965..2636.086 rows=243 loops=1)
>   ->  Nested Loop  (cost=15.00..38145.18 rows=1 width=149) (actual time=4.417..2635.086 rows=598 loops=1)
>         ->  Nested Loop  (cost=4.13..37993.95 rows=8 width=153) (actual time=0.781..310.579 rows=975 loops=1)
>             ...
>         ->  Bitmap Heap Scan on sq_ast_lnk_tree t  (cost=10.87..18.88 rows=2 width=4) (actual time=2.382..2.382 rows=1 loops=975)
>             ...
>
> from which we can see that the main problem is doing the sq_ast_lnk_tree
> scan over again 975 times, once per row coming out of the other side of
> the join.  That accounted for 975*2.382 = 2322.450 msec, or the vast
> majority of the runtime.  The planner wouldn't have picked this plan
> except that it thought that only 8 rows would come out of the other side
> of the join; repeating the scan 8 times seemed better than the
> alternatives.  After you improved the statistics, it most likely
> switched *to* a hash join (or possibly a merge join) for this step,
> rather than switching away from one.
>
>                        regards, tom lane
>


Re: Fwd: Help required on query performance

From: Scott Marlowe
Date:
On Sun, Jan 31, 2010 at 9:25 PM, Dave Clements <dclements89@gmail.com> wrote:
> After the analyze the query time is down to 3.20 ms, but there is no
> HashJoin there; all of them are still NestLoops. But that is fine.
>
> Now the only problem is the sequential scan on the sq_sch_idx table.
> I have a query like this:
>
> explain analyze select count(*) from sq_sch_idx where value like '%download%';
>
> This query does a sequential scan on the table. Is there a way I can
> create an index for this?

If it's not left-anchored (value like 'download%') then not with
regular old methods.  If you need to do text searching you might need
to look into the full text search indexing in pgsql.
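
Just as a rough sketch of that direction (the index name and the 'english'
text search configuration are assumptions, not something from this thread),
an expression index over the value column would let a rewritten query avoid
the sequential scan:

  -- Assumed index name and text search configuration
  CREATE INDEX sq_sch_idx_value_fts ON sq_sch_idx
      USING gin (to_tsvector('english', value));

  -- The query has to use the same expression for the index to be usable
  SELECT count(*)
  FROM sq_sch_idx
  WHERE to_tsvector('english', value) @@ plainto_tsquery('english', 'download');

A plain LIKE '%download%' predicate still will not use this index; on later
PostgreSQL versions the pg_trgm contrib module can also index unanchored
LIKE patterns.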