Обсуждение: Help optimize view

Поиск
Список
Период
Сортировка

Help optimize view

От
"Relyea, Mike"
Дата:
I'm have the following view as part of a larger, aggregate query that is
running slower than I'd like.  There are 4 views total, each very
similar to this one.  Each of the views is then left joined with data
from some other tables to give me the final result that I'm looking for.

I'm hoping that if I can get some insight in to how to make this view
execute faster, I can apply that learning to the other 3 views and
thereby decrease the run time for my aggregate query.

I'm running 8.2.4 on Windows XP with a single 10K rpm disk dedicated to
the data directory.
shared_buffers = 12288
work_mem = 262144
maintenance_work_mem = 131072
max_fsm_pages = 204800
random_page_cost = 2.0
effective_cache_size = 10000
autovacuum = on


SELECT "PrintSamples"."MachineID", "PrintSamples"."PrintCopyID",
"tblColors"."ColorID", avg("ParameterValues"."ParameterValue") AS
"Mottle_NMF"

FROM "PrintSamples", "DigitalImages", "PrintSampleAnalyses",
"Measurements", "ParameterValues", "tblTPNamesAndColors", "tblColors",
"AnalysisModules", "ParameterNames"

WHERE "DigitalImages"."ImageID" = "PrintSampleAnalyses"."ImageID"
AND "PrintSamples"."PrintSampleID" = "DigitalImages"."PrintSampleID"
AND "PrintSampleAnalyses"."psaID" = "Measurements"."psaID"
AND "Measurements"."MeasurementID" = "ParameterValues"."MeasurementID"
AND "AnalysisModules"."MetricID" = "Measurements"."MetricID"
AND "ParameterNames"."ParameterID" = "ParameterValues"."ParameterID"
AND "tblTPNamesAndColors"."TestPatternName" =
"PrintSamples"."TestPatternName"
AND "tblColors"."ColorID" = "tblTPNamesAndColors"."ColorID"

GROUP BY "PrintSamples"."MachineID", "PrintSamples"."PrintCopyID",
"tblColors"."ColorID", "AnalysisModules"."AnalysisModuleName",
"ParameterNames"."ParameterName", "PrintSamples"."TestPatternName"

HAVING "PrintSamples"."MachineID" = 4741 OR "PrintSamples"."MachineID" =
4745 AND "AnalysisModules"."AnalysisModuleName" = 'NMF' AND
"ParameterNames"."ParameterName" = 'NMF' AND "tblColors"."ColorID" <> 3
AND "PrintSamples"."TestPatternName" LIKE 'IQAF-TP8%';


EXPLAIN ANALYZE
HashAggregate (cost=6069.71..6069.82 rows=9 width=70) (actual
time=3230.868..3230.923 rows=31 loops=1)
 -> Nested Loop (cost=1.77..6069.55 rows=9 width=70) (actual
time=367.959..3230.476 rows=31 loops=1)
    Join Filter: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
    -> Seq Scan on "ParameterNames" (cost=0.00..1.94 rows=1 width=17)
(actual time=0.020..0.032 rows=1 loops=1)
       Filter: (("ParameterName")::text = 'NMF'::text)
    -> Nested Loop (cost=1.77..6059.09 rows=682 width=61) (actual
time=367.905..3230.154 rows=124 loops=1)
       -> Hash Join (cost=1.77..2889.96 rows=151 width=57) (actual
time=119.748..1447.130 rows=31 loops=1)
          Hash Cond: ("Measurements"."MetricID" =
"AnalysisModules"."MetricID")
          -> Nested Loop (cost=0.00..2880.22 rows=1722 width=48) (actual
time=55.278..1444.801 rows=1656 loops=1)
             -> Nested Loop (cost=0.00..226.25 rows=18 width=44) (actual
time=10.080..13.951 rows=31 loops=1)
                -> Nested Loop (cost=0.00..151.33 rows=18 width=44)
(actual time=5.030..8.266 rows=31 loops=1)
                   -> Nested Loop (cost=0.00..74.21 rows=18 width=44)
(actual time=2.253..4.822 rows=31 loops=1)
                      Join Filter: ("tblColors"."ColorID" =
"tblTPNamesAndColors"."ColorID")
                      -> Nested Loop (cost=0.00..48.11 rows=24 width=44)
(actual time=2.232..3.619 rows=43 loops=1)
                         -> Index Scan using "PSMachineID_idx" on
"PrintSamples" (cost=0.00..7.99 rows=29 width=40)
(actual time=2.204..2.515 rows=43 loops=1)
                            Index Cond: ("MachineID" = 4741)
                            Filter: (("TestPatternName")::text ~~
'IQAF-TP8%'::text)
                         -> Index Scan using "TPNTestPatternName" on
"tblTPNamesAndColors"

(cost=0.00..1.37 rows=1 width=30)

(actual time=0.011..0.015 rows=1 loops=43)
                            Index Cond:
(("tblTPNamesAndColors"."TestPatternName")::text =
("PrintSamples"."TestPatternName")::text)
                      -> Seq Scan on "tblColors" (cost=0.00..1.05 rows=3
width=4)
                                (actual
time=0.004..0.010 rows=3 loops=43)
                         Filter: ("ColorID" <> 3)
                   -> Index Scan using "DIPrintSampleID_idx" on
"DigitalImages"

(cost=0.00..4.27 rows=1 width=8)

(actual time=0.100..0.102 rows=1 loops=31)
                      Index Cond: ("PrintSamples"."PrintSampleID" =
"DigitalImages"."PrintSampleID")
                -> Index Scan using "PSAImageID_idx" on
"PrintSampleAnalyses" (cost=0.00..4.15 rows=1 width=8)

(actual time=0.171..0.174 rows=1 loops=31)
                   Index Cond: ("DigitalImages"."ImageID" =
"PrintSampleAnalyses"."ImageID")
             -> Index Scan using "MpsaID_idx" on "Measurements"
(cost=0.00..120.33 rows=2169 width=12)

(actual time=19.381..46.016 rows=53 loops=31)
                Index Cond: ("PrintSampleAnalyses"."psaID" =
"Measurements"."psaID")
          -> Hash (cost=1.71..1.71 rows=5 width=17) (actual
time=0.073..0.073 rows=5 loops=1)
             -> Seq Scan on "AnalysisModules" (cost=0.00..1.71 rows=5
width=17) (actual time=0.013..0.030 rows=5 loops=1)
                Filter: (("AnalysisModuleName")::text = 'NMF'::text)
       -> Index Scan using "PVMeasurementID_idx" on "ParameterValues"
(cost=0.00..16.56 rows=354 width=12)

(actual time=56.359..57.495 rows=4 loops=31)
          Index Cond: ("Measurements"."MeasurementID" =
"ParameterValues"."MeasurementID")
Total runtime: 3231.331 ms

Re: Help optimize view

От
"Kevin Grittner"
Дата:
>>> On Fri, Aug 10, 2007 at 11:57 AM, in message
<1806D1F73FCB7F439F2C842EE0627B18065BEC18@usa0300ms01.na.xerox.net>, "Relyea,
Mike" <Mike.Relyea@xerox.com> wrote:
> I'm have the following view as part of a larger, aggregate query that is
> running slower than I'd like.
> . . .
> HAVING "PrintSamples"."MachineID" = 4741 OR "PrintSamples"."MachineID" =
> 4745 AND "AnalysisModules"."AnalysisModuleName" = 'NMF' AND
> "ParameterNames"."ParameterName" = 'NMF' AND "tblColors"."ColorID" <> 3
> AND "PrintSamples"."TestPatternName" LIKE 'IQAF-TP8%';

First off, let's make sure we're optimizing the query you really want to run.
AND binds tighter than OR, so as you have it written, it is the same as:

  HAVING "PrintSamples"."MachineID" = 4741
      OR (   "PrintSamples"."MachineID" = 4745
         AND "AnalysisModules"."AnalysisModuleName" = 'NMF'
         AND "ParameterNames"."ParameterName" = 'NMF'
         AND "tblColors"."ColorID" <> 3
         AND "PrintSamples"."TestPatternName" LIKE 'IQAF-TP8%';
         )

I fear you may really want it evaluate to:

  HAVING ("PrintSamples"."MachineID" = 4741 OR "PrintSamples"."MachineID" = 4745)
     AND "AnalysisModules"."AnalysisModuleName" = 'NMF'
     AND "ParameterNames"."ParameterName" = 'NMF'
     AND "tblColors"."ColorID" <> 3
     AND "PrintSamples"."TestPatternName" LIKE 'IQAF-TP8%';

-Kevin




Re: Help optimize view

От
"Kevin Grittner"
Дата:
>>> On Fri, Aug 10, 2007 at 11:57 AM, in message
<1806D1F73FCB7F439F2C842EE0627B18065BEC18@usa0300ms01.na.xerox.net>, "Relyea,
Mike" <Mike.Relyea@xerox.com> wrote:
> HAVING "PrintSamples"."MachineID" = 4741 OR "PrintSamples"."MachineID" =
> 4745 AND . . .

On top of the issue in my prior email, I don't see any test for 4745 in the
EXPLAIN ANALYZE output, which makes me think it doesn't go with the posted
query.

-Kevin




Re: Help optimize view

От
"Relyea, Mike"
Дата:
> From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
>
> First off, let's make sure we're optimizing the query you
> really want to run.
> AND binds tighter than OR, so as you have it written, it is
> the same as:
>
>   HAVING "PrintSamples"."MachineID" = 4741
>       OR (   "PrintSamples"."MachineID" = 4745
>          AND "AnalysisModules"."AnalysisModuleName" = 'NMF'
>          AND "ParameterNames"."ParameterName" = 'NMF'
>          AND "tblColors"."ColorID" <> 3
>          AND "PrintSamples"."TestPatternName" LIKE 'IQAF-TP8%';
>          )
>
> I fear you may really want it evaluate to:
>
>   HAVING ("PrintSamples"."MachineID" = 4741 OR
> "PrintSamples"."MachineID" = 4745)
>      AND "AnalysisModules"."AnalysisModuleName" = 'NMF'
>      AND "ParameterNames"."ParameterName" = 'NMF'
>      AND "tblColors"."ColorID" <> 3
>      AND "PrintSamples"."TestPatternName" LIKE 'IQAF-TP8%';

The query I really want to run is several times larger than this.  I
didn't think people would want to wade through pages and pages worth of
SQL and then explain analyze results - especially when I'm fairly
certain that optimizing this smaller part of the overall aggregate query
would provide me the help I was looking for.

You're right about what I really want the query to evaluate to.  I'll
give your suggestion a try.  Thanks.

Mike