Обсуждение: CPU usage goes to 100%, query seems to ran forever

Поиск
Список
Период
Сортировка

CPU usage goes to 100%, query seems to ran forever

От
"Andrus"
Дата:
I have small database running in  8.1.3 in W2K server.
The following query causes Postgres process to use 100% CPU and seems to run
forever.
If I change '1EEKPANT' to less frequently used item code, it runs fast.

How to speed it up ?

set search_path to public,firma2;
 select      rid.toode
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   LEFT JOIN artliik ON toode.grupp=artliik.grupp and
    toode.liik=artliik.liik
   WHERE (NOT '0'  or dok.kinnitatud)
  AND dok.kuupaev BETWEEN '2006-04-08' AND '2006-04-27'
  AND rid.toode='1EEKPANT'
 AND (NOT dok.eimuuda or '0' ) and
 dok.laonr='1'::float8  and
 POSITION( dok.doktyyp IN 'OSIDVGYKIF')!=0 AND
 ( ( ('1'  OR (POSITION(dok.doktyyp IN 'TUNH')=0 and
       (rid.kogus<0 or
    ('1'  and rid.kogus=0))))
     and
         POSITION(dok.doktyyp IN 'VGYKITDNHMEBARCFJ' )!=0
          AND CASE WHEN NOT dok.objrealt OR dok.doktyyp='I' THEN dok.yksus
ELSE rid.kuluobjekt END LIKE 'LADU%' ESCAPE '!'
      )
     OR
     (POSITION(dok.doktyyp IN 'OSIUDP' )!=0
          AND CASE WHEN dok.objrealt THEN rid.kuluobjekt ELSE dok.sihtyksus
END LIKE 'LADU%' ESCAPE '!'
      )
   )
  AND dok.kuupaev||dok.kellaaeg BETWEEN '2006-04-08' AND '2006-04-2723 59'
 AND ('0'  or ( length(trim(rid.toode))>2 AND
     rid.toode is NOT NULL))

 AND ( LENGTH('' )=0 OR rid.partii=''  OR (dok.doktyyp='I' AND
        rid.kulupartii=''  ) )
 AND (NOT dok.inventuur or rid.kogus!=0)
   AND dok.dokumnr!= 0
 AND ( artliik.arttyyp NOT IN ('Teenus', 'Komplekt' ) OR artliik.arttyyp IS
NULL)


explain returns:

"Nested Loop Left Join  (cost=0.00..1828.18 rows=1 width=24)"
"  Filter: ((("inner".arttyyp <> 'Teenus'::bpchar) AND ("inner".arttyyp <>
'Komplekt'::bpchar)) OR ("inner".arttyyp IS NULL))"
"  ->  Nested Loop  (cost=0.00..1822.51 rows=1 width=43)"
"        ->  Nested Loop  (cost=0.00..1816.56 rows=1 width=24)"
"              Join Filter: (("outer".dokumnr = "inner".dokumnr) AND
((("position"('VGYKITDNHMEBARCFJ'::text, ("outer".doktyyp)::text) <> 0) AND
(CASE WHEN ((NOT ("outer".objrealt)::boolean) OR ("outer".doktyyp =
'I'::bpchar)) THEN "outer".yksus ELSE "inner (..)"
"              ->  Seq Scan on dok  (cost=0.00..787.80 rows=1 width=39)"
"                    Filter: ((kuupaev >= '2006-04-08'::date) AND (kuupaev
<= '2006-04-27'::date) AND (NOT (eimuuda)::boolean) AND ((laonr)::double
precision = 1::double precision) AND ("position"('OSIDVGYKIF'::text,
(doktyyp)::text) <> 0) AND (((kuupaev):: (..)"
"              ->  Seq Scan on rid  (cost=0.00..1019.42 rows=249 width=51)"
"                    Filter: ((toode = '1EEKPANT'::bpchar) AND
(length(btrim((toode)::text)) > 2) AND (toode IS NOT NULL))"
"        ->  Index Scan using toode_pkey on toode  (cost=0.00..5.94 rows=1
width=43)"
"              Index Cond: ('1EEKPANT'::bpchar = toode)"
"  ->  Index Scan using artliik_pkey on artliik  (cost=0.00..5.65 rows=1
width=88)"
"        Index Cond: (("outer".grupp = artliik.grupp) AND ("outer".liik =
artliik.liik))"


Andrus.



Re: CPU usage goes to 100%, query seems to ran forever

От
Tom Lane
Дата:
"Andrus" <eetasoft@online.ee> writes:
> I have small database running in  8.1.3 in W2K server.
> The following query causes Postgres process to use 100% CPU and seems to run
> forever.
> If I change '1EEKPANT' to less frequently used item code, it runs fast.

You have ANALYZEd all these tables recently, I hope?  The planner
certainly doesn't think this query will take very long.

To find out what's wrong, you're going to have to be patient enough to
let an EXPLAIN ANALYZE run to completion.  Plain EXPLAIN won't tell.

            regards, tom lane

Re: CPU usage goes to 100%, query seems to ran forever

От
"Andrus"
Дата:
> You have ANALYZEd all these tables recently, I hope?  The planner
> certainly doesn't think this query will take very long.

I have autovacuum running so I expect it takes care of ANALYZE, isn't it ?

I ran also analyze command before running explain analyze.

> To find out what's wrong, you're going to have to be patient enough to
> let an EXPLAIN ANALYZE run to completion.  Plain EXPLAIN won't tell.

Here it is running in my local computer. I'm expecting run time no more 1
second

"Nested Loop Left Join  (cost=0.00..1829.95 rows=1 width=24) (actual
time=492064.990..492064.990 rows=0 loops=1)"
"  Filter: ((("inner".arttyyp <> 'Teenus'::bpchar) AND ("inner".arttyyp <>
'Komplekt'::bpchar)) OR ("inner".arttyyp IS NULL))"
"  ->  Nested Loop  (cost=0.00..1825.01 rows=1 width=43) (actual
time=492064.983..492064.983 rows=0 loops=1)"
"        ->  Nested Loop  (cost=0.00..1819.04 rows=1 width=24) (actual
time=492064.978..492064.978 rows=0 loops=1)"
"              Join Filter: (("outer".dokumnr = "inner".dokumnr) AND
((("position"('VGYKITDNHMEBARCFJ'::text, ("outer".doktyyp)::text) <> 0) AND
(CASE WHEN ((NOT ("outer".objrealt)::boolean) OR ("outer".doktyyp =
'I'::bpchar)) THEN "outer".yksus ELSE "inner (..)"
"              ->  Seq Scan on dok  (cost=0.00..787.80 rows=1 width=39)
(actual time=0.152..878.198 rows=7670 loops=1)"
"                    Filter: ((kuupaev >= '2006-04-08'::date) AND (kuupaev
<= '2006-04-27'::date) AND (NOT (eimuuda)::boolean) AND ((laonr)::double
precision = 1::double precision) AND ("position"('OSIDVGYKIF'::text,
(doktyyp)::text) <> 0) AND (((kuupaev):: (..)"
"              ->  Seq Scan on rid  (cost=0.00..1019.42 rows=315 width=51)
(actual time=22.003..62.216 rows=839 loops=7670)"
"                    Filter: ((toode = '1EEKPANT'::bpchar) AND
(length(btrim((toode)::text)) > 2) AND (toode IS NOT NULL))"
"        ->  Index Scan using toode_pkey on toode  (cost=0.00..5.96 rows=1
width=43) (never executed)"
"              Index Cond: ('1EEKPANT'::bpchar = toode)"
"  ->  Index Scan using artliik_pkey on artliik  (cost=0.00..4.92 rows=1
width=31) (never executed)"
"        Index Cond: (("outer".grupp = artliik.grupp) AND ("outer".liik =
artliik.liik))"
"Total runtime: 492065.840 ms"


Andrus.



Re: CPU usage goes to 100%, query seems to ran forever

От
Tom Lane
Дата:
"Andrus" <eetasoft@online.ee> writes:
> Here it is running in my local computer. I'm expecting run time no more 1
> second

Something seems to have truncated your EXPLAIN output, but anyway we
can see where the problem is:

> "              ->  Seq Scan on dok  (cost=0.00..787.80 rows=1 width=39)
> (actual time=0.152..878.198 rows=7670 loops=1)"
> "                    Filter: ((kuupaev >= '2006-04-08'::date) AND (kuupaev
> <= '2006-04-27'::date) AND (NOT (eimuuda)::boolean) AND ((laonr)::double
> precision = 1::double precision) AND ("position"('OSIDVGYKIF'::text,
> (doktyyp)::text) <> 0) AND (((kuupaev):: (..)"

The planner is expecting to get one row from "dok" passing the filter
condition, and hence chooses a plan that is suitable for a small number
of rows ... but in reality there are 7670 rows matching the filter
condition, and that's what blows the runtime out of the water.  (Most of
the runtime is actually going into 7670 repeated scans of "rid", which
wouldn't have happened with another plan type.)

So you need to see about getting that estimate to be more accurate.
First thing is to make sure that "dok" has been ANALYZEd --- just do it
by hand.  If that doesn't change the EXPLAIN plan then you need to work
harder.  I can see at least three things you are doing that are
unnecessarily destroying the planner's ability to estimate the number of
matching rows:

 dok.laonr='1'::float8  and

Since laonr apparently isn't float8, this forces a runtime type
conversion as well as interfering with statistics use.  (The planner
will have ANALYZE stats about dok.laonr, but the connection to
dok.laonr::float8 escapes it.)  Just write the constant with quotes
and no type coercion.

 POSITION( dok.doktyyp IN 'OSIDVGYKIF')!=0 AND

This is completely unestimatable given the available statistics, and it
doesn't look to me like it is all that great a semantic representation
either.  Perhaps the query that's really meant here is "dok.doktypp IN
('O','S','I', ...)"?  If so, you should say what you mean, not play
games with converting the query into some strange string operation.

  AND dok.kuupaev||dok.kellaaeg BETWEEN '2006-04-08' AND '2006-04-2723 59'

This is another case where the planner is not going to have any ability
to make a useful estimate, and it's because you are using a crummy
representation of your data.  You should merge those two columns into
one timestamp column and just do a simple BETWEEN test.


By and large, unnatural representations of data that you use in WHERE
clauses are going to cost you big-time in SQL queries.  It's worth
taking time up front to design a clean table schema, and taking time
to revise it when requirements change.

            regards, tom lane

Re: CPU usage goes to 100%, query seems to ran forever

От
"Andrus"
Дата:
> Something seems to have truncated your EXPLAIN output, but anyway we
> can see where the problem is:

I copied it from pgAdmin in 640x480  screen resolution in XP
Maybe pgAdmin bug ?

> The planner is expecting to get one row from "dok" passing the filter
> condition, and hence chooses a plan that is suitable for a small number
> of rows ... but in reality there are 7670 rows matching the filter
> condition, and that's what blows the runtime out of the water.  (Most of
> the runtime is actually going into 7670 repeated scans of "rid", which
> wouldn't have happened with another plan type.)

I added index

CREATE INDEX rid_toode_idx  ON firma2.rid  USING btree  (toode);

and query start working fast !

> So you need to see about getting that estimate to be more accurate.
> First thing is to make sure that "dok" has been ANALYZEd --- just do it
> by hand.

As I wrote I have autovacuum running. Is'nt this sufficient ?

>  I can see at least three things you are doing that are
> unnecessarily destroying the planner's ability to estimate the number of
> matching rows:
>
> dok.laonr='1'::float8  and
> Since laonr apparently isn't float8, this forces a runtime type
> conversion as well as interfering with statistics use.  (The planner
> will have ANALYZE stats about dok.laonr, but the connection to
> dok.laonr::float8 escapes it.)  Just write the constant with quotes
> and no type coercion.

I re-wrote it as

dok.laonr=1

this query is automatically generated by VFP and ODBC parameter substitution
which adds those type conversions.
VFP has only float8 type and it probably forces ODBC driver convert numbers
to float8

> POSITION( dok.doktyyp IN 'OSIDVGYKIF')!=0 AND
>
> This is completely unestimatable given the available statistics, and it
> doesn't look to me like it is all that great a semantic representation
> either.  Perhaps the query that's really meant here is "dok.doktypp IN
> ('O','S','I', ...)"?  If so, you should say what you mean, not play
> games with converting the query into some strange string operation.

'OSID ...' is a string parameter substituted to SELECT template.
changing this to IN ( 'O', 'S', .. requires re-writing parts of code and I'm
not sure it makes code faster.

>  AND dok.kuupaev||dok.kellaaeg BETWEEN '2006-04-08' AND '2006-04-2723 59'

> This is another case where the planner is not going to have any ability
> to make a useful estimate, and it's because you are using a crummy
> representation of your data.  You should merge those two columns into
> one timestamp column and just do a simple BETWEEN test.
> By and large, unnatural representations of data that you use in WHERE
> clauses are going to cost you big-time in SQL queries.  It's worth
> taking time up front to design a clean table schema, and taking time
> to revise it when requirements change.

date range test in other part of where clause

dok.kuupaev   BETWEEN ....

 is optimizable.

AND dok.kuupaev||dok.kellaaeg adds time range test to date range.
There are less that some thousands documents per day.

Wasting time to re-engineer database and deployed application seems not
reasonable in this case.

Andrus.