Обсуждение: Slow query
Please help me speed up the following query. It used to run in 2-5 sec., but now it takes 2-3 mins!
I ran VACUUM FULL ANALYZE and REINDEX.
SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio')
AND EXISTS
(SELECT * FROM
(SELECT objectid AS mediaid
FROM media
WHERE activity='347667'
UNION
SELECT ism.media AS mediaid
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667' ) AS a1
WHERE a1.mediaid = m.objectid
LIMIT 1)
ORDER BY medianame ASC, status DESC
Basically it tries to find all Audios that are either explicitly attached to the given activity, or attached to the given activity via a many-to-many relationship intsetmedia which links records in table Interaction, Set, and Media.
I attached the output of EXPLAIN and schemas and indexes on the tables involved. Most of the fields are not relevant to the query, but I listed them anyways. I discarded trigger information, though.
Thanks for your help.
Oleg
*************************************
This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.
*************************************
Вложения
Oleg, > Please help me speed up the following query. It used to run in 2-5 sec., > but now it takes 2-3 mins! > I ran VACUUM FULL ANALYZE and REINDEX. > SELECT * FROM media m > WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE This is a repost, isn't it? -- Josh Berkus Aglio Database Solutions San Francisco
No, I don't believe so. My previous question regarding performance was solved by VACUUM FULL and REINDEX. The current one, I believe, is more related to query structure and planner stats. -----Original Message----- From: Josh Berkus [mailto:josh@agliodbs.com] Sent: Monday, March 24, 2003 11:55 AM To: Oleg Lebedev; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow query Oleg, > Please help me speed up the following query. It used to run in 2-5 > sec., but now it takes 2-3 mins! I ran VACUUM FULL ANALYZE and > REINDEX. SELECT * FROM media m > WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE This is a repost, isn't it? -- Josh Berkus Aglio Database Solutions San Francisco ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************
On Mon, 24 Mar 2003, Oleg Lebedev wrote: > Please help me speed up the following query. It used to run in 2-5 sec., > but now it takes 2-3 mins! EXPLAIN ANALYZE output would be useful to see where the time is actually taking place (rather than an estimate thereof).
EXPLAIN ANALYZE plan is shown below.
I also attached it as a file.
One thing that might help is that the query produces 27 rows, which is
much less than predicted 1963.
QUERY PLAN
Sort (cost=553657.66..553662.57 rows=1963 width=218) (actual
time=133036.73..133036.75 rows=27 loops=1)
Sort Key: medianame, status
InitPlan
-> Seq Scan on mediatype (cost=0.00..1.29 rows=1 width=8) (actual
time=0.12..0.14 rows=1 loops=1)
Filter: (medianame = 'Audio'::character varying)
-> Index Scan using media_mtype_index on media m (cost=0.00..553550.28
rows=1963 width=218) (actual time=5153.36..133036.00 rows=27 loops=1)
Index Cond: (mediatype = $0)
Filter: (subplan)
SubPlan
-> Limit (cost=138.92..138.93 rows=1 width=24) (actual time=2.92..2.92
rows=0 loops=44876)
-> Subquery Scan a1 (cost=138.92..138.93 rows=1 width=24) (actual
time=2.92..2.92 rows=0 loops=44876)
-> Unique (cost=138.92..138.93 rows=1 width=24) (actual
time=2.91..2.91 rows=0 loops=44876)
-> Sort (cost=138.92..138.93 rows=2 width=24) (actual time=2.91..2.91
rows=0 loops=44876)
Sort Key: mediaid
-> Append (cost=0.00..138.91 rows=2 width=24) (actual time=2.80..2.81
rows=0 loops=44876)
-> Subquery Scan "*SELECT* 1" (cost=0.00..5.11 rows=1 width=8) (actual
time=0.06..0.06 rows=0 loops=44876)
-> Index Scan using media_pkey on media (cost=0.00..5.11 rows=1
width=8) (actual time=0.05..0.05 rows=0 loops=44876)
Index Cond: (objectid = $1)
Filter: (activity = 347667::bigint)
-> Subquery Scan "*SELECT* 2" (cost=24.25..133.80 rows=1 width=24)
(actual time=2.73..2.73 rows=0 loops=44876)
-> Hash Join (cost=24.25..133.80 rows=1 width=24) (actual
time=2.72..2.72 rows=0 loops=44876)
Hash Cond: ("outer"."set" = "inner".objectid)
-> Index Scan using intsetmedia_media_index on intsetmedia ism
(cost=0.00..109.26 rows=38 width=16) (actual time=0.04..0.04 rows=1
loops=44876)
Index Cond: (media = $1)
-> Hash (cost=24.24..24.24 rows=6 width=8) (actual time=0.14..0.14
rows=0 loops=44876)
-> Index Scan using set_act_index on "set" s (cost=0.00..24.24 rows=6
width=8) (actual time=0.11..0.13 rows=2 loops=44876)
Index Cond: (activity = 347667::bigint)
Total runtime: 133037.49 msec
-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone23.bigpanda.com]
Sent: Monday, March 24, 2003 12:04 PM
To: Oleg Lebedev
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query
On Mon, 24 Mar 2003, Oleg Lebedev wrote:
> Please help me speed up the following query. It used to run in 2-5
> sec., but now it takes 2-3 mins!
EXPLAIN ANALYZE output would be useful to see where the time is actually
taking place (rather than an estimate thereof).
*************************************
This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.
*************************************
Вложения
Stephan, Hmmm ... I'm a bit confused by the new EXPLAIN output. Stefan, does Oleg's output show the time for *one* subplan execution, executed for 44,000 loops, or does it show the total time? The former would make more sense given his query, but I'm just not sure .... -- -Josh Berkus Aglio Database Solutions San Francisco
I decided that it might help to list the cardinalities of the pertinent tables: Intsetmedia: 90,000 rows Interaction: 26,000 rows Set: 7,000 rows Media: 80,000 rows -----Original Message----- From: Josh Berkus [mailto:josh@agliodbs.com] Sent: Monday, March 24, 2003 12:47 PM To: Stephan Szabo; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow query Importance: Low Stephan, Hmmm ... I'm a bit confused by the new EXPLAIN output. Stefan, does Oleg's output show the time for *one* subplan execution, executed for 44,000 loops, or does it show the total time? The former would make more sense given his query, but I'm just not sure .... -- -Josh Berkus Aglio Database Solutions San Francisco ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************
Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> SELECT * FROM media m
> WHERE m.mediatype =3D (SELECT objectid FROM mediatype WHERE
> medianame=3D'Audio')=20
> AND EXISTS=20
> (SELECT * FROM=20
> (SELECT objectid AS mediaid=20
> FROM media=20
> WHERE activity=3D'347667'=20
> UNION=20
> SELECT ism.media AS mediaid=20
> FROM intsetmedia ism, set s=20
> WHERE ism.set =3D s.objectid=20
> AND s.activity=3D'347667' ) AS a1=20
> WHERE a1.mediaid =3D m.objectid=20
> LIMIT 1)=20
> ORDER BY medianame ASC, status DESC=20
Well, one observation is that the LIMIT clause is useless and probably
counterproductive; EXISTS takes only one row from the subselect anyway.
Another is that the UNION is doing it the hard way; UNION implies doing
a duplicate-elimination step, which you don't need here. UNION ALL
would be a little quicker. But what I would do is split it into two
EXISTS:
SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE
medianame='Audio')
AND ( EXISTS(SELECT 1
FROM media
WHERE activity='347667'
AND objectid = m.objectid)
OR EXISTS(SELECT 1
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667'
AND ism.media = m.objectid))
ORDER BY medianame ASC, status DESC
regards, tom lane
I just ran the query you sent me and attached the output of EXPLAIN
ANALYZE as TOMs_plan.txt
It did not speed up the query significantly.
It always seemed to me that UNION is faster than OR, so I tried your
suggestion to use UNION ALL with the original query without
counter-productive LIMIT 1 in EXISTS clause. This reduced the cost of
the plan by 50%, but slowed down the query. Weird ... The plan is shown
in UNION_ALL_plan.txt
AFAIK, the only change I've done since the time when the query took 3
sec. to run was adding more indexes and increasing the size of data by
about 25%. It sounds kind of stupid, but I remember that adding indexes
sometimes slowed down my queries. I will try to drop all the indexes and
add them back again one by one.
Any other ideas?
Thanks.
Oleg
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 24, 2003 1:48 PM
To: Oleg Lebedev
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query
Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> SELECT * FROM media m
> WHERE m.mediatype =3D (SELECT objectid FROM mediatype WHERE
> medianame=3D'Audio')=20 AND EXISTS=20
> (SELECT * FROM=20
> (SELECT objectid AS mediaid=20
> FROM media=20
> WHERE activity=3D'347667'=20
> UNION=20
> SELECT ism.media AS mediaid=20
> FROM intsetmedia ism, set s=20
> WHERE ism.set =3D s.objectid=20
> AND s.activity=3D'347667' ) AS a1=20
> WHERE a1.mediaid =3D m.objectid=20
> LIMIT 1)=20
> ORDER BY medianame ASC, status DESC=20
Well, one observation is that the LIMIT clause is useless and probably
counterproductive; EXISTS takes only one row from the subselect anyway.
Another is that the UNION is doing it the hard way; UNION implies doing
a duplicate-elimination step, which you don't need here. UNION ALL
would be a little quicker. But what I would do is split it into two
EXISTS:
SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE
medianame='Audio')
AND ( EXISTS(SELECT 1
FROM media
WHERE activity='347667'
AND objectid = m.objectid)
OR EXISTS(SELECT 1
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667'
AND ism.media = m.objectid))
ORDER BY medianame ASC, status DESC
regards, tom lane
*************************************
This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.
*************************************
Вложения
Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> I just ran the query you sent me and attached the output of EXPLAIN
> ANALYZE as TOMs_plan.txt
> It did not speed up the query significantly.
Nope. I was hoping to see a faster-start plan, but given the number of
rows involved I guess it won't change its mind. You're going to have to
think about a more intelligent approach, rather than minor tweaks.
One question: since objectid is evidently a primary key, why are you
doing a subselect for the first part? Wouldn't it give the same result
just to say "m.activity = '347667'" in the top-level WHERE?
As for the second part, I think you'll have to try to rewrite it as a
join with the media table.
regards, tom lane
You are right. I rewrote the query using JOINs and it increased
performance from 123 sec. to 20msec. I betcha I screwed smth up, but I
list the rewritten query below anyways. I also attached the new plan.
Thank you.
SELECT * FROM media m
JOIN
((SELECT objectid AS mediaid
FROM media
WHERE activity='347667')
UNION
(SELECT ism.media AS mediaid
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667' )) a1
ON
m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio')
AND m.objectid=mediaid
ORDER BY medianame ASC, status DESC
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 24, 2003 3:09 PM
To: Oleg Lebedev
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query
Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> I just ran the query you sent me and attached the output of EXPLAIN
> ANALYZE as TOMs_plan.txt It did not speed up the query significantly.
Nope. I was hoping to see a faster-start plan, but given the number of
rows involved I guess it won't change its mind. You're going to have to
think about a more intelligent approach, rather than minor tweaks.
One question: since objectid is evidently a primary key, why are you
doing a subselect for the first part? Wouldn't it give the same result
just to say "m.activity = '347667'" in the top-level WHERE?
As for the second part, I think you'll have to try to rewrite it as a
join with the media table.
regards, tom lane
*************************************
This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.
*************************************
Вложения
Oleg,
My guess is that the query runs slow because by adding
data you exceeded what your database can do in memory
and you need to do some kind of disk sort.
How about rewriting your query without the UNION and
the EXISTS to something like
SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype
WHERE medianame='Audio')
AND ( m.activity='347667'
OR m.objectid IN (
SELECT s.objectid
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667'))
ORDER BY medianame ASC, status DESC
Regards,
Nikolaus Dilger
On Mon, 24 Mar 2003, Oleg Lebedev wrote:
Message
Please help me speed
up the following query. It used to run in 2-5 sec., but
now it takes 2-3
mins!
I ran VACUUM FULL
ANALYZE and REINDEX.
SELECT * FROM media
m
WHERE m.mediatype =
(SELECT objectid FROM mediatype WHERE
medianame='Audio')
AND EXISTS
(SELECT * FROM
(SELECT objectid AS mediaid
FROM media
WHERE activity='347667'
UNION
SELECT ism.media AS mediaid
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667' ) AS a1
WHERE a1.mediaid = m.objectid
LIMIT 1)
ORDER BY medianame ASC, status DESC
Basically it tries
to find all Audios that are either explicitly attached
to the given activity, or
attached to the given activity via a many-to-many
relationship intsetmedia which
links records in table Interaction, Set, and Media.
I attached the
output of EXPLAIN and schemas and indexes on the tables
involved. Most of the
fields are not relevant to the query, but I listed them
anyways. I discarded
trigger information, though.
Thanks for your
help.
Oleg
*************************************
This email may contain privileged or confidential
material intended for the named recipient only.
If you are not the named recipient, delete this message
and all attachments.
Any review, copying, printing, disclosure or other use
is prohibited.
We reserve the right to monitor email sent through our
network.
*************************************