Обсуждение: Speeding up subselect ?
Hi there,
i have the following query in a pl/pgsql procedure:
update lr_object_usage
set status = (case status
when ''OPEN_SUCC'' then ''CLOSED_SUCC''
when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
where lr_object_usage_id in (select lr_object_usage_id from
lr_locked_objects where context = p_exam_usage_id);
the problem is the result of the subselect is obviously calculated for
every row of lr_object_usage (30K rows) -> the update takes very long.
Here is the explain output with p_exam_usage_id being 38191. Two simple
selects with the result of the first one being used in the second one
vs. the subselect. Divided into two selects
Can I somehow tell the planer not to requery the subselect for every row
?
Many TIA,
peter
--
------------------------------------------------------------------
openacs=# select lr_object_usage_id from lr_locked_objects where context
= 38191;
lr_object_usage_id
--------------------
38192
38193
38194
38195
(4 rows)
openacs=# explain analyze select lr_object_usage_id from
lr_locked_objects where context = 38191;
NOTICE: QUERY PLAN:
Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4) (actual
time=0.06..0.09 rows=4 loops=1)
Total runtime: 0.18 msec
EXPLAIN
openacs=# select * from lr_object_usage where lr_object_usage_id in
(38192,38193,38194,38195);
lr_object_usage_id | lr_object_id | access_time |
user_id | status | context
--------------------+--------------+-------------------------------+----
-----+--------+---------
38192 | 13496 | 2002-07-08 17:47:01.001332+02 |
2434 | | 38191
38193 | 13503 | 2002-07-08 17:47:01.001332+02 |
2434 | | 38191
38194 | 13434 | 2002-07-08 17:47:01.001332+02 |
2434 | | 38191
(3 rows)
openacs=# explain analyze select * from lr_object_usage where
lr_object_usage_id in (38192,38193,38194,38195);
NOTICE: QUERY PLAN:
Index Scan using lr_object_usage_lr_object_usage,
lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage,
lr_object_usage_lr_object_usage on lr_object_usage (cost=0.00..12.18
rows=4 width=39) (actual time=0.06..0.20 rows=3 loops=1)
Total runtime: 0.38 msec
EXPLAIN
Doing the same thing with the subselect:
openacs=# explain analyze select * from lr_object_usage where
lr_object_usage_id in (select lr_object_usage_id from lr_locked_objects
where context = 38191);
NOTICE: QUERY PLAN:
Seq Scan on lr_object_usage (cost=0.00..17860.59 rows=16514 width=39)
(actual time=2640.91..2646.47 rows=3 loops=1)
SubPlan
-> Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4)
(actual time=0.03..0.06 rows=4 loops=33080)
Total runtime: 2646.60 msec
EXPLAIN
Peter Alberer wrote:
> Hi there,
>
> i have the following query in a pl/pgsql procedure:
>
> update lr_object_usage
> set status = (case status
> when ''OPEN_SUCC'' then ''CLOSED_SUCC''
> when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
> where lr_object_usage_id in (select lr_object_usage_id from
> lr_locked_objects where context = p_exam_usage_id);
>
> the problem is the result of the subselect is obviously calculated for
> every row of lr_object_usage (30K rows) -> the update takes very long.
> Here is the explain output with p_exam_usage_id being 38191. Two simple
> selects with the result of the first one being used in the second one
> vs. the subselect. Divided into two selects
>
> Can I somehow tell the planer not to requery the subselect for every row
Try to recast the subselect as a FROM clause subselect. E.g. will this work?
update lr_object_usage
set status = (case status
when ''OPEN_SUCC'' then ''CLOSED_SUCC''
when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
from (select lr_object_usage_id
from lr_locked_objects
where context = p_exam_usage_id) as t1
where lr_object_usage.lr_object_usage_id = t1.lr_object_usage_id;
HTH,
Joe
Is the situation possible where by you can run the subselect first at the very start and put it into a variable. Then put that variable in the query instead of the subselect Darren On Mon, 8 Jul 2002, Peter Alberer wrote: > Hi there, > > i have the following query in a pl/pgsql procedure: > > update lr_object_usage > set status = (case status > when ''OPEN_SUCC'' then ''CLOSED_SUCC'' > when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) > where lr_object_usage_id in (select lr_object_usage_id from > lr_locked_objects where context = p_exam_usage_id); > > the problem is the result of the subselect is obviously calculated for > every row of lr_object_usage (30K rows) -> the update takes very long. > Here is the explain output with p_exam_usage_id being 38191. Two simple > selects with the result of the first one being used in the second one > vs. the subselect. Divided into two selects > > Can I somehow tell the planer not to requery the subselect for every row > ? > > Many TIA, > > peter > > -- > ------------------------------------------------------------------ > > openacs=# select lr_object_usage_id from lr_locked_objects where context > = 38191; > lr_object_usage_id > -------------------- > 38192 > 38193 > 38194 > 38195 > (4 rows) > > openacs=# explain analyze select lr_object_usage_id from > lr_locked_objects where context = 38191; > NOTICE: QUERY PLAN: > > Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4) (actual > time=0.06..0.09 rows=4 loops=1) > Total runtime: 0.18 msec > > EXPLAIN > > openacs=# select * from lr_object_usage where lr_object_usage_id in > (38192,38193,38194,38195); > lr_object_usage_id | lr_object_id | access_time | > user_id | status | context > --------------------+--------------+-------------------------------+---- > -----+--------+--------- > 38192 | 13496 | 2002-07-08 17:47:01.001332+02 | > 2434 | | 38191 > 38193 | 13503 | 2002-07-08 17:47:01.001332+02 | > 2434 | | 38191 > 38194 | 13434 | 2002-07-08 17:47:01.001332+02 | > 2434 | | 38191 > (3 rows) > > openacs=# explain analyze select * from lr_object_usage where > lr_object_usage_id in (38192,38193,38194,38195); > NOTICE: QUERY PLAN: > > Index Scan using lr_object_usage_lr_object_usage, > lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage, > lr_object_usage_lr_object_usage on lr_object_usage (cost=0.00..12.18 > rows=4 width=39) (actual time=0.06..0.20 rows=3 loops=1) > Total runtime: 0.38 msec > > EXPLAIN > > Doing the same thing with the subselect: > > openacs=# explain analyze select * from lr_object_usage where > lr_object_usage_id in (select lr_object_usage_id from lr_locked_objects > where context = 38191); > NOTICE: QUERY PLAN: > > Seq Scan on lr_object_usage (cost=0.00..17860.59 rows=16514 width=39) > (actual time=2640.91..2646.47 rows=3 loops=1) > SubPlan > -> Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4) > (actual time=0.03..0.06 rows=4 loops=33080) > Total runtime: 2646.60 msec > > EXPLAIN > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html > > -- Darren Ferguson
On Mon, Jul 08, 2002 at 06:10:05PM +0200, Peter Alberer wrote: > Hi there, > > i have the following query in a pl/pgsql procedure: > > update lr_object_usage > set status = (case status > when ''OPEN_SUCC'' then ''CLOSED_SUCC'' > when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) > where lr_object_usage_id in (select lr_object_usage_id from > lr_locked_objects where context = p_exam_usage_id); Read the FAQ. Use EXISTS rather than IN. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > There are 10 kinds of people in the world, those that can do binary > arithmetic and those that can't.
On Mon, 8 Jul 2002 18:10:05 +0200, "Peter Alberer"
<h9351252@obelix.wu-wien.ac.at> wrote:
>Hi there,
>
>i have the following query in a pl/pgsql procedure:
>
> update lr_object_usage
> set status = (case status
> when ''OPEN_SUCC'' then ''CLOSED_SUCC''
> when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
> where lr_object_usage_id in (select lr_object_usage_id from
>lr_locked_objects where context = p_exam_usage_id);
>
>the problem is the result of the subselect is obviously calculated for
>every row of lr_object_usage (30K rows) -> the update takes very long.
Peter, try
UPDATE lr_object_usage
SET status = CASE status
WHEN ''OPEN_SUCC'' THEN ''CLOSED_SUCC''
WHEN ''OPEN_FAIL'' THEN ''CLOSED_FAIL''
END
FROM lr_locked_objects o
WHERE lr_object_usage.lr_object_usage_id = o.lr_object_usage_id
AND <onetable>.context = <othertable>.p_exam_usage_id;
I didn't figure out where context and p_exam_usage_id come from.
Use at your own risk, I did not test it.
Servus
Manfred