Re: Planner matching constants across tables in a join

Поиск
Список
Период
Сортировка
От Richard Huxton
Тема Re: Planner matching constants across tables in a join
Дата
Msg-id 200303051424.12351.dev@archonet.com
обсуждение исходный текст
Ответ на Re: Planner matching constants across tables in a join  (Greg Stark <gsstark@mit.edu>)
Ответы Re: Planner matching constants across tables in a join  (Greg Stark <gsstark@mit.edu>)
Список pgsql-performance
On Wednesday 05 Mar 2003 12:42 pm, Greg Stark wrote:
> Really? They produce virtually the same plan for me.
>
> Why do you think it'll run slower?
> What query are you actually finding slow?

The actual query uses three tables, but isn't very complicated. Apologies for
the wrapping on the explain.

EXPLAIN ANALYSE SELECT a.line_id, a.start_time, a.call_dur, i.cam_id,
i.prod_id, i.chg_per_min, i.rev_per_min
FROM campaign_items i, campaign c, activity a
WHERE
i.cam_id=c.id AND a.line_id=i.line_id
AND a.start_time BETWEEN c.cam_from AND c.cam_to
AND a.line_id='0912345 0004' AND i.line_id='0912345 0004';


QUERY PLAN
----------
 Merge Join  (cost=348.01..348.72 rows=1 width=72) (actual time=115.43..116.27
rows=21 loops=1)
   Merge Cond: ("outer".id = "inner".cam_id)
   Join Filter: (("outer".line_id)::text = ("inner".line_id)::text)
   ->  Sort  (cost=245.45..245.75 rows=118 width=40) (actual time=83.98..84.10
rows=94 loops=1)
         Sort Key: c.id
         ->  Nested Loop  (cost=0.00..241.40 rows=118 width=40) (actual
time=3.83..83.27 rows=94 loops=1)
               Join Filter: (("outer".start_time >=
("inner".cam_from)::timestamp without time zone) AND ("outer".start_time <=
("inner".cam_to)::timestamp without time zone))
               ->  Seq Scan on activity a  (cost=0.00..199.00 rows=11
width=28) (actual time=3.06..54.14 rows=19 loops=1)
                     Filter: ((line_id)::text = '0912345 0004'::text)
               ->  Seq Scan on campaign c  (cost=0.00..2.00 rows=100 width=12)
(actual time=0.02..0.84 rows=100 loops=19)
   ->  Sort  (cost=102.56..102.57 rows=5 width=32) (actual time=31.36..31.39
rows=20 loops=1)
         Sort Key: i.cam_id
         ->  Seq Scan on campaign_items i  (cost=0.00..102.50 rows=5 width=32)
(actual time=17.16..31.11 rows=6 loops=1)
               Filter: ((line_id)::text = '0912345 0004'::text)
 Total runtime: 117.08 msec
(15 rows)


and this is the plan where I just check the one line_id:


EXPLAIN ANALYSE SELECT a.line_id, a.start_time, a.call_dur, i.cam_id,
i.prod_id, i.chg_per_min, i.rev_per_min
FROM campaign_items i, campaign c, activity a
WHERE
i.cam_id=c.id AND a.line_id=i.line_id
AND a.start_time BETWEEN c.cam_from AND c.cam_to
AND i.line_id='0912345 0004';

QUERY PLAN
---------------------------------------
 Hash Join  (cost=2.25..1623.70 rows=6 width=72) (actual time=48.27..974.30
rows=21 loops=1)
   Hash Cond: ("outer".cam_id = "inner".id)
   Join Filter: (("outer".start_time >= ("inner".cam_from)::timestamp without
time zone) AND ("outer".start_time <= ("inner".cam_to)::timestamp without
time zone))
   ->  Nested Loop  (cost=0.00..1619.87 rows=53 width=60) (actual
time=24.49..969.33 rows=114 loops=1)
         Join Filter: (("inner".line_id)::text = ("outer".line_id)::text)
         ->  Seq Scan on campaign_items i  (cost=0.00..102.50 rows=5 width=32)
(actual time=15.72..28.52 rows=6 loops=1)
               Filter: ((line_id)::text = '0912345 0004'::text)
         ->  Seq Scan on activity a  (cost=0.00..174.00 rows=10000 width=28)
(actual time=0.03..101.95 rows=10000 loops=6)
   ->  Hash  (cost=2.00..2.00 rows=100 width=12) (actual time=1.54..1.54
rows=0 loops=1)
         ->  Seq Scan on campaign c  (cost=0.00..2.00 rows=100 width=12)
(actual time=0.06..0.94 rows=100 loops=1)
 Total runtime: 975.13 msec
(11 rows)

Table campaign has 100 rows, campaign_items 5000, activity 10000. My guess is
that the planner starts with "campaign" because of the low number of rows,
but it still looks like filtering on "activity" would help things. Indeed,
testing a.line_id instead of i.line_id does make a difference.


QUERY PLAN
-------------------
 Hash Join  (cost=241.70..457.54 rows=6 width=72) (actual time=161.20..225.68
rows=21 loops=1)
   Hash Cond: ("outer".cam_id = "inner".id)
   Join Filter: (("inner".line_id)::text = ("outer".line_id)::text)
   ->  Seq Scan on campaign_items i  (cost=0.00..90.00 rows=5000 width=32)
(actual time=0.03..72.00 rows=5000 loops=1)
   ->  Hash  (cost=241.40..241.40 rows=118 width=40) (actual time=85.46..85.46
rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..241.40 rows=118 width=40) (actual
time=3.80..84.66 rows=94 loops=1)
               Join Filter: (("outer".start_time >=
("inner".cam_from)::timestamp without time zone) AND ("outer".start_time <=
("inner".cam_to)::timestamp without time zone))
               ->  Seq Scan on activity a  (cost=0.00..199.00 rows=11
width=28) (actual time=3.03..54.48 rows=19 loops=1)
                     Filter: ((line_id)::text = '0912345 0004'::text)
               ->  Seq Scan on campaign c  (cost=0.00..2.00 rows=100 width=12)
(actual time=0.03..0.89 rows=100 loops=19)
 Total runtime: 226.51 msec
(11 rows)

--
  Richard Huxton

В списке pgsql-performance по дате отправления:

Предыдущее
От: Neil Conway
Дата:
Сообщение: Re: OIDs as keys
Следующее
От: Rod Taylor
Дата:
Сообщение: Re: OIDs as keys