Why does adding SUM and GROUP BY destroy performance?

Поиск
Список
Период
Сортировка
От David Link
Тема Why does adding SUM and GROUP BY destroy performance?
Дата
Msg-id 20030917175136.22623.qmail@web13504.mail.yahoo.com
обсуждение исходный текст
Ответы Re: Why does adding SUM and GROUP BY destroy performance?  (Ron Johnson <ron.l.johnson@cox.net>)
Список pgsql-general
Hi,

Why does adding SUM and GROUP BY destroy performance?
details follow.
Thanks, David Link

s1.sql:
SELECT
    t.tid, t.title,
    COALESCE(s0c100r100.units, 0) as w0c100r100units,
    (COALESCE(r1c2r100.units, 0) + COALESCE(y0c2r100.units, 0))
        as r0c2r100units
FROM
    title t
    JOIN upc u1 ON t.tid = u1.tid
    LEFT OUTER JOIN sale_200331 s0c100r100 ON u1.upc = s0c100r100.upc
        AND s0c100r100.week = 200331 AND s0c100r100.channel = 100
        AND s0c100r100.region = 100
    LEFT OUTER JOIN rtd r1c2r100 ON u1.upc = r1c2r100.upc
        AND r1c2r100.year = 2002 AND r1c2r100.channel = 2
        AND r1c2r100.region = 100
    LEFT OUTER JOIN ytd_200331 y0c2r100 ON u1.upc = y0c2r100.upc
        AND y0c2r100.week = 200331 AND y0c2r100.channel = 2
        AND y0c2r100.region = 100
    LEFT OUTER JOIN media m ON t.media = m.key
    LEFT OUTER JOIN screen_format sf ON t.screen_format = sf.key
WHERE
    t.distributor != 'CONTROL LABEL'
ORDER BY
    t.title ASC
LIMIT 50
;


s2.sql:
SELECT
    t.tid, t.title,
    SUM(COALESCE(s0c100r100.units, 0)) as w0c100r100units,
    SUM((COALESCE(r1c2r100.units, 0) + COALESCE(y0c2r100.units, 0)))
        as r0c2r100units
FROM
    title t
    JOIN upc u1 ON t.tid = u1.tid
    LEFT OUTER JOIN sale_200331 s0c100r100 ON u1.upc = s0c100r100.upc
        AND s0c100r100.week = 200331 AND s0c100r100.channel = 100
        AND s0c100r100.region = 100
    LEFT OUTER JOIN rtd r1c2r100 ON u1.upc = r1c2r100.upc
        AND r1c2r100.year = 2002 AND r1c2r100.channel = 2
        AND r1c2r100.region = 100
    LEFT OUTER JOIN ytd_200331 y0c2r100 ON u1.upc = y0c2r100.upc
        AND y0c2r100.week = 200331 AND y0c2r100.channel = 2
        AND y0c2r100.region = 100
    LEFT OUTER JOIN media m ON t.media = m.key
    LEFT OUTER JOIN screen_format sf ON t.screen_format = sf.key
WHERE
    t.distributor != 'CONTROL LABEL'
GROUP BY
    t.tid, t.title
ORDER BY
    t.title ASC
LIMIT 50
;


Times:
 s1.sql takes 0m0.124s
 s2.sql takes 1m1.450s

Stats:
 title table: 68,000 rows
 sale_200331 table: 150,000 rows
 ytd_200331 table: 0 rows
 rtd table: 650,000 rows

Indexes are in place.

s1 explain plan:
                           QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..65105.51 rows=50 width=132)
   ->  Nested Loop  (cost=0.00..91726868.54 rows=70445 width=132)
         Join Filter: ("outer".screen_format = "inner"."key")
         ->  Nested Loop  (cost=0.00..91651668.74 rows=70445 width=127)
               Join Filter: ("outer".media = "inner"."key")
               ->  Nested Loop  (cost=0.00..91578053.95 rows=70445
width=122)
                     ->  Nested Loop  (cost=0.00..91236359.89
rows=70445 width=98)
                           ->  Nested Loop  (cost=0.00..90894665.82
rows=70445 width=74)
                                 ->  Nested Loop
(cost=0.00..90539626.76 rows=70445 width=50)
                                       ->  Index Scan using
title_title_ind on title t  (cost=0.00..193051.67 rows=68775 width=38)
                                             Filter: (distributor <>
'CONTROL LABEL'::character varying)
                                       ->  Index Scan using
davids_tid_index on upc u1  (cost=0.00..1309.24 rows=353 width=12)
                                             Index Cond: ("outer".tid =
u1.tid)
                                 ->  Index Scan using
sale_200331_upc_wk_chl_reg_ind on sale_200331 s0c100r100
(cost=0.00..5.02 rows=1 width=24)
                                       Index Cond: (("outer".upc =
s0c100r100.upc) AND (s0c100r100.week = 200331) AND (s0c100r100.channel
= 100) AND (s0c100r100.region = 100))
                           ->  Index Scan using
rtd_upc_year_chl_reg_ind on rtd r1c2r100  (cost=0.00..4.83 rows=1
width=24)
                                 Index Cond: (("outer".upc =
r1c2r100.upc) AND (r1c2r100."year" = 2002) AND (r1c2r100.channel = 2)
AND (r1c2r100.region = 100))
                     ->  Index Scan using ytd_200331_upc_wkchlreg_ind
on ytd_200331 y0c2r100  (cost=0.00..4.83 rows=1 width=24)
                           Index Cond: (("outer".upc = y0c2r100.upc)
AND (y0c2r100.week = 200331) AND (y0c2r100.channel = 2) AND
(y0c2r100.region = 100))
               ->  Seq Scan on media m  (cost=0.00..1.02 rows=2
width=5)
         ->  Seq Scan on screen_format sf  (cost=0.00..1.03 rows=3
width=5)
(21 rows)


s2 explain plan:

                          QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=403996.99..403997.11 rows=50 width=132)
   ->  Sort  (cost=403996.99..404014.60 rows=7044 width=132)
         Sort Key: t.title
         ->  Aggregate  (cost=402393.74..403274.30 rows=7044 width=132)
               ->  Group  (cost=402393.74..402922.08 rows=70445
width=132)
                     ->  Sort  (cost=402393.74..402569.86 rows=70445
width=132)
                           Sort Key: t.tid, t.title
                           ->  Hash Join  (cost=375382.76..392011.46
rows=70445 width=132)
                                 Hash Cond: ("outer".screen_format =
"inner"."key")
                                 ->  Hash Join
(cost=375381.72..390997.78 rows=70445 width=127)
                                       Hash Cond: ("outer".media =
"inner"."key")
                                       ->  Merge Join
(cost=375380.70..390057.49 rows=70445 width=122)
                                             Merge Cond: ("outer".upc =
"inner".upc)
                                             Join Filter:
(("inner".week = 200331) AND ("inner".channel = 2) AND ("inner".region
= 100))
                                             ->  Merge Join
(cost=375380.70..382782.40 rows=70445 width=98)
                                                   Merge Cond:
("outer".upc = "inner".upc)
                                                   Join Filter:
(("inner"."year" = 2002) AND ("inner".channel = 2) AND ("inner".region
= 100))
                                                   ->  Sort
(cost=375310.87..375486.98 rows=70445 width=74)
                                                         Sort Key:
u1.upc
                                                         ->  Nested
Loop  (cost=6348.20..367282.53 rows=70445 width=74)
                                                               ->  Hash
Join  (cost=6348.20..12243.46 rows=70445 width=50)

Hash Cond: ("outer".tid = "inner".tid)
                                                                     ->
 Seq Scan on upc u1  (cost=0.00..2795.28 rows=70628 width=12)
                                                                     ->
 Hash  (cost=4114.93..4114.93 rows=68775 width=38)

   ->  Seq Scan on title t  (cost=0.00..4114.93 rows=68775 width=38)

         Filter: (distributor <> 'CONTROL LABEL'::character varying)
                                                               ->
Index Scan using sale_200331_upc_wk_chl_reg_ind on sale_200331
s0c100r100  (cost=0.00..5.02 rows=1 width=24)

Index Cond: (("outer".upc = s0c100r100.upc) AND (s0c100r100.week =
200331) AND (s0c100r100.channel = 100) AND (s0c100r100.region = 100))
                                                   ->  Sort
(cost=69.83..72.33 rows=1000 width=24)
                                                         Sort Key:
r1c2r100.upc
                                                         ->  Seq Scan
on rtd r1c2r100  (cost=0.00..20.00 rows=1000 width=24)
                                             ->  Index Scan using
ytd_200331_upc_wkchlreg_ind on ytd_200331 y0c2r100  (cost=0.00..52.00
rows=1000 width=24)
                                       ->  Hash  (cost=1.02..1.02
rows=2 width=5)
                                             ->  Seq Scan on media m
(cost=0.00..1.02 rows=2 width=5)
                                 ->  Hash  (cost=1.03..1.03 rows=3
width=5)
                                       ->  Seq Scan on screen_format sf
 (cost=0.00..1.03 rows=3 width=5)
(36 rows)




__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Database Recovery Procedures
Следующее
От: Dmitry Tkach
Дата:
Сообщение: Weird query plan