Обсуждение: triple self-join crawling
The following self join of a table containing 5800 records is crawling: CREATE TABLE history ( stock VARCHAR(30) NOT NULL, day date NOT NULL, open NUMERIC (6,1) NOT NULL, high NUMERIC (6,1) NOT NULL, low NUMERIC (6,1) NOT NULL, close NUMERIC (6,1) NOT NULL, volume NUMERIC (12) NOT NULL, PRIMARY KEY (stock,day) ); SELECT history.stock, history.day, history.high, history.low, MAX(past_week.high) AS week_high, MAX(past_month.high)AS month_high FROM history INNER JOIN history AS past_month ON (past_month.stock = history.stock AND past_month.day < history.day AND past_month.day >= (history.day - 30)) INNER JOIN history AS past_week ON (past_week.stock = past_month.stock AND past_week.day < history.day AND past_week.day >= (history.day - 7)) GROUP BY history.stock, history.day, history.high, history.low ORDER BY history.stock, history.day DESC How can I speed this up? -- Regards, Tarlika Elisabeth Schmitz
Define "crawling". Also, please post EXPLAIN and, if feasible, EXPLAIN ANALYSE output for your case. A On Sun, Mar 18, 2007 at 07:51:28PM +0000, T E Schmitz wrote: > The following self join of a table containing 5800 records is crawling: > > CREATE TABLE history > ( > stock VARCHAR(30) NOT NULL, > day date NOT NULL, > open NUMERIC (6,1) NOT NULL, > high NUMERIC (6,1) NOT NULL, > low NUMERIC (6,1) NOT NULL, > close NUMERIC (6,1) NOT NULL, > volume NUMERIC (12) NOT NULL, > PRIMARY KEY (stock,day) > ); > > > SELECT > history.stock, history.day, history.high, history.low, > MAX(past_week.high) AS week_high, > MAX(past_month.high) AS month_high > FROM history > INNER JOIN history AS past_month ON (past_month.stock = history.stock > AND past_month.day < history.day AND past_month.day >= (history.day - 30)) > INNER JOIN history AS past_week ON (past_week.stock = > past_month.stock AND past_week.day < history.day AND past_week.day >= > (history.day - 7)) > GROUP BY history.stock, history.day, history.high, history.low > ORDER BY history.stock, history.day DESC > > > How can I speed this up? > > > -- > > > Regards, > > Tarlika Elisabeth Schmitz > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq -- Andrew Sullivan | ajs@crankycanuck.ca Unfortunately reformatting the Internet is a little more painful than reformatting your hard drive when it gets out of whack. --Scott Morris
T E Schmitz wrote: > The following self join of a table containing 5800 records is crawling: > > SELECT > history.stock, history.day, history.high, history.low, > MAX(past_week.high) AS week_high, > MAX(past_month.high) AS month_high > FROM history > INNER JOIN history AS past_month ON (past_month.stock = history.stock > AND past_month.day < history.day AND past_month.day >= (history.day - 30)) > INNER JOIN history AS past_week ON (past_week.stock = > past_month.stock AND past_week.day < history.day AND past_week.day >= > (history.day - 7)) > GROUP BY history.stock, history.day, history.high, history.low > ORDER BY history.stock, history.day DESC Things improved hugely when I changed the JOIN clauses: LEFT OUTER JOIN history AS past_month ON (past_month.stock = history.stock AND past_month.day >= (history.day - 30) AND past_month.day < history.day) LEFT OUTER JOIN history AS past_week ON (past_week.stock = past_month.stock AND past_week.day =past_month.day AND past_week.day >= (history.day - 7)) -- Regards, Tarlika Elisabeth Schmitz
T E Schmitz wrote: > T E Schmitz wrote: > > Things improved hugely when I changed the JOIN clauses: see explain analyze below - can this be improved further? > > LEFT OUTER JOIN history AS past_month ON (past_month.stock = > history.stock AND past_month.day >= (history.day - 30) AND > past_month.day < history.day) > LEFT OUTER JOIN history AS past_week ON (past_week.stock = > past_month.stock AND past_week.day =past_month.day AND past_week.day >= > (history.day - 7)) QUERY PLAN GroupAggregate (cost=1370368.19..1435888.88 rows=5801 width=56) (actual time=11945.030..13163.156 rows=5801 loops=1) -> Sort (cost=1370368.19..1379715.86 rows=3739067 width=56) (actual time=11944.753..12462.623 rows=120117 loops=1) Sort Key: history.stock, history."day", history.high, history.low -> Hash Left Join (cost=160.02..391554.63 rows=3739067 width=56) (actual time=52.746..3778.409 rows=120117 loops=1) Hash Cond: ((("outer".stock)::text = ("inner".stock)::text) AND ("outer"."day" = "inner"."day")) Join Filter: ("inner"."day" >= ("outer"."day" -7)) -> Nested Loop Left Join (cost=0.00..204441.26 rows=3739067 width=57) (actual time=0.077..2313.375 rows=120117 loops=1) Join Filter: (("inner".stock)::text= ("outer".stock)::text) -> Seq Scan on history (cost=0.00..131.01 rows=5801 width=34) (actual time=0.016..34.845 rows=5801 loops=1) -> Index Scan using idx_history_dayon history past_month (cost=0.00..22.32 rows=645 width=23) (actual time=0.020..0.185 rows=21 loops=5801) Index Cond: ((past_month."day" >= ("outer"."day" - 30)) AND (past_month."day" < "outer"."day")) -> Hash (cost=131.01..131.01 rows=5801 width=23) (actual time=52.608..52.608 rows=5801 loops=1) -> Seq Scan on history past_week (cost=0.00..131.01 rows=5801 width=23) (actual time=0.010..25.110 rows=5801 loops=1) Total runtime: 13187.729 ms -- Regards, Tarlika Elisabeth Schmitz
Andrew Sullivan wrote: > Define "crawling". Also, please post EXPLAIN and, if feasible, Total runtime: 191430.537 ms > EXPLAIN ANALYSE output for your case. > > On Sun, Mar 18, 2007 at 07:51:28PM +0000, T E Schmitz wrote: > >>The following self join of a table containing 5800 records is crawling: >> >> >>SELECT >> history.stock, history.day, history.high, history.low, >> MAX(past_week.high) AS week_high, >> MAX(past_month.high) AS month_high >>FROM history >>INNER JOIN history AS past_month ON (past_month.stock = history.stock >>AND past_month.day < history.day AND past_month.day >= (history.day - 30)) >>INNER JOIN history AS past_week ON (past_week.stock = >>past_month.stock AND past_week.day < history.day AND past_week.day >= >>(history.day - 7)) >>GROUP BY history.stock, history.day, history.high, history.low >>ORDER BY history.stock, history.day DESC QUERY PLAN GroupAggregate (cost=1372725715.71..1414901439.85 rows=5801 width=56) (actual time=185248.481..191393.056 rows=5800 loops=1) -> Sort (cost=1372725715.71..1378750806.73 rows=2410036407 width=56) (actual time=185247.212..187763.008 rows=584203 loops=1) Sort Key: history.stock, history."day", history.high,history.low -> Nested Loop (cost=0.00..132245248.41 rows=2410036407 width=56) (actual time=0.124..148136.464 rows=584203 loops=1) Join Filter: (("inner".stock)::text = ("outer".stock)::text) -> Nested Loop (cost=0.00..555931.84 rows=3739067 width=53) (actual time=0.087..137531.941 rows=28147 loops=1) Join Filter: (("outer"."day" >= ("inner"."day"- 7)) AND (("outer".stock)::text = ("inner".stock)::text)) -> Index Scan using history_pkey on history past_week (cost=0.00..266.62 rows=5801 width=23) (actual time=0.034..43.147 rows=5801 loops=1) -> Index Scan using idx_history_day on history (cost=0.00..57.11 rows=1934 width=34) (actual time=0.021..12.709 rows=2900 loops=5801) Index Cond: ("outer"."day" < history."day") -> Index Scan usingidx_history_day on history past_month (cost=0.00..22.32 rows=645 width=23) (actual time=0.019..0.170 rows=21 loops=28147) Index Cond: ((past_month."day" < "outer"."day") AND (past_month."day" >= ("outer"."day" - 30))) Total runtime: 191430.537 ms -- Regards, Tarlika Elisabeth Schmitz
T E Schmitz escribió: > > QUERY PLAN > GroupAggregate (cost=1370368.19..1435888.88 rows=5801 width=56) (actual > time=11945.030..13163.156 rows=5801 loops=1) > -> Sort (cost=1370368.19..1379715.86 rows=3739067 width=56) (actual > time=11944.753..12462.623 rows=120117 loops=1) > Sort Key: history.stock, history."day", history.high, history.low > -> Hash Left Join (cost=160.02..391554.63 rows=3739067 > width=56) (actual time=52.746..3778.409 rows=120117 loops=1) > Hash Cond: ((("outer".stock)::text = > ("inner".stock)::text) AND ("outer"."day" = "inner"."day")) > Join Filter: ("inner"."day" >= ("outer"."day" - 7)) > -> Nested Loop Left Join (cost=0.00..204441.26 > rows=3739067 width=57) (actual time=0.077..2313.375 rows=120117 loops=1) It's estimating >3M, but it finds 120K rows. > Join Filter: (("inner".stock)::text = > ("outer".stock)::text) > -> Seq Scan on history (cost=0.00..131.01 > rows=5801 width=34) (actual time=0.016..34.845 rows=5801 loops=1) > -> Index Scan using idx_history_day on history > past_month (cost=0.00..22.32 rows=645 width=23) (actual > time=0.020..0.185 rows=21 loops=5801) > Index Cond: ((past_month."day" >= > ("outer"."day" - 30)) AND (past_month."day" < "outer"."day")) > -> Hash (cost=131.01..131.01 rows=5801 width=23) (actual > time=52.608..52.608 rows=5801 loops=1) > -> Seq Scan on history past_week (cost=0.00..131.01 > rows=5801 width=23) (actual time=0.010..25.110 rows=5801 loops=1) > > Total runtime: 13187.729 ms Try running a vacuum analyze on the database (or at least the tables which differ in rows estimated and actual (history for example)) -- select 'mmarques' || '@' || 'unl.edu.ar' AS email; --------------------------------------------------------- Martín Marqués | Programador, DBA Centro de Telemática | Administrador Universidad Nacional del Litoral ---------------------------------------------------------