Обсуждение: Very slow inner join query Unacceptable latency.
<span style="font-family:Verdana; color:#000000; font-size:10pt;"><div>The SARS_ACTS table currently has 37,115,515 rows<br/><br />we have indexed: idx_sars_acts_acts_run_id ON SARS_ACTS USING btree (sars_run_id)<br />we have pk constrainton the SARS_ACTS_RUN table; sars_acts_run_pkey PRIMARY KEY (id )<br /><br />serverdb=# explain select count(*)as y0_ from SARS_ACTS this_ inner join SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=<a href="http://tr1_.ID">tr1_.ID</a>where tr1_.ALGORITHM='SMAT';<br /> QUERYPLAN<br />--------------------------------------------------------------------------------------------------------------------------<br />Aggregate (cost=4213952.17..4213952.18 rows=1 width=0)<br /> -> Hash Join (cost=230573.06..4213943.93 rows=3296 width=0)<br/> Hash Cond: (this_.SARS_RUN_ID=<a href="http://tr1_.ID">tr1_.ID</a>)<br /> -> Seq Scan om sars_actsthis_ (cost=0.00..3844241.84 rows=37092284 width=8)<br /> -> Hash (cost=230565.81..230565.81 rows=580width=8)<br /> -> Seq Scan on sars_acts_run tr1_ (cost=0.00..230565.81 rows=580 width=8)<br /> Filter: ((algorithm)::text = 'SMAT'::text)<br />(7 rows)<br /><br />This query executes in approximately5.3 minutes to complete, very very slow, our users are not happy.<br /><br />I did add an index on SARS_ACTS_RUN.ALGORITHMcolumn but it didn't improve the run time. <br />The planner just changed the "Filter:" to an "IndexScan:" improving the cost of the Seq Scan <br />on the sars_acts_run table, but the overall run time remained the same.It seems like the bottleneck <br />is in the Seq Scan on the sars_acts table.<br /><br /> -> Seq Scanon sars_acts_run tr1_ (cost=0.00..230565.81 rows=580 width=8)<br /> Filter: ((algorithm)::text ='SMAT'::text)<br /><br />Does anyone have suggestions about how to speed it up?<br /><a href="mailto:pgsql-performance@postgresql.org"target="_blank"></a></div></span>
On Tue, May 21, 2013 at 4:53 PM, <fburgess@radiantblue.com> wrote: > The SARS_ACTS table currently has 37,115,515 rows > > we have indexed: idx_sars_acts_acts_run_id ON SARS_ACTS USING btree > (sars_run_id) > we have pk constraint on the SARS_ACTS_RUN table; sars_acts_run_pkey PRIMARY > KEY (id ) > > serverdb=# explain select count(*) as y0_ from SARS_ACTS this_ inner join > SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where tr1_.ALGORITHM='SMAT'; can you please show us an EXPLAIN ANALYZE of this query (not only EXPLAIN). please paste it in a file and attach it so it doesn't get reformatted by the mail client. what version of postgres is this? -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
On Wednesday, May 22, 2013 3:24 AM fburgess wrote: > The SARS_ACTS table currently has 37,115,515 rows > we have indexed: idx_sars_acts_acts_run_id ON SARS_ACTS USING btree (sars_run_id) > we have pk constraint on the SARS_ACTS_RUN table; sars_acts_run_pkey PRIMARY KEY (id ) > serverdb=# explain select count(*) as y0_ from SARS_ACTS this_ inner join SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.IDwhere tr1_.ALGORITHM='SMAT'; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=4213952.17..4213952.18 rows=1 width=0) > -> Hash Join (cost=230573.06..4213943.93 rows=3296 width=0) > Hash Cond: (this_.SARS_RUN_ID=tr1_.ID) > -> Seq Scan om sars_acts this_ (cost=0.00..3844241.84 rows=37092284 width=8) > -> Hash (cost=230565.81..230565.81 rows=580 width=8) > -> Seq Scan on sars_acts_run tr1_ (cost=0.00..230565.81 rows=580 width=8) > Filter: ((algorithm)::text = 'SMAT'::text) > (7 rows) > This query executes in approximately 5.3 minutes to complete, very very slow, our users are not happy. > I did add an index on SARS_ACTS_RUN.ALGORITHM column but it didn't improve the run time. > The planner just changed the "Filter:" to an "Index Scan:" improving the cost of the Seq Scan > on the sars_acts_run table, but the overall run time remained the same. It seems like the bottleneck > is in the Seq Scan on the sars_acts table. > -> Seq Scan on sars_acts_run tr1_ (cost=0.00..230565.81 rows=580 width=8) > Filter: ((algorithm)::text = 'SMAT'::text) > Does anyone have suggestions about how to speed it up? Could you please once trying Analyzing both tables and then run the query to check which plan it uses: Analyze SARS_ACTS; Analyze SARS_ACTS_RUN; With Regards, Amit Kapila.
<span style="font-family:Verdana; color:#000000; font-size:10pt;"><div>PostgreSQL 9.1.6 on linux<br /><br /></div><blockquoteid="replyBlockquote" style="border-left: 2px solid blue; margin-left: 8px; padding-left: 8px; font-size:10pt;color:black; font-family:verdana;" webmail="1"><div id="wmQuoteWrapper"> -------- Original Message --------<br/> Subject: Re: [PERFORM] Very slow inner join query Unacceptable latency.<br /> From: Jaime Casanova <<a href="mailto:jaime@2ndquadrant.com">jaime@2ndquadrant.com</a>><br/> Date: Tue, May 21, 2013 2:59 pm<br /> To: FreddieBurgess <<a href="mailto:fburgess@radiantblue.com">fburgess@radiantblue.com</a>><br /> Cc: psql performancelist <<a href="mailto:pgsql-performance@postgresql.org">pgsql-performance@postgresql.org</a>>, Postgres<br/> General <<a href="mailto:pgsql-general@postgresql.org">pgsql-general@postgresql.org</a>><br /><br />On Tue, May 21, 2013 at 4:53 PM, <<a href="mailto:fburgess@radiantblue.com">fburgess@radiantblue.com</a>> wrote:<br/> > The SARS_ACTS table currently has 37,115,515 rows<br /> ><br /> > we have indexed: idx_sars_acts_acts_run_idON SARS_ACTS USING btree<br /> > (sars_run_id)<br /> > we have pk constraint on the SARS_ACTS_RUNtable; sars_acts_run_pkey PRIMARY<br /> > KEY (id )<br /> ><br /> > serverdb=# explain select count(*)as y0_ from SARS_ACTS this_ inner join<br /> > SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=<a href="http://tr1_.ID">tr1_.ID</a>where tr1_.ALGORITHM='SMAT';<br /><br /> can you please show us an EXPLAIN ANALYZE of thisquery (not only<br /> EXPLAIN). please paste it in a file and attach it so it doesn't get<br /> reformatted by the mailclient.<br /><br /> what version of postgres is this?<br /><br /> --<br /> Jaime Casanova <a href="http://www.2ndQuadrant.com">www.2ndQuadrant.com</a><br/> Professional PostgreSQL: Soporte 24x7 y capacitación<br />Phone: +593 4 5107566 Cell: +593 987171157<br /></div></blockquote></span>
On Wed, May 22, 2013 at 7:41 AM, <fburgess@radiantblue.com> wrote:
PostgreSQL 9.1.6 on linux
From the numbers in your attached plan, it seems like it should be doing a nested loop from the 580 rows (it thinks) that match in SARS_ACTS_RUN against the index on sars_run_id to pull out the 3297 rows (again, it think, though it is way of there). I can't see why it would not do that. There were some planner issues in the early 9.2 releases that caused very large indexes to be punished, but I don't think those were in 9.1
Could you "set enable_hashjoin to off" and post the "explain analyze" that that gives?
Cheers,
Jeff
On Wednesday, May 22, 2013 10:03 PM fburgess wrote: > I did perform a explain analyze on the query. Explain analyze doesn't help to collect statistics. You should use Analyze <table_name>. Ideally optimizer should have slected the best plan, but just to check you can once try with SET enable_hashjoin=off; And see what is the plan it chooses and does it pick up index scan on larger table? Could you please output of \d SARS_ACTS and \d SARS_ACTS_RUN? With Regards, Amit Kapila.
<span style="font-family:Verdana; color:#000000; font-size:10pt;"><div>serverdb=# set enable_hashjoin=off;<br />SET<br />serverdb=#explain select count(*) as y0_ from SARS_ACTS this_ inner join <span style="color:rgb(0,0,0);white-space:pre-wrap">SARS_ACTS_RUN</span>tr1_ on this_.SARS_RUN_ID=<a href="http://tr1_.ID">tr1_.ID</a>where tr1.ALGORITHM='SMAT';<br /><br /> QUERY PLAN<br />--------------------------------------------------------------------------------------------------------------------------<br />Aggregate (cost=7765563.69..7765563.70 rows=1 width=0) <br /> Nested Loop (cost=0.00..7765555.35 rows=3336 width=0)<br/> -> Index Scan using idx_sars_acts_run_algorithm on sars_acts_run tr1_ (cost=0.00..44.32 rows=650width=8) <br /> Index Cond: ((algorithm)::text = 'SMAT'::text)<br /> -> Index Scan usingidx_sars_acts_run_id_end_time on sars_acts this_ (cost=0.00..11891.29 rows=4452 width=8) <br /> IndexCond: (SARS_RUN_ID=<a href="http://tr1_.ID">tr1_.ID</a>)<br />(6 rows)<br /><br />serverdb=# \timing<br />TIming ison.<br /><br />serverdb=# select count(*) as y0_ from SARS_ACTS this_ inner join <span style="color:rgb(0,0,0);white-space:pre-wrap">SARS_ACTS_RUN</span>tr1_ on this_.SARS_RUN_ID=<a href="http://tr1_.ID">tr1_.ID</a>where tr1.ALGORITHM='SMAT';<br /> y0_<br />------<br />1481710<br />(1 row)<br /><br />Time:85069.416 ms < 1.4 minutes <-- not great, but much better!<br /><br />Subsequently, runs in the millisecondsonce cached.</div><div><br /></div><div>But what negative impact is disabling hash joins?</div><div><br /></div><div>Sorry,I just executed the explain without the analyze, I'll send out the "explain analyze" next reply.</div><div><br/></div><div>thanks</div><div><br /></div><div>Freddie<br /></div><div><br /></div><blockquote id="replyBlockquote"style="border-left: 2px solid blue; margin-left: 8px; padding-left: 8px; font-size:10pt; color:black;font-family:verdana;" webmail="1"><div id="wmQuoteWrapper"> -------- Original Message --------<br /> Subject:Re: [PERFORM] Very slow inner join query Unacceptable latency.<br /> From: Jeff Janes <<a href="mailto:jeff.janes@gmail.com">jeff.janes@gmail.com</a>><br/> Date: Wed, May 22, 2013 5:17 pm<br /> To: <a href="mailto:fburgess@radiantblue.com">fburgess@radiantblue.com</a><br/> Cc: Jaime Casanova <<a href="mailto:jaime@2ndquadrant.com">jaime@2ndquadrant.com</a>>,psql performance list<br /> <<a href="mailto:pgsql-performance@postgresql.org">pgsql-performance@postgresql.org</a>>,Postgres General<br /> <<a href="mailto:pgsql-general@postgresql.org">pgsql-general@postgresql.org</a>><br/><br /><div dir="ltr">On Wed, May 22,2013 at 7:41 AM, <span dir="ltr"><<a href="mailto:fburgess@radiantblue.com" target="_blank">fburgess@radiantblue.com</a>></span>wrote:<br /><div class="gmail_extra"><div class="gmail_quote"><blockquoteclass="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><span style="font-size:10pt;font-family:Verdana"><div>PostgreSQL 9.1.6 on linux<br /></div></span></div></blockquote><div><br /></div><div><br/></div><div style="">From the numbers in your attached plan, it seems like it should be doing a nested loopfrom the 580 rows (it thinks) that match in <span style="color:rgb(0,0,0);white-space:pre-wrap">SARS_ACTS_RUN againstthe index on sars_run_id to pull out the </span><span style="color:rgb(0,0,0);white-space:pre-wrap">3297 rows (again,it think, though it is way of there)</span><span style="color:rgb(0,0,0);white-space:pre-wrap">. I can't see why itwould not do that. There were some planner issues in the early 9.2 releases that caused very large indexes to be punished,but I don't think those were in 9.1</span></div><div style=""><span style="color:rgb(0,0,0);white-space:pre-wrap"><br/></span></div><div style=""><font color="#000000"><span style="white-space:pre-wrap">Couldyou "set enable_hashjoin to off" and post the "explain analyze" that that gives?</span></font><br/></div><div style=""><font color="#000000"><span style="white-space:pre-wrap"><br /></span></font></div><divstyle=""><font color="#000000"><span style="white-space:pre-wrap"><br /></span></font></div><divstyle=""><font color="#000000"><span style="white-space:pre-wrap">Cheers,</span></font></div><divstyle=""><font color="#000000"><span style="white-space:pre-wrap"><br/></span></font></div><div style=""><font color="#000000"><span style="white-space:pre-wrap">Jeff</span></font></div><div> </div></div></div></div></div></blockquote></span>
On Thu, May 23, 2013 at 12:21 PM, <fburgess@radiantblue.com> wrote: > > But what negative impact is disabling hash joins? > doing it just for a single query, could be a tool for solving particular problems. setting it in postgresql.conf, therefore affecting all queries, is like using a hammer to change tv channel... it will cause more problems than the one it solves. what you can do is: 1) execute: SET enable_hashjoin TO OFF; SELECT here RESET enable_hashjoin TO ON; 2) in a function: CREATE FUNCTION do_something() RETURNS bigint AS $$ SELECT here $$ LANGUAGE sql SET enable_hashjoin TO OFF STABLE; -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
On Thursday, May 23, 2013 10:51 PM fburgess wrote: > serverdb=# set enable_hashjoin=off; > SET > serverdb=# explain select count(*) as y0_ from SARS_ACTS this_ inner join SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.IDwhere tr1.ALGORITHM='SMAT'; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=7765563.69..7765563.70 rows=1 width=0) > Nested Loop (cost=0.00..7765555.35 rows=3336 width=0) > -> Index Scan using idx_sars_acts_run_algorithm on sars_acts_run tr1_ (cost=0.00..44.32 rows=650 width=8) > Index Cond: ((algorithm)::text = 'SMAT'::text) > -> Index Scan using idx_sars_acts_run_id_end_time on sars_acts this_ (cost=0.00..11891.29 rows=4452 width=8) > Index Cond: (SARS_RUN_ID=tr1_.ID) >(6 rows) >serverdb=# \timing >TIming is on. >serverdb=# select count(*) as y0_ from SARS_ACTS this_ inner join SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID wheretr1.ALGORITHM='SMAT'; > y0_ >------ >1481710 >(1 row) > Time: 85069.416 ms < 1.4 minutes <-- not great, but much better! > Subsequently, runs in the milliseconds once cached. If I see the plan from your other mail as below where Hash join is selected, the cost of Nested Loop is much more, that isthe reason why optimizer would have selected Hash Join. serverdb=# explain analyze select count(*) as y0_ from SARS_ACTS this_ inner join SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.IDwhere tr1_.ALGORITHM='SMAT'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=3983424.05..3983424.06 rows=1 width=0) (actual time=1358298.003..1358298.004 rows=1 loops=1) -> Hash Join (cost=44.93..3983415.81 rows=3297 width=0) (actual time=2593.768..1358041.205 rows 1481710 loops=1) It is quite surprising that after optimizer decided the cost of some plan (Hash Join) to be lower but actual execution costof same is more. There might be some problem with cost calculation model of Hash Join for some cases. By the way which version of PostgreSQL you are using? > But what negative impact is disabling hash joins? I think using it as a temporary fix might be okay, but keeping such code in your application might be risky for you, becauseas the data changes in your tables, it could be quite possible that in future Hash Join might be the best and cheapest way. Can you try reproducing it with small data or else can you attach your schema and data for the tables/indexes used in query? With Regards, Amit Kapila.
Looking at the execution plan makes me wonder what your work_mem is set to. Try cranking it up to test and lowering random_page_cost: set work_mem='500MB'; set random_page_cost=1.2; explain analyze select ... and see what you get.