Discussion: Change query join order
Hi,

I have a simple query with two tables:

  ms_data             ~ 4,500,000 rows
  ms_commands_history ~   500,000 rows

I have run ANALYZE and there are indexes. My question is: why didn't the
planner do the index scan on ms_data first, to reduce the rows to ~11,000,
and then use the PK index on ms_commands_history? If I read the EXPLAIN
correctly, it first does the seq scan on ms_commands_history and then the
index scan on ms_data.

Any ideas? Thanks in advance.

Kaloyan Iliev

SELECT version();
                                                  version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.0 on i386-portbld-freebsd7.2, compiled by GCC cc (GCC) 4.2.1 20070719 [FreeBSD], 32-bit
(1 row)

explain analyze
SELECT COUNT(*) AS count
  FROM ms_data AS DT,
       ms_commands_history AS CH
 WHERE DT.ms_command_history_id = CH.id
   AND CH.ms_device_id = 1
   AND DT.ms_value_type_id = 1
   AND DT.meassure_date::date >= '2010-01-01'
   AND DT.meassure_date::date <= '2010-01-08';

                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=88778.73..88778.74 rows=1 width=0) (actual time=16979.109..16979.112 rows=1 loops=1)
   ->  Hash Join  (cost=63056.45..88750.77 rows=11183 width=0) (actual time=13774.132..16958.507 rows=11093 loops=1)
         Hash Cond: (dt.ms_command_history_id = ch.id)
         ->  Index Scan using ms_data_meassure_date_idx on ms_data dt  (cost=0.01..23485.68 rows=11183 width=8) (actual time=58.869..2701.928 rows=11093 loops=1)
               Index Cond: (((meassure_date)::date >= '2010-01-01'::date) AND ((meassure_date)::date <= '2010-01-08'::date))
               Filter: (ms_value_type_id = 1)
         ->  Hash  (cost=55149.22..55149.22 rows=481938 width=8) (actual time=13590.853..13590.853 rows=481040 loops=1)
               ->  Seq Scan on ms_commands_history ch  (cost=0.00..55149.22 rows=481938 width=8) (actual time=0.078..12321.037 rows=481040 loops=1)
                     Filter: (ms_device_id = 1)
 Total runtime: 16979.326 ms
(10 rows)
Kaloyan Iliev Iliev <kaloyan@digsys.bg> writes:
> My question is why the planner didn't do the index scan first on ms_data
> to reduce the rows to ~11,000 and then use the PK index on
> ms_commands_history.

11000 index probes aren't exactly free.  If they take more than about
1msec apiece, the planner picked the right plan.

			regards, tom lane
On Fri, Jan 8, 2010 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kaloyan Iliev Iliev <kaloyan@digsys.bg> writes:
>> My question is why the planner didn't do the index scan first on ms_data
>> to reduce the rows to ~11,000 and then use the PK index on
>> ms_commands_history.
>
> 11000 index probes aren't exactly free.  If they take more than about
> 1msec apiece, the planner picked the right plan.

The OP could try setting enable_hashjoin to false (just for testing,
never for production) and run EXPLAIN ANALYZE again.  That might
generate the desired plan, and we could see which one is actually
faster.

If the other plan does turn out to be faster (and I agree with Tom
that there is no guarantee of that), then one thing to check is
whether seq_page_cost and random_page_cost are set too high.  If the
data is all cached, the default values of 4 and 1 are three orders of
magnitude too large, and they should also be set to equal rather than
unequal values.

...Robert
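[For the archives: in a psql session, the test described above would look
something like the following sketch. It reuses the table and column names
from the original post; the SET is session-local, so closing the
connection undoes it.]

```sql
-- Disable hash joins for this session only, so the planner must
-- pick its next-best strategy (never set this in postgresql.conf).
SET enable_hashjoin = off;

-- Re-run the original query and compare actual times with the
-- hash-join plan from the first EXPLAIN ANALYZE.
EXPLAIN ANALYZE
SELECT COUNT(*) AS count
  FROM ms_data AS DT,
       ms_commands_history AS CH
 WHERE DT.ms_command_history_id = CH.id
   AND CH.ms_device_id = 1
   AND DT.ms_value_type_id = 1
   AND DT.meassure_date::date >= '2010-01-01'
   AND DT.meassure_date::date <= '2010-01-08';

-- Restore the default when done testing.
RESET enable_hashjoin;
```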
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jan 8, 2010 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 11000 index probes aren't exactly free.  If they take more than about
>> 1msec apiece, the planner picked the right plan.

> The OP could try setting enable_hashjoin to false (just for testing,
> never for production) and do EXPLAIN ANALYZE again.  That might
> generate the desired plan, and we could see which one is actually
> faster.

Right, sorry for the overly brief response.  It might switch to a
merge join next, in which case try enable_mergejoin = off as well.

> If the other plan does turn out to be faster (and I agree with Tom
> that there is no guarantee of that), then one thing to check is
> whether seq_page_cost and random_page_cost are set too high.  If the
> data is all cached, the default values of 4 and 1 are three orders of
> magnitude too large, and they should also be set to equal rather than
> unequal values.

Tweaking the cost parameters to suit your local situation is the
recommended cure for planner misjudgments; but I'd recommend against
changing them on the basis of only one example.  You could easily
find yourself making other cases worse.  Get a collection of common
queries for your app and look at the overall effects.

			regards, tom lane
On Fri, Jan 8, 2010 at 2:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If the other plan does turn out to be faster (and I agree with Tom
>> that there is no guarantee of that), then one thing to check is
>> whether seq_page_cost and random_page_cost are set too high.  If the
>> data is all cached, the default values of 4 and 1 are three orders of
>> magnitude too large, and they should also be set to equal rather than
>> unequal values.
>
> Tweaking the cost parameters to suit your local situation is the
> recommended cure for planner misjudgments; but I'd recommend against
> changing them on the basis of only one example.  You could easily
> find yourself making other cases worse.  Get a collection of common
> queries for your app and look at the overall effects.

No argument, and well said -- just trying to point out that the
default values really are FAR too high for people with databases that
fit in OS cache.

...Robert
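[For the archives: experimenting with the cost parameters can likewise be
done per-session before touching postgresql.conf. The specific values
below are illustrative, not recommendations from the thread; equal,
sub-1.0 values model a dataset that is fully cached in RAM.]

```sql
-- Session-local experiment: tell the planner that sequential and
-- random page fetches cost the same (plausible for cached data).
SET seq_page_cost = 0.1;
SET random_page_cost = 0.1;

-- Re-run EXPLAIN ANALYZE on a representative set of your
-- application's queries here and compare plans and timings.

-- Back out the experiment for this session.
RESET seq_page_cost;
RESET random_page_cost;
```

Only after the settings help across the whole query mix would it make
sense to persist them in postgresql.conf.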
Thank you,

I changed random_page_cost to 2 and the query plan has changed and
sped up. I will check the other queries, but I think I will leave it
at this value.

Thank you again.
Kaloyan Iliev

Robert Haas wrote:
> On Fri, Jan 8, 2010 at 2:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> If the other plan does turn out to be faster (and I agree with Tom
>>> that there is no guarantee of that), then one thing to check is
>>> whether seq_page_cost and random_page_cost are set too high.  If the
>>> data is all cached, the default values of 4 and 1 are three orders of
>>> magnitude too large, and they should also be set to equal rather than
>>> unequal values.
>>
>> Tweaking the cost parameters to suit your local situation is the
>> recommended cure for planner misjudgments; but I'd recommend against
>> changing them on the basis of only one example.  You could easily
>> find yourself making other cases worse.  Get a collection of common
>> queries for your app and look at the overall effects.
>
> No argument, and well said -- just trying to point out that the
> default values really are FAR too high for people with databases that
> fit in OS cache.
>
> ...Robert