Re: Slow query with planner row strange estimation
От | damien hostin |
---|---|
Тема | Re: Slow query with planner row strange estimation |
Дата | |
Msg-id | 4C318983.6040600@axege.com обсуждение исходный текст |
Ответ на | Slow query with planner row strange estimation (damien hostin <damien.hostin@axege.com>) |
Ответы |
Re: Slow query with planner row strange estimation
(damien hostin <damien.hostin@axege.com>)
|
Список | pgsql-performance |
Hello, Before the week end I tried to change the index, but even with the mono-column index on differents columns, the estimated number of rows from dwhinv is 1. Anyone have a suggestion, what can I check ? thx damien hostin a écrit : > Hello, > > I try to make a query run quicker but I don't really know how to give > hints to the planner. > > We are using postgresql 8.4.3 64bit on ubuntu 9.10 server. The > hardware is a 10 SAS drive (15k) on a single RAID 10 array with 8Go RAM. > Queries come from J2EE application (OLAP cube), but running them in > pg_admin perform the same way. > > I made a short example that shows what I think is the problem. The > real query is much longer but with only one join it already cause > problems. > > Here is the short example : > > select rfoadv_8.rfoadvsup as c8, > sum(dwhinv.dwhinvqte) as m0 > from > dwhinv as dwhinv, > rfoadv as rfoadv_8 > where (dwhinv.dwhinv___rforefide = 'HPLUS' > and (dwhinv.dwhinv___rfodomide = 'PMSI' and > dwhinv.dwhinv___rfoindrvs = '1' and > dwhinv.dwhinv___rfoindide='recN3_BB_reel') ) > and dwhinv.dwhinv_p2rfodstide = rfoadv_8.rfoadvinf > and rfoadv_8.rfoadvsup = 'ACTI' > group by rfoadv_8.rfoadvsup > > dwhinv is a table with almost 6.000.000 records > rfoadv is a view with 800.000 records > rfoadv is based on rfoade which is 50.000 records > > Here is the explain analyse : > GroupAggregate (cost=0.00..16.56 rows=1 width=13) (actual > time=2028.452..2028.453 rows=1 loops=1) > -> Nested Loop (cost=0.00..16.54 rows=1 width=13) (actual > time=0.391..1947.432 rows=42664 loops=1) > Join Filter: (((ade2.rfoadegch)::text >= > (ade1.rfoadegch)::text) AND ((ade2.rfoadedrt)::text <= > (ade1.rfoadedrt)::text)) > -> Nested Loop (cost=0.00..12.54 rows=1 width=214) (actual > time=0.304..533.281 rows=114350 loops=1) > -> Index Scan using dwhinv_rdi_idx on dwhinv > (cost=0.00..4.87 rows=1 width=12) (actual time=0.227..16.827 rows=6360 > loops=1) > Index Cond: (((dwhinv___rforefide)::text = > 'HPLUS'::text) AND ((dwhinv___rfodomide)::text = 'PMSI'::text) AND > ((dwhinv___rfoindide)::text = 'recN3_BB_reel'::text) AND > (dwhinv___rfoindrvs = 1)) > -> Index Scan using rfoade_dsi_idx on rfoade ade2 > (cost=0.00..7.63 rows=3 width=213) (actual time=0.007..0.037 rows=18 > loops=6360) > Index Cond: ((ade2.rfoade_i_rfodstide)::text = > (dwhinv.dwhinv_p2rfodstide)::text) > -> Index Scan using rfoade_pk on rfoade ade1 (cost=0.00..3.98 > rows=1 width=213) (actual time=0.008..0.009 rows=0 loops=114350) > Index Cond: (((ade1.rfoade___rforefide)::text = > (ade2.rfoade___rforefide)::text) AND ((ade1.rfoade_i_rfodstide)::text > = 'ACTI'::text) AND ((ade1.rfoade___rfovdeide)::text = > (ade2.rfoade___rfovdeide)::text) AND (ade1.rfoadervs = ade2.rfoadervs)) > > We can see that the planner think that accessing dwhinv with the > dwhinv_rdi_idx index will return 1 row, but in fact there are 6360. So > the nested loop is not done with 1 loop but 6360. With only one Join, > the query runs in about 1.5 sec which is not really long, but with 8 > join, the same mistake is repeated 8 times, the query runs in 30-60 > sec. I try to disable nested loop, hash join and merge join are done > instead of nested loops, example query runs in 0.2 - 0.5 sec, and the > real query no more that 1 sec ! Which is great. > > Here is the execution plan with nested loop off: > > GroupAggregate (cost=12.56..2453.94 rows=1 width=13) (actual > time=817.306..817.307 rows=1 loops=1) > -> Hash Join (cost=12.56..2453.93 rows=1 width=13) (actual > time=42.583..720.746 rows=42664 loops=1) > Hash Cond: (((ade2.rfoade___rforefide)::text = > (ade1.rfoade___rforefide)::text) AND ((ade2.rfoade___rfovdeide)::text > = (ade1.rfoade___rfovdeide)::text) AND (ade2.rfoadervs = ade1.rfoadervs)) > Join Filter: (((ade2.rfoadegch)::text >= > (ade1.rfoadegch)::text) AND ((ade2.rfoadedrt)::text <= > (ade1.rfoadedrt)::text)) > -> Hash Join (cost=4.88..2446.21 rows=1 width=214) (actual > time=42.168..411.962 rows=114350 loops=1) > Hash Cond: ((ade2.rfoade_i_rfodstide)::text = > (dwhinv.dwhinv_p2rfodstide)::text) > -> Seq Scan on rfoade ade2 (cost=0.00..2262.05 > rows=47805 width=213) (actual time=0.057..78.988 rows=47805 loops=1) > -> Hash (cost=4.87..4.87 rows=1 width=12) (actual > time=41.632..41.632 rows=6360 loops=1) > -> Index Scan using dwhinv_rdi_idx on dwhinv > (cost=0.00..4.87 rows=1 width=12) (actual time=0.232..28.199 rows=6360 > loops=1) > Index Cond: (((dwhinv___rforefide)::text = > 'HPLUS'::text) AND ((dwhinv___rfodomide)::text = 'PMSI'::text) AND > ((dwhinv___rfoindide)::text = 'recN3_BB_reel'::text) AND > (dwhinv___rfoindrvs = 1)) > -> Hash (cost=7.63..7.63 rows=3 width=213) (actual > time=0.347..0.347 rows=11 loops=1) > -> Index Scan using rfoade_dsi_idx on rfoade ade1 > (cost=0.00..7.63 rows=3 width=213) (actual time=0.095..0.307 rows=11 > loops=1) > Index Cond: ((rfoade_i_rfodstide)::text = > 'ACTI'::text) > > Even if dwhinv row estimation is wrong, the query is quicker > > > So after looking at dwhinv_rdi_idx statistics, I found that > dwhinv___rfoindide related stats wasn't good, so I try "ALTER TABLE > dwhinv ALTER dwhinv_p2rfodstide SET STATISTICS 2000" and launch an > vaccum analyse to gather more impressive stats. Stats are better but > query plan is the same and query is not optimised. So I try reindex on > DWHINV as a last chance, but it changes nothing ! > > Maybe I'm wrong with the interpretation of the plan but I don't really > think so because with no nested loops this query is really fast ! I do > not plan to disable nested loop on the whole database because > sometimes, nested loops are greats ! > > Now I'm stuck ! I don't know how to make the planner understand there > are 6000 rows. Or maybe the 3 column index is a bad idea... ?! > > Thanks > -- HOSTIN Damien - Equipe R&D Tel:+33(0)4 63 05 95 40 Société Axège 23 rue Saint Simon 63000 Clermont Ferrand www.axege.com
В списке pgsql-performance по дате отправления:
Следующее
От: "Octavio Alvarez"Дата:
Сообщение: Two "equivalent" WITH RECURSIVE queries, one of them slow.