BUG #5885: Strange rows estimation for left join
От | Maxim Boguk |
---|---|
Тема | BUG #5885: Strange rows estimation for left join |
Дата | |
Msg-id | 201102150243.p1F2hHge058708@wwwmaster.postgresql.org обсуждение исходный текст |
Ответы |
Re: BUG #5885: Strange rows estimation for left join
|
Список | pgsql-bugs |
The following bug has been logged online: Bug reference: 5885 Logged by: Maxim Boguk Email address: Maxim.Boguk@gmail.com PostgreSQL version: 8.4.4 Operating system: Linux Description: Strange rows estimation for left join Details: I found that strange effect while helping with slow query on russian postgresql online forum. To be short results looks like: explain analyze select * from references rs left join references vm on vm.reference = rs.reference and vm.attr_id = 20084 where rs.object_id = 9129863193713091717; "Nested Loop Left Join (cost=0.00..1567.71 rows=2129718 width=112) (actual time=14.654..14.672 rows=2 loops=1)" " -> Index Scan using xif02references on references rs (cost=0.00..19.85 rows=11 width=56) (actual time=0.019..0.020 rows=2 loops=1)" " Index Cond: (object_id = 9129863193713091717::numeric)" " -> Index Scan using xif01references on references vm (cost=0.00..140.03 rows=55 width=56) (actual time=7.321..7.321 rows=0 loops=2)" " Index Cond: ((vm.reference = rs.reference) AND (vm.attr_id = 20084::numeric))" "Total runtime: 14.758 ms" E.g. 6 orders of magnitude error in result rows selectivity (2129718 vs 2). Table description: CREATE TABLE references ( attr_id numeric(20,0) NOT NULL, reference numeric(20,0) NOT NULL, object_id numeric(20,0) NOT NULL ); CREATE INDEX xif01references ON references USING btree (reference, attr_id); CREATE INDEX xif02references ON references USING btree (object_id, attr_id, reference); Yes again unfortunate EAV model. Related data from pg_stats "schemaname" "public" "tablename" "references" "attname" "attr_id" "null_frac" 0 "avg_width" 10 "n_distinct" 23 "most_common_vals" "{20084;20085;9127479646713033746;11;14;15;9127479093313033036;457955;91273 06545213993525}" "most_common_freqs" "{0.234067;0.233433;0.227433;0.0998333;0.0476667;0.0449; 0.0361;0.0273333;0.0247}" d"attname" "reference" "null_frac" 0 "avg_width" 14 "n_distinct" 56252 "most_common_vals" "{9129830887313872119;9129830887313872121;9129787365613945251;9129676282313 943149;24332313945759;...}" "most_common_freqs" "{0.2497;0.0138333;0.0002;0.000166667;0.000166667;...}" "attname" "object_id" "null_frac" 0 "avg_width" 15 "n_distinct" 1.23744e+06 "most_common_vals" "{9129846527513534962;9129846051413838763;9129846154413978095;9129846403613 302858;9129846025513792413;...}" "most_common_freqs" "{0.0003;0.000266667;0.000266667;0.000233333;0.0002;0.00 02;...}" What seems very strange for me is final esctimation 2M when estimated 11 rows on right side of the join and 55 rows on left side of the join (so final rows must be 55*11 ~ 600 rows even with worst case of left join selectivity=1). May be that effect already fixed in 8.4.5+ (but I had readed changes list since 8.4.4 and I don't find any related changes). PS: Sorry if I completely missunderstood of planner mechanics in left join rows estimation.
В списке pgsql-bugs по дате отправления: