BUG #5885: Strange rows estimation for left join

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема BUG #5885: Strange rows estimation for left join
Дата
Msg-id 201102150243.p1F2hHge058708@wwwmaster.postgresql.org
обсуждение исходный текст
Ответы Re: BUG #5885: Strange rows estimation for left join
Список pgsql-bugs
The following bug has been logged online:

Bug reference:      5885
Logged by:          Maxim Boguk
Email address:      Maxim.Boguk@gmail.com
PostgreSQL version: 8.4.4
Operating system:   Linux
Description:        Strange rows estimation for left join
Details:

I found that strange effect while helping with slow query on russian
postgresql online forum.

To be short results looks like:

explain analyze
select *
from references rs
left join references vm on vm.reference = rs.reference and vm.attr_id =
20084
where rs.object_id = 9129863193713091717;

"Nested Loop Left Join  (cost=0.00..1567.71 rows=2129718 width=112) (actual
time=14.654..14.672 rows=2 loops=1)"
"  ->  Index Scan using xif02references on references rs  (cost=0.00..19.85
rows=11 width=56) (actual time=0.019..0.020 rows=2 loops=1)"
"        Index Cond: (object_id = 9129863193713091717::numeric)"
"  ->  Index Scan using xif01references on references vm  (cost=0.00..140.03
rows=55 width=56) (actual time=7.321..7.321 rows=0 loops=2)"
"        Index Cond: ((vm.reference = rs.reference) AND (vm.attr_id =
20084::numeric))"
"Total runtime: 14.758 ms"

E.g. 6 orders of magnitude error in result rows selectivity (2129718 vs 2).

Table description:

CREATE TABLE references
(
  attr_id numeric(20,0) NOT NULL,
  reference numeric(20,0) NOT NULL,
  object_id numeric(20,0) NOT NULL
);

CREATE INDEX xif01references
  ON references
  USING btree
  (reference, attr_id);

CREATE INDEX xif02references
  ON references
  USING btree
  (object_id, attr_id, reference);

Yes again unfortunate EAV model.

Related data from pg_stats

"schemaname"           "public"
"tablename"            "references"

"attname"              "attr_id"
"null_frac"            0
"avg_width"            10
"n_distinct"           23
"most_common_vals"
    "{20084;20085;9127479646713033746;11;14;15;9127479093313033036;457955;91273
06545213993525}"
"most_common_freqs"    "{0.234067;0.233433;0.227433;0.0998333;0.0476667;0.0449;
0.0361;0.0273333;0.0247}"

d"attname"              "reference"
"null_frac"            0
"avg_width"            14
"n_distinct"           56252
"most_common_vals"
    "{9129830887313872119;9129830887313872121;9129787365613945251;9129676282313
943149;24332313945759;...}"
"most_common_freqs"    "{0.2497;0.0138333;0.0002;0.000166667;0.000166667;...}"

"attname"              "object_id"
"null_frac"            0
"avg_width"            15
"n_distinct"           1.23744e+06
"most_common_vals"
    "{9129846527513534962;9129846051413838763;9129846154413978095;9129846403613
302858;9129846025513792413;...}"
"most_common_freqs"    "{0.0003;0.000266667;0.000266667;0.000233333;0.0002;0.00
02;...}"

What seems very strange for me is final esctimation 2M when estimated 11
rows on right side of the join and 55 rows on left side of the join (so
final rows must be 55*11 ~ 600 rows even with worst case of left join
selectivity=1).

May be that effect already fixed in 8.4.5+ (but I had readed changes list
since 8.4.4 and I don't find any related changes).

PS: Sorry if I completely missunderstood of planner mechanics in left join
rows estimation.

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: BUG #5884: i cant select entire array values
Следующее
От: Mike Fowler
Дата:
Сообщение: Re: BUG #5859: XML result in line and column