Обсуждение: Fixed Cardinality estimation with equality predicates between column of the same table

Поиск
Список
Период
Сортировка

Fixed Cardinality estimation with equality predicates between column of the same table

От
desmodemone
Дата:
Hi all,
          I see a strange behavior ( for me ) on 9.2 (but seems the same on 9.1 and 9.3)  of the optimizer on query like that :

/* create a table with random data and 20000 rows */

create table test1 ( id int not null primary key, state1 int not null default 0, state2 int not null default 0, state3 int not null default 0 );

 insert into test1 (id, state1, state2, state3) select i, (random()*3)::int, (random())::int, (random()*100)::int from generate_series (1, 20000) as gs(i) ;

analyze test1 ;

/* between  same columns  */

explain  select * from test1 where state1=state1 ;
                        QUERY PLAN                       
----------------------------------------------------------
 Seq Scan on test1  (cost=0.00..359.00 rows=100 width=16)
   Filter: (state1 = state1)
(2 rows)

test3=# explain  select * from test1 where state2=state2 ;
                        QUERY PLAN                       
----------------------------------------------------------
 Seq Scan on test1  (cost=0.00..359.00 rows=100 width=16)
   Filter: (state2 = state2)
(2 rows)

/* between different columns of same table  */

test3=# explain  select * from test1 where state1=state2 ;
                        QUERY PLAN                       
----------------------------------------------------------
 Seq Scan on test1  (cost=0.00..359.00 rows=100 width=16)
   Filter: (state1 = state2)
(2 rows)

===================================================================

/* create a table with random data and 100000 rows to verify  */

create table test2 ( id int not null primary key, state1 int not null default 0, state2 int not null default 0, state3 int not null default 0 );

 insert into test2 (id, state1, state2, state3) select i, (random()*3)::int, (random())::int, (random()*100)::int from generate_series (1, 100000) as gs(i) ;

test3=#  analyze  test2 ;
ANALYZE
test3=# explain  select * from test2 where state1=state3;                        QUERY PLAN                        
-----------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1791.00 rows=500 width=16)
   Filter: (state1 = state3)
(2 rows)

test3=# explain  select * from test2 where state1=state2;
                        QUERY PLAN                        
-----------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1791.00 rows=500 width=16)
   Filter: (state1 = state2)
(2 rows)

test3=# explain  select * from test2 where state1=state1;
                        QUERY PLAN                        
-----------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1791.00 rows=500 width=16)
   Filter: (state1 = state1)
(2 rows)



It's seems always 0.5% of the rows , and it seems indipendent of the type of data you have in row :

/*add a column where costant value named c3 */

 alter table test1 add c3 int default 1 ;
ALTER TABLE

analyze test1 ;
ANALYZE

explain  select * from test1  where state1=c3;
                        QUERY PLAN                       
----------------------------------------------------------
 Seq Scan on test1  (cost=0.00..378.00 rows=100 width=20)
   Filter: (state1 = c3)
(2 rows)

/*add a column where costant value named c3 */

 alter table test2 add c3 int default 1 ;
ALTER TABLE
 analyze test2 ;
ANALYZE
 explain  select * from test2  where state1=c3;
                        QUERY PLAN                        
-----------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1887.00 rows=500 width=20)
   Filter: (state1 = c3)
(2 rows)

/* add another constant column */

test3=# alter table test2 add c4 int default 1 ;
ALTER TABLE
test3=# analyze test2 ;
ANALYZE
test3=# explain  select * from test2  where c3=c4 ;
                        QUERY PLAN                        
-----------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1887.00 rows=500 width=24)
   Filter: (c3 = c4)

obviously the statistics are ok :



Always 0.5%.

Greetings

Matteo

Re: Fixed Cardinality estimation with equality predicates between column of the same table

От
Josh Berkus
Дата:
On 06/21/2013 02:33 PM, desmodemone wrote:
> Hi all,
>           I see a strange behavior ( for me ) on 9.2 (but seems the same on
> 9.1 and 9.3)  of the optimizer on query like that :
> 

Matteo, I just posted this on -performance.  See Tom's answer.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com