Hi everybody,
I'm having a performance problem: PostgreSQL (7.3.2) is skipping an
optimisation option that it shouldn't, IMO. It can be fully reproduced
as follows:
create table foo (
    bar char(100),
    baz integer
);
Now create a file containing 1.2 million empty lines and do a \copy foo
(bar) from 'thatfile'. This fills the table with 1.2 million rows, all
of which have baz set to null. (A pure-SQL alternative is sketched after
the setup steps below.) Now do:
insert into foo (baz) values (28);
create index foo_idx on foo(baz);
vacuum full analyze foo;
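(Incidentally, if you'd rather not create the file, a repeated
self-insert ought to produce the same table contents. This is only a
sketch, to be used in place of the \copy step:

-- Seed a single row with bar empty and baz null.
insert into foo (bar) values ('');
-- Re-run this until the table is big enough; each run doubles the
-- row count, so about 21 runs takes you past 1.2 million rows.
insert into foo (bar) select bar from foo;

That should leave you with the same mostly-null table.)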
Now, one would expect PostgreSQL to be fully aware, after the analyze,
that very few rows in foo satisfy "baz is not null". However:
bsamwel=> explain update foo set baz=null where baz is not null;
QUERY PLAN
---------------------------------------------------------------
Seq Scan on foo (cost=0.00..34470.09 rows=1286146 width=110)
Filter: (baz IS NOT NULL)
(2 rows)
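(The statistics themselves ought to reflect this, by the way. Assuming
I'm reading the pg_stats view correctly, the recorded null fraction for
the column can be checked with:

select null_frac, n_distinct
from pg_stats
where tablename = 'foo' and attname = 'baz';

With only one non-null value among 1.2 million rows, null_frac should
come out at essentially 1.0 after the vacuum analyze.)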
So it thinks it must do a sequential scan on foo, even though it should
know by now that foo.baz is almost entirely null. Even when I disable
sequential scans it still chooses this plan! Why doesn't it use the
index? It doesn't use the index either when I merely select, rather than
update, the rows that are not null; the exact commands I tried are below.
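(For reference, this is roughly what I ran to disable sequential scans:

set enable_seqscan to off;
explain select * from foo where baz is not null;
explain update foo set baz = null where baz is not null;
set enable_seqscan to on;

Both plans still come out as sequential scans, just with a huge penalty
cost added, if I'm reading the output correctly.)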
For completeness' sake, here is the explain analyze output:
bsamwel=> explain analyze update foo set baz=null where baz is not null;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..34470.09 rows=1286146 width=110) (actual time=19678.82..19678.84 rows=1 loops=1)
Filter: (baz IS NOT NULL)
Total runtime: 19750.21 msec
(3 rows)
Do you guys have any idea?
Regards,
Bart