Ron Mayer <ron@intervideo.com> writes:
> I did quite a bit more playing with this, and no matter what the
> correlation was (1, -0.001), it never seemed to have any effect
> at all on the execution plan.
> Should it? With a high correlation the index scan is a much better choice.
I'm confused. Your examples show the planner correctly estimating the
indexscan as much cheaper than the seqscan.
> logs2=# explain analyze select count(*) from fact_by_dat where dat='2002-03-01';
> NOTICE: QUERY PLAN:
> Aggregate (cost=380347.31..380347.31 rows=1 width=0) (actual time=77785.14..77785.14 rows=1 loops=1)
> -> Seq Scan on fact (cost=0.00..379816.25 rows=212423 width=0) (actual time=20486.16..77420.05 rows=180295 loops=1)
> Total runtime: 77785.28 msec
Cut-and-paste mistake here somewhere, perhaps? The plan refers to fact,
not fact_by_dat.
regards, tom lane