We used to have to force sequential scans to be disabled because of a very
non-uniform distribution of keys in an index, to actually use the index. We are
a music site and a very large number of keys simply point to a catch-all of
"Various Artists" or "Soundtrack." The 7.2 beta's statistics and optimizer
seems very much better than previous versions of PostgreSQL. Great job guys!
The table:
cdinfo=# select count(*) from zsong ; count
---------3840513
(1 row)
cdinfo=# select artistid, count(artistid) from zsong group by artistid order by
count(artistid) desc limit 2;artistid | count
-----------+--------100050450 | 461727100036031 | 54699
(2 rows)
In PostgreSQL 7.1.2:
cdinfo=# select version() ; version
---------------------------------------------------------------------PostgreSQL 7.1.2 on i686-pc-linux-gnu, compiled by
GCCegcs-2.91.66
(1 row)
cdinfo=# explain select count(*) from zsong where artistid = 1 ;
NOTICE: QUERY PLAN:
Aggregate (cost=93874.21..93874.21 rows=1 width=0) -> Seq Scan on zsong (cost=0.00..93769.55 rows=41863 width=0)
EXPLAIN
cdinfo=# explain select count(*) from zsong where artistid = 100050450;
NOTICE: QUERY PLAN:
Aggregate (cost=94816.11..94816.11 rows=1 width=0) -> Seq Scan on zsong (cost=0.00..93769.55 rows=418625 width=0)
EXPLAIN
In PostgreSQL 7.2b1
cdinfo=# select version(); version
-------------------------------------------------------------PostgreSQL 7.2b1 on i686-pc-linux-gnu, compiled by GCC
2.96
(1 row)
cdinfo=# explain select count(*) from zsong where artistid = 1 ;
NOTICE: QUERY PLAN:
Aggregate (cost=80.10..80.10 rows=1 width=0) -> Index Scan using zsong_artistid on zsong (cost=0.00..80.00 rows=39
width=0)
EXPLAIN
cdinfo=# explain select count(*) from zsong where artistid = 100050450;
NOTICE: QUERY PLAN:
Aggregate (cost=94899.78..94899.78 rows=1 width=0) -> Seq Scan on zsong (cost=0.00..93664.41 rows=494146 width=0)
EXPLAIN