Обсуждение: Another "why am I not using the indicies?"
Here is all the information about the system and the steps I took. If you
can tell me or point me in the right direction I would appreciate it. There
are 624306 records in this table, so it only takes a few seconds, but this
is going to grow to 5M rows/month when this works properly.
Clinton James
#select version();
version
----------------------------------------------------------------------
PostgreSQL 7.1rc4 on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66
(1 row)
#CREATE TABLE ld (
billmonth CHAR(4),
cust_num CHAR(8),
location CHAR(4),
date TIMESTAMP,
orignum CHAR(10),
destnum CHAR(13),
destcity CHAR(10),
deststate CHAR(2),
duration NUMERIC(8,1),
callcode CHAR(9),
cost NUMERIC(10,4)
);
#copy ld from '/tmp/ld0104.txt';
#create index ld_custnum_idx ON ld(cust_num);
#create index ld_orignum_idx ON ld(orignum);
#vacuum analyze;
# explain select * from ld where cust_num = '10102899';
NOTICE: QUERY PLAN:
Seq Scan on ld (cost=0.00..20810.83 rows=8989 width=128)
"Clinton James" <cjames@callone.net> writes:
> # explain select * from ld where cust_num = '10102899';
> NOTICE: QUERY PLAN:
> Seq Scan on ld (cost=0.00..20810.83 rows=8989 width=128)
The reason it's not using an indexscan is that it thinks it's going to
find 8989 matches (and if that were true, a seqscan probably *is* the
right plan). You must have some extremely-often-repeated entries in
that column; what are they, and can you get rid of them?
regards, tom lane
"Clinton James" <cjames@callone.net> writes:
> True, using a different cust_num, it is possible there may be a couple of
> cust_num where that would be true. The majority will not have even near
> that many. In this case there are only 13 matching records. Is VACUUMDB
> ANALYZE using the worst case (the cust_num with the greatest number of rows)
> senario for this?
Yes, because that's as far as the statistics go at the moment :-(.
The frequency of the most common value is basically all the info the
planner has about the data distribution, and so a highly skewed
distribution leads to bad plan choices.
I am presently working on better statistics ... to appear in 7.2 ...
regards, tom lane