PG7.4.5: query not using index on date column

Поиск
Список
Период
Сортировка
От Dave Steinberg
Тема PG7.4.5: query not using index on date column
Дата
Msg-id 20041123092816.51d7aa5e.dave@redterror.net
обсуждение исходный текст
Ответы Re: PG7.4.5: query not using index on date column
Список pgsql-sql
Hi Folks, I was hoping someone could help me to improve the performance of a query I've got that insists on doing a
seq.scan on a large table.  I'm trying to do some reporting based on my spam logs which I've partly digested and stored
ina table.  Here are the particulars:
 

The messages table:
     Column       |         Type          |  Modifiers 
-------------------+-----------------------+--------------message_id        | integer               | not null default
nextval('spamreport.messages_message_id_seq'::text)received_date    | date                  | not nullscore
|numeric               | not nulluser_threshold    | numeric               | not nullraw_email_address | character
varying(64)| not nullprocessing_time   | numeric               | not nullsize              | integer               |
notnullfuzzed_address    | character varying(64) | not nulldomain            | character varying(64) | not null
 
Indexes:   "messages_pkey" primary key, btree (message_id)   "domain_idx" btree ("domain")   "fuzzy_idx" btree
(fuzzed_address)  "received_date_idx" btree (received_date)
 

And here's the primary query I run, along with explain analyze output:

>> explain analyze SELECT * FROM ( SELECT       domain,       count(*) as count,       max(score) as max_score,
avg(score)as average_score,       stddev(score) as stddev_score,       sum(CASE WHEN score > user_threshold THEN 1 ELSE
0END) as spam_count,       avg(processing_time) as average_time,       avg(size) as average_size     FROM messages
WHEREreceived_date BETWEEN '2004-11-01' AND '2004-11-30'       GROUP BY domain ) AS aggs       ORDER BY count DESC;
 
  QUERY PLAN                                                              
-------------------------------------------------------------------Sort  (cost=30303.51..30303.60 rows=35 width=226)
(actualtime=29869.716..29869.883 rows=69 loops=1)  Sort Key: count  ->  Subquery Scan aggs  (cost=30301.56..30302.61
rows=35width=226) (actual time=29861.705..29869.240 rows=69 loops=1)        ->  HashAggregate  (cost=30301.56..30302.26
rows=35width=54) (actual time=29861.681..29868.261 rows=69 loops=1)              ->  Seq Scan on messages
(cost=0.00..21573.04rows=436426 width=54) (actual time=5.523..6304.657 rows=462931 loops=1)                    Filter:
((received_date>= '2004-11-01'::date) AND (received_date <= '2004-11-30'::date))Total runtime: 29870.437 ms
 

This database gets vacuumed nightly.  Also, the query plan stays the same even if I restrict the received_date column
downto a single day.
 

Thanks in advance,
-- 
Dave Steinberg
http://www.geekisp.com/


В списке pgsql-sql по дате отправления:

Предыдущее
От: terry@greatgulfhomes.com
Дата:
Сообщение: Re: NULLS and string concatenation
Следующее
От: Tom Lane
Дата:
Сообщение: Re: User defined types