Hello-
In rewriting some queries I noticed a huge performance penalty when
using a "date + interval" summation in the SELECT statement, versus a
single simple "date". It is almost as though the "date + interval" is
being calculated for each row...
Much easier to demonstrate than it is to explain (note the runtimes of
the queries, 72ms versus 482ms):
===============================================
main_v0_8=# explain analyze select count(*) from sample_tracker where
initdate>=date '2002-02-01';
NOTICE: QUERY PLAN:
Aggregate (cost=607.24..607.24 rows=1 width=0) (actual
time=72.08..72.08 rows=1 loops=1)
-> Seq Scan on sample_tracker (cost=0.00..595.74 rows=4600 width=0)
(actual time=0.04..63.62 rows=4266 loops=1)
Total runtime: 72.20 msec
EXPLAIN
===============================================
main_v0_8=# explain analyze select count(*) from sample_tracker where
initdate>=date '2002-01-01' + interval '1 month';
NOTICE: QUERY PLAN:
Aggregate (cost=738.23..738.23 rows=1 width=0) (actual
time=482.49..482.49 rows=1 loops=1)
-> Seq Scan on sample_tracker (cost=0.00..723.98 rows=5700 width=0)
(actual time=0.13..470.94 rows=4266 loops=1)
Total runtime: 482.62 msec
EXPLAIN
===============================================
main_v0_8=# explain ANALYZE select date '2002-01-01' + interval '1 month';
NOTICE: QUERY PLAN:
Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.04..0.05 rows=1
loops=1)
Total runtime: 0.09 msec
===============================================
main_v0_8=# select version();
version
-------------------------------------------------------------
PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96
Seems like this could be something ripe for optimization...
--
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
Jon Lapham
Extracta Moléculas Naturais, Rio de Janeiro, Brasil
email: lapham@extracta.com.br web: http://www.extracta.com.br/
***-*--*----*-------*------------*--------------------*---------------