Обсуждение: Timestamp indexes (why ">" or "between" does not use index?)

Поиск
Список
Период
Сортировка

Timestamp indexes (why ">" or "between" does not use index?)

От
"Phoenix Kiula"
Дата:
I have a table with an index on a field called "modify_date".

This works well if I have SQL which ends in

    WHERE modify_date = '2008-01-01'

But if I try this condition:

    WHERE modify_date = '2008-01-01'

THis index is not used. The EXPLAIN tells me it needs to do a seq
scan. Why is this? How can I make a date/time field index which uses
both equality criteria and the greater than/lesser than/between
criteria?

Thanks.

Re: Timestamp indexes (why ">" or "between" does not use index?)

От
"Phoenix Kiula"
Дата:
IN the second SQL, I meant this:

>     WHERE modify_date > '2008-01-01'


On 16/02/2008, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> I have a table with an index on a field called "modify_date".
>
> This works well if I have SQL which ends in
>
>     WHERE modify_date = '2008-01-01'
>
> But if I try this condition:
>
>     WHERE modify_date = '2008-01-01'
>
> THis index is not used. The EXPLAIN tells me it needs to do a seq
> scan. Why is this? How can I make a date/time field index which uses
> both equality criteria and the greater than/lesser than/between
> criteria?
>
> Thanks.
>

Re: Timestamp indexes (why ">" or "between" does not use index?)

От
Michael Glaesemann
Дата:
On Feb 16, 2008, at 9:42 , Phoenix Kiula wrote:

> The EXPLAIN tells me it needs to do a seq
> scan. Why is this? How can I make a date/time field index which uses
> both equality criteria and the greater than/lesser than/between
> criteria?
>


The planner will choose a seq scan if it thinks that it will be
faster than using an index: if based on its statistics it thinks a
large portion of rows will match the criteria, a seq scan may well be
faster than an index scan.

Have you analyzed recently? Showing us the EXPLAIN ANALYZE for your
query would be helpful. Have you tried running the EXPLAIN ANALYZE
with seq scans disabled?

Michael Glaesemann
grzm seespotcode net



Re: Timestamp indexes (why ">" or "between" does not use index?)

От
Tom Lane
Дата:
Michael Glaesemann <grzm@seespotcode.net> writes:
> The planner will choose a seq scan if it thinks that it will be
> faster than using an index: if based on its statistics it thinks a
> large portion of rows will match the criteria, a seq scan may well be
> faster than an index scan.

> Have you analyzed recently?

If you've never analyzed at all, the default assumptions will
discourage the planner from using an indexscan for a one-sided
inequality condition (like "x > constant").  On the other hand,
it usually will use an indexscan for a range inequality (like
"x > constant1 and x < constant2", or a BETWEEN construct).

If you do have ANALYZE stats then it all depends on what fraction of
the column's range is selected by the inequality or range condition.

            regards, tom lane

Re: Timestamp indexes (why ">" or "between" does not use index?)

От
"Phoenix Kiula"
Дата:
This table is vacuumed and analyzed every hour, so yes, it's been
analyzed recently.

These are the EXPLAIN ANALYZE outputs for both the equality condition
and the greater than condition:


orguser=# explain analyze select alias from clientswhere modify_date =
'2008-01-01' ;
                                                        QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
 Index Scan using new_idx_modify_date on clients (cost=0.00..30.23
rows=8 width=10) (actual time=0.136..0.136 rows=0 loops=1)
  Index Cond: (modify_date = '2008-01-01 00:00:00'::timestamp without
time zone)
 Total runtime: 0.220 ms
(3 rows)

Time: 2.832 ms


orguser=# explain analyze select alias from clientswhere modify_date >
'2008-01-01' ;
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on clients (cost=0.00..237043.09 rows=136617 width=10)
(actual time=0.391..4007.188 rows=148225 loops=1)
  Filter: (modify_date > '2008-01-01 00:00:00'::timestamp without time zone)
 Total runtime: 4539.242 ms
(3 rows)

Time: 4539.850 ms



Welcome any thoughts. Thanks!

Re: Timestamp indexes (why ">" or "between" does not use index?)

От
Tom Lane
Дата:
"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:
> orguser=# explain analyze select alias from clientswhere modify_date >
> '2008-01-01' ;
>                                                    QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------
>  Seq Scan on clients (cost=0.00..237043.09 rows=136617 width=10)
> (actual time=0.391..4007.188 rows=148225 loops=1)
>   Filter: (modify_date > '2008-01-01 00:00:00'::timestamp without time zone)
>  Total runtime: 4539.242 ms
> (3 rows)

Given the large number of rows being retrieved, I'm not sure that the
seqscan choice is wrong.  You could force the issue by setting
enable_seqscan = off; see what EXPLAIN ANALYZE gives you then.

If it does come out significantly faster, this may mean that you need
to dial down random_page_cost to make the planner's cost estimates
for indexscans be closer to reality on your machine.  Be wary however
of changing that parameter on the basis of only one test case.

            regards, tom lane