Re: When are index scans used over seq scans?

Поиск
Список
Период
Сортировка
От Richard van den Berg
Тема Re: When are index scans used over seq scans?
Дата
Msg-id 42679922.3040300@trust-factory.com
обсуждение исходный текст
Ответ на Re: When are index scans used over seq scans?  (John A Meinel <john@arbash-meinel.com>)
Список pgsql-performance
Thanks a lot John for the correct search terms. :-)

The suggestion in
http://archives.postgresql.org/pgsql-performance/2005-04/msg00029.php to
add a constraint that checks (finishtime >= starttime) does not make a
difference for me. Still seq scans are used.

The width solution explained in
http://archives.postgresql.org/pgsql-performance/2005-04/msg00027.php
and
http://archives.postgresql.org/pgsql-performance/2005-04/msg00116.php
does make a huge difference when selecting 1 timestamp using a BETWEEN
(2ms vs 2sec), but as soon as I put 2 timestamps in a table and try a
join, everything goes south (7.7sec). I have 10k timestamps in the
duration table. :-(

I'm getting more confused on how the planner decides to use indexes. For
example, if I try:

explain analyze select us.oid from sessions us where '2005-04-10
23:11:00' between us.starttimetrunc and us.finishtimetrunc;

     QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using sessions_st_ft_idx2 on sessions us
(cost=0.00..18320.73 rows=4765 width=4) (actual time=0.063..2.455
rows=279 loops=1)
   Index Cond: (('2005-04-10 23:11:00'::timestamp without time zone <=
finishtimetrunc) AND ('2005-04-10 23:11:00'::timestamp without time zone
>= starttimetrunc))
 Total runtime: 2.616 ms

is uses the index! However, if I change the date it does not:

explain analyze select us.oid from sessions us where '2005-04-09
23:11:00' between us.starttimetrunc and us.finishtimetrunc;

   QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on sessions us  (cost=0.00..68173.04 rows=41575 width=4)
(actual time=553.424..1981.695 rows=64 loops=1)
   Filter: (('2005-04-09 23:11:00'::timestamp without time zone >=
starttimetrunc) AND ('2005-04-09 23:11:00'::timestamp without time zone
<= finishtimetrunc))
 Total runtime: 1981.802 ms

The times in sessions go from '2005-04-04 00:00:00' to '2005-04-10
23:59:00' so both are valid times to query for, but April 10th is more
towards the end. A little experimenting shows that if I go earlier than
'2005-04-10 13:26:15' seq scans are being used. I was thinking this
timestamp would have something to do with the histogram_bounds in
pg_stats, but I cannot find a match:

 starttimetrunc         | {"2005-04-04 00:05:00","2005-04-04
11:49:00","2005-04-04 22:03:00","2005-04-05 10:54:00","2005-04-05
21:08:00","2005-04-06 10:28:00","2005-04-07 01:57:00","2005-04-07
15:55:00","2005-04-08 10:18:00","2005-04-08 17:12:00","2005-04-10 23:57:00"}
 finishtimetrunc        | {"2005-04-04 00:05:00.93","2005-04-04
11:53:00.989999","2005-04-04 22:35:00.38","2005-04-05
11:13:00.029999","2005-04-05 21:31:00.989999","2005-04-06
10:45:01","2005-04-07 02:08:08.25","2005-04-07 16:20:00.93","2005-04-08
10:25:00.409999","2005-04-08 17:15:00.949999","2005-04-11 02:08:19"}

--
Richard van den Berg, CISSP
-------------------------------------------
Trust Factory B.V. |     www.dna-portal.net
Bazarstraat 44a    |  www.trust-factory.com
2518AK The Hague   |  Phone: +31 70 3620684
The Netherlands    |  Fax  : +31 70 3603009
-------------------------------------------
   Have you visited our new DNA Portal?
-------------------------------------------

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Shoaib Burq (VPAC)"
Дата:
Сообщение: two queries and dual cpu (perplexed)
Следующее
От: Jeff
Дата:
Сообщение: Re: two queries and dual cpu (perplexed)