Обсуждение: <= Index.

Поиск
Список
Период
Сортировка

<= Index.

От
"Greg Sikorski"
Дата:
How come this Index isn't being used in the situation below? :) The field
is an epoch timestamp, and for various reasons it can't be a postgres
date/time type ;)
It does use the index with a direct = comparison.

Cheers, Greg

--
cmaster=# explain analyze SELECT user_id,channel_id FROM levels WHERE
suspend_expires <= 1017550117;
NOTICE:  QUERY PLAN:

Seq Scan on levels  (cost=0.00..13297.38 rows=395082 width=8) (actual
time=0.10..6647.88 rows=355869 loops=1)
Total runtime: 7492.36 msec

cmaster=# \d levels_suspendexpires_idx
Index "levels_suspendexpires_idx"
     Column      |  Type
-----------------+---------
 suspend_expires | integer
btree

--
cmaster=# select min(suspend_expires) from levels;
   0
cmaster=# select max(suspend_expires) from levels;
 1049662527
cmaster=# select avg(suspend_expires) from levels;
 4555730.7985110746


Re: <= Index.

От
Peter Eisentraut
Дата:
Greg Sikorski writes:

> How come this Index isn't being used in the situation below?

There's no point in using an index if you're retrieving nearly the entire
table.

--
Peter Eisentraut   peter_e@gmx.net


Re: <= Index.

От
"Greg Sikorski"
Дата:
pgsql-admin-owner@postgresql.org wrote on 31/03/2002 08:06:52:

> Greg Sikorski writes:
>
> > How come this Index isn't being used in the situation below?
>
> There's no point in using an index if you're retrieving nearly the
entire
> table.
>
> --
> Peter Eisentraut   peter_e@gmx.net
>
>

Well, that was probably a poor example to include, but it was nearly 4am
;)

Typically <20 results are returned, although it still says "Seq Scan", but
on a smaller subset.

--
cmaster=# SELECT user_id,channel_id FROM levels WHERE suspend_expires <=
1017589362 AND suspend_expires <> 0;
 user_id | channel_id
---------+------------
...
(17 rows)

cmaster=# explain analyze SELECT user_id,channel_id FROM levels WHERE
suspend_expires <= 1017589362 AND suspend_expires <> 0;
NOTICE:  QUERY PLAN:

Seq Scan on levels  (cost=0.00..13709.09 rows=2609 width=8) (actual
time=208.98..1521.08 rows=17 loops=1)
Total runtime: 1521.29 msec

Cheers, Greg.

Re: <= Index.

От
Tom Lane
Дата:
"Greg Sikorski" <gte@atomicrevs.demon.co.uk> writes:
> cmaster=# explain analyze SELECT user_id,channel_id FROM levels WHERE
> suspend_expires <= 1017589362 AND suspend_expires <> 0;
> NOTICE:  QUERY PLAN:

> Seq Scan on levels  (cost=0.00..13709.09 rows=2609 width=8) (actual
> time=208.98..1521.08 rows=17 loops=1)
> Total runtime: 1521.29 msec

First question is *can* the thing use an index?  (Try "set enable_seqscan
to off" then explain again.)  If not, it's probably a datatype
compatibility issue --- you'll need to quote or explicitly cast the
constant 1017589362 to match the type of suspend_expires.

If it could use an index but chooses not to, at least part of the
problem is the factor-of-100 overestimate in the number of matching
rows.  That might be alleviated by increasing the statistics target
for the suspend_expires column.  (ALTER TABLE SET STATISTICS, then
ANALYZE or VACUUM ANALYZE.)

You might also find that reducing random_page_cost produces better
indexscan cost estimates for your environment.

            regards, tom lane

Re: <= Index.

От
"Greg Sikorski"
Дата:
pgsql-admin-owner@postgresql.org wrote on 31/03/2002 16:28:09:

>
> First question is *can* the thing use an index?  (Try "set
enable_seqscan
> to off" then explain again.)  If not, it's probably a datatype
> compatibility issue --- you'll need to quote or explicitly cast the
> constant 1017589362 to match the type of suspend_expires.
>

Yep, it does use the index in that case:

---
cmaster=# set enable_seqscan to off;
SET VARIABLE
cmaster=# explain analyze SELECT user_id,channel_id FROM levels WHERE
suspend_expires <= 1017589362 AND suspend_expires <> 0;
NOTICE:  QUERY PLAN:

Index Scan using levels_suspendexpires_idx on levels  (cost=0.00..37098.40
rows=2787 width=8) (actual time=2551.05..2551.05 rows=0 loops=1)
Total runtime: 2551.17 msec
---

However its not much faster, so I took a look at the data distribution in
that table and quite a large amount of the data is 0 most of the time.
After a quick dig around some new 7.2 features I stumbled upon partial
index support:

---
cmaster=# \d levels_suspendexpires_idx
Index "levels_suspendexpires_idx"
     Column      |  Type
-----------------+---------
 suspend_expires | integer
btree
Index predicate: (suspend_expires <> 0)
---
cmaster=# explain analyze SELECT user_id,channel_id FROM levels WHERE
suspend_expires <= 1017605805 AND suspend_expires <> 0;
NOTICE:  QUERY PLAN:

Index Scan using levels_suspendexpires_idx on levels  (cost=0.00..267.65
rows=1621 width=8) (actual time=0.06..0.11 rows=6 loops=1)
Total runtime: 0.19 msec
---

Much better ;)
Thanks for your time and advice :)

Cheers, Greg.