Hello:
I am running the following query:
SELECT COUNT(*) FROM orders WHERE o_orderdate < date('1995-03-15');
Here are some stats for the orders relation:
select relname, relpages, reltuples from pg_class where relname = 'orders';
"orders";29278;1.49935e+06
For my query above, the reduction factor is about 50%; 700,000 out of 1.5
millions tuples satisfy the WHERE clause.
I have a clustered index defined as follows:
CREATE INDEX orders_orderdate_idx
ON orders
USING btree
(o_orderdate);
ALTER TABLE orders CLUSTER ON orders_orderdate_idx;
I am running three ways: sequential scan, bitmap index scan and index scan.
The I/O cost for the index scan is 24+ times more than the other two. I do not
understand why this happens. If I am using a clustered index, it is my
understanding that there should be no need to retrieve a single page multiple
times, as tuples are sorted. Am I misunderstanding something?
Paula
Here are the results of explain analyze, and I/O results from
pg_statio_user_tables:
Aggregate (cost=1470905.79..1470905.80 rows=1 width=0) (actual
time=9040.320..9040.321 rows=1 loops=1)
-> Index Scan using orders_orderdate_idx on orders (cost=0.00..1469101.03
rows=721902 width=0) (actual time=0.098..8222.234 rows=727305 loops=1)
Index Cond: (o_orderdate < '1995-03-15'::date)
Total runtime: 9040.375 ms
I/O cost:
Heap Blocks Read: 649966 (from disk)
Heap Blocks Hit: 70070 (from buffer)
Index Blocks Read: 1591
__________________________________
Aggregate (cost=52033.65..52033.66 rows=1 width=0) (actual
time=2364.470..2364.471 rows=1 loops=1)
-> Bitmap Heap Scan on orders (cost=11927.12..50228.90 rows=721902 width=0)
(actual time=338.547..1609.118 rows=727305 loops=1)
Recheck Cond: (o_orderdate < '1995-03-15'::date)
-> Bitmap Index Scan on orders_orderdate_idx (cost=0.00..11746.65
rows=721902 width=0) (actual time=329.249..329.249 rows=727305 loops=1)
Index Cond: (o_orderdate < '1995-03-15'::date)
Total runtime: 2364.697 ms
I/O cost:
Heap Blocks Read: 29278
Index Blocks Read: 1591
__________________________________
Aggregate (cost=49832.76..49832.77 rows=1 width=0) (actual
time=2215.752..2215.753 rows=1 loops=1)
-> Seq Scan on orders (cost=0.00..48028.00 rows=721902 width=0) (actual
time=0.042..1458.734 rows=727305 loops=1)
Filter: (o_orderdate < '1995-03-15'::date)
Total runtime: 2215.801 ms
I/O cost:
Heap Blocks Read: 29278