Обсуждение: Postgres inconsistent use of Index vs. Seq Scan
I'm having difficulty understanding what I perceive as an inconsistency in how the postgres parser chooses to use indices. We have a query based on NOT IN against an indexed column that the parser executes sequentially, but when we perform the same query as IN, it uses the index.
I've created a simplistic example that I believe demonstrates the issue, notice this first query is sequential
CREATE TABLE node
(
id SERIAL PRIMARY KEY,
vid INTEGER
);
CREATE INDEX x ON node(vid);
INSERT INTO node(vid) VALUES (1),(2);
EXPLAIN ANALYZE
SELECT *
FROM node
WHERE NOT vid IN (1);
Seq Scan on node (cost=0.00..36.75 rows=2129 width=8) (actual time=0.009..0.010 rows=1 loops=1)
Filter: (vid <> 1)
Rows Removed by Filter: 1
Total runtime: 0.025 ms
But if we invert the query to IN, you'll notice that it now decided to use the index
EXPLAIN ANALYZE
SELECT *
FROM node
WHERE vid IN (2);
Bitmap Heap Scan on node (cost=4.34..15.01 rows=11 width=8) (actual time=0.017..0.017 rows=1 loops=1)
Recheck Cond: (vid = 1)
-> Bitmap Index Scan on x (cost=0.00..4.33 rows=11 width=0) (actual time=0.012..0.012 rows=1 loops=1)
Index Cond: (vid = 1)
Total runtime: 0.039 ms
Can anyone shed any light on this? Specifically, is there a way to re-write out NOT IN to work with the index (when obviously the result set is not as simplistic as just 1 or 2).
We are using Postgres 9.2 on CentOS 6.6
"Jim Carroll" <jim@carroll.com> writes: > I'm having difficulty understanding what I perceive as an inconsistency in > how the postgres parser chooses to use indices. We have a query based on NOT > IN against an indexed column that the parser executes sequentially, but > when we perform the same query as IN, it uses the index. What you've got here is a query that asks for all rows with vid <> 1. Not-equals is not an indexable operator according to Postgres, and there would not be much point in making it one, since it generally implies having to scan the majority of the table. If, indeed, 99% of your table has vid = 1, then there would be some point in trying to use an index to find the other 1%; but you'll have to formulate the query differently (perhaps "vid > 1" would do?) or else use a properly-designed partial index. regards, tom lane