Обсуждение: regarding IN clause

Поиск
Список
Период
Сортировка

regarding IN clause

От
"Surabhi Ahuja "
Дата:

I have a table where the primary key "id" is a serial key.

now i have a query to this table where in, i have to get information related to a set of ids.
This set contains around 130 ids.

I was thinking that i can form a select statement of the form

select * from table where id in(4,8,9,12, and so on....);
i was testing this query on the table with 3000 rows only...
and when i did
explain analyze select * from table where id in(4,8,9,12, and so on....);..
it said that it was doing a sequential scan.

why is it not searching the indexes(index scan)? and how can i make such a query much faster?

Re: regarding IN clause

От
"Dann Corbit"
Дата:

Probably, the sequential scan was faster.

 

There is a big cost to jumping all over the place, loading both the index pages and the data pages.

 

Typically, it is about 10% of the data volume for a file, but I don’t know what the metric is for PostgreSQL.

 

If you want to force the index behavior, you might try finding out how many items you can put into a IN list, and then repeating it with UNION.

E.g.:

SELECT * FROM table_name WHERE id IN (2,4,6,8)

UNION

SELECT * FROM table_name WHERE id IN (1,2,3,4)

UNION

SELECT * FROM table_name WHERE id IN (9,121,13,14)

 

Etc.

 

I am guessing that the sequential scan is faster.

 

You could also try OR lists, which accomplish the same thing.

 

You might also select the ID values into a temporary table and then do a join.  I am guessing that the join query would work pretty well.

 

Give it a try and see, and then tell us what you saw.

 

 


From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Surabhi Ahuja
Sent: Tuesday, May 17, 2005 10:13 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] regarding IN clause

 

I have a table where the primary key "id" is a serial key.

now i have a query to this table where in, i have to get information related to a set of ids.
This set contains around 130 ids.

I was thinking that i can form a select statement of the form

select * from table where id in(4,8,9,12, and so on....);
i was testing this query on the table with 3000 rows only...
and when i did
explain analyze select * from table where id in(4,8,9,12, and so on....);..
it said that it was doing a sequential scan.

why is it not searching the indexes(index scan)? and how can i make such a query much faster?

Re: regarding IN clause

От
Greg Stark
Дата:
"Surabhi Ahuja " <surabhi.ahuja@iiitb.ac.in> writes:

> why is it not searching the indexes(index scan)? and how can i make such a
> query much faster?

130 values out of 3,000 is 4.3% of the table. If you have narrow records with,
say, 20 per page that would mean reading most of the pages out of the table.
Add in that it has to read in the index pages and it could be doing more i/o
than a simple sequential scan. Now consider that random access is about four
times slower than reading sequentially and you could be far behind.

set enable_indexscan = off;
explain analyze select ...
set enable_indexscan = on;

Note that it can be quite hard to get reliable timing information for such a
small table since the second time you run the query all the data will be in
RAM already.

If you find indexes are faster you might try lowering random_page_cost,
possibly as low as 1.2-2.0 but beware of using a single query to tune such
parameters. And beware tuning them on non-production conditions. When the data
grows and less of it fits in RAM you may have to raise it again.

--
greg