Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1

Поиск
Список
Период
Сортировка
От John Surcombe
Тема Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1
Дата
Msg-id C0B87DEF8EF4F94C9D8FBD3151B627B26C0C23@digimapserver.digimap.local
обсуждение исходный текст
Ответы Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance

Hello,

 

We are using PostgreSQL 9.0.3, compiled by Visual C++ build 1500, 32-bit, installed on Windows 2003 R2 32-bit.

 

We have an ‘aisposition’ table used for a GPS tracking application, containing ~30 million rows and a number of indexes.  Two of these are:

 

idx_receiveddatetime: indexes aisposition(receiveddatetime timestamp)

 

idx_userid_receiveddatetime: indexes aisposition(userid int4 desc, receiveddatetime timestamp desc)

 

The problem we get is that the following query is taking many minutes to run:

 

select * from aisposition where userid = 311369000 order by userid desc, receiveddatetime desc limit 1

 

When we ‘EXPLAIN’ this query, PostgreSQL says it is using the index idx_receiveddatetime.  The way the application is designed means that in virtually all cases the query will have to scan a very long way into idx_receiveddatetime to find the first record where userid = 311369000.  If however we delete the idx_receiveddatetime index, the query uses the idx_userid_receiveddatetime index, and the query only takes a few milliseconds.

 

The EXPLAIN ANALYZE output with idx_receiveddatetime in place is:

 

Limit  (cost=0.00..1.30 rows=1 width=398) (actual time=1128097.540..1128097.541 rows=1 loops=1)

  ->  Index Scan Backward using idx_receiveddatetime on aisposition  (cost=0.00..2433441.05 rows=1875926 width=398) (actual time=1128097.532..1128097.532 rows=1 loops=1)

        Filter: (userid = 311369000)

Total runtime: 1128097.609 ms

 

And with that index deleted:

 

Limit  (cost=0.00..4.01 rows=1 width=398) (actual time=60.633..60.634 rows=1 loops=1)

  ->  Index Scan using idx_userid_receiveddatetime on aisposition  (cost=0.00..7517963.47 rows=1875926 width=398) (actual time=60.629..60.629 rows=1 loops=1)

        Index Cond: (userid = 311369000)

Total runtime: 60.736 ms

 

We would obviously prefer PostgreSQL to use the idx_userid_receiveddatetime index in all cases, because we know that this will guarantee results in a timely manner, whereas using idx_receiveddatetime will usually require a scan of much of the table and our application will not work.  What are we doing wrong?

 

Cheers now,

John

В списке pgsql-performance по дате отправления:

Предыдущее
От: Marti Raudsepp
Дата:
Сообщение: Re: Tuning massive UPDATES and GROUP BY's?
Следующее
От: Jeremy
Дата:
Сообщение: Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1