performance problem with LIMIT (order BY in DESC order). Wrong index used?

Поиск
Список
Период
Сортировка
От Dieter Rehbein
Тема performance problem with LIMIT (order BY in DESC order). Wrong index used?
Дата
Msg-id 152FFD9B-A288-4329-9C96-0FF0C6A2D418@skiline.cc
обсуждение исходный текст
Ответы Re: performance problem with LIMIT (order BY in DESC order). Wrong index used?
Список pgsql-performance
Hi everybody,

I have a performance-problem with a query using a LIMIT. There are other threads rergading performance issues with
LIMIT,but I didn't find useful hints for our problem and it might 
be interesting for other postgres-users.


There are only 2 simple tables:

CREATE TABLE newsfeed
(
id             varchar(32) PRIMARY KEY,
version        int4        NOT NULL,
newsfeed_type  varchar(20) NOT NULL,
new_item_count int4        NOT NULL
);
CREATE  INDEX IDX_NEWSFEED_TYPE ON newsfeed (newsfeed_type);


CREATE TABLE newsfeed_item
(
id            varchar(32)   PRIMARY NOT NULL,
item_type     varchar(35)   NOT NULL,
version       int4          NOT NULL,
category      varchar(25)   NULL,
data1         bytea         NULL,
data2         bytea         NULL,
date_time     timestamp     NOT NULL,
guid1         varchar(32)   NULL,
guid2         varchar(32)   NULL,
guid3         varchar(32)   NULL,
id1           int8          NULL,
id2           int8          NULL,
long_value1   int8          NULL,
long_value2   int8          NULL,
long_value3   int8          NULL,
string_value1 varchar(4000) NULL,
string_value2 varchar(500)  NULL,
string_value3 varchar(500)  NULL,
string_value4 varchar(500)  NULL,
string_value5 varchar(500)  NULL,
string_value6 varchar(500)  NULL,
newsfeed      varchar(32)   NOT NULL
);
CREATE UNIQUE INDEX newsfeed_item_pkey ON newsfeed_item (id);
CREATE  INDEX idx_nfi_guid1 ON newsfeed_item (guid1);
CREATE  INDEX idx_nfi_guid2 ON newsfeed_item (guid2);
CREATE  INDEX idx_nfi_guid3 ON newsfeed_item (guid3);
CREATE  INDEX idx_nfi_id1 ON newsfeed_item (id1);
CREATE  INDEX idx_nfi_id2 ON newsfeed_item (id2);
CREATE  INDEX idx_nfi_newsfeed ON newsfeed_item (newsfeed);
CREATE  INDEX idx_nfi_type ON newsfeed_item (item_type);
CREATE  INDEX idx_nfi_datetime ON newsfeed_item (date_time);

newsfeed contains 457036 rows
newsweed_item contains 5169727 rows

postgres version: 9.0.2
OS: CentOS release 5.5 (Final)


The following query took 4.2 seconds:

-------------------------
select *
from newsfeed_item
where newsfeed in
          (
          '173ee4dcec0d11de9f4f12313c0018c1','10dabde0f70211df816612313b02054e',

'17841c9af70211df874b12313b02054e','1783fce2f70211df814412313b02054e','1783fdd2f70211df8c1d12313b02054e','178405a2f70211df829212313b02054e',

'178440c6f70211df97c812313b02054e','178416e6f70211dfac3412313b02054e','1783e4aaf70211df9acd12313b02054e','178437e8f70211df8b8512313b02054e',
          '1783f54ef70211df81e012313b02054e','178415c4f70211df8f8112313b02054e'
          )
order by date_time desc

limit 25
-------------------------

If the LIMIT was removed, the query took 60 milliseconds!  If the sorting order was changed to ASC, the query took
44ms,even with the LIMIT. 

Then I tried to create the index on date_time in DESC order (because the result is sorted in descending order), but
thatdid not change anything.   


Then I removed the index on date_time with the following results:

query with the limit:     40 ms
query without the limit:  60 ms

=> the optimizer seems to use a wrong index (I did perform an ANALYZE on newsfeed_item and a REINDEX before I did the
test).Since I currently don't need  
the index on date_time (but will need it in the near future), I removed the index on date_time, which is ok for now.

------------------------

here are the explain analyze results:

1) the query in descending order with the limit and index on date_time (the slow one):

Limit  (cost=0.00..980.09 rows=25 width=963) (actual time=48.592..4060.779 rows=25 loops=1)
  ->  Index Scan Backward using "IDX_NFI_DATETIME" on newsfeed_item  (cost=0.00..409365.16 rows=10442 width=963)
(actualtime=48.581..4060.542 rows=25 loops=1) 
        Filter: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
Total runtime: 4060.959 ms


2) the query in descending order without the limit (which is much faster):

Sort  (cost=39575.23..39601.33 rows=10442 width=963) (actual time=15.014..17.038 rows=477 loops=1)
  Sort Key: date_time
  Sort Method:  quicksort  Memory: 287kB
  ->  Bitmap Heap Scan on newsfeed_item  (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.644..12.601
rows=477loops=1) 
        Recheck Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
        ->  Bitmap Index Scan on idx_nfi_newsfeed  (cost=0.00..418.80 rows=10442 width=0) (actual time=0.555..0.555
rows=477loops=1) 
              Index Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
Total runtime: 19.065 ms

3) the query in ascending order with the limit (which is fast):

Limit  (cost=0.00..980.09 rows=25 width=963) (actual time=0.261..3.704 rows=25 loops=1)
  ->  Index Scan using "IDX_NFI_DATETIME" on newsfeed_item  (cost=0.00..409365.16 rows=10442 width=963) (actual
time=0.250..3.495rows=25 loops=1) 
        Filter: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
Total runtime: 3.854 ms


4) The query after removing the index on date_time, in descending order with the LIMIT (which is fast as well).

Limit  (cost=34745.39..34745.45 rows=25 width=963) (actual time=12.855..13.143 rows=25 loops=1)
  ->  Sort  (cost=34745.39..34771.49 rows=10442 width=963) (actual time=12.846..12.946 rows=25 loops=1)
        Sort Key: date_time
        Sort Method:  top-N heapsort  Memory: 40kB
        ->  Bitmap Heap Scan on newsfeed_item  (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.622..9.936
rows=477loops=1) 
              Recheck Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
              ->  Bitmap Index Scan on idx_nfi_newsfeed  (cost=0.00..418.80 rows=10442 width=0) (actual
time=0.543..0.543rows=477 loops=1) 
                    Index Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))

Total runtime: 13.318 ms

Is there anything I can do to add the index on date_time without the performance problem?

regards
Dieter


В списке pgsql-performance по дате отправления:

Предыдущее
От: Jesper Krogh
Дата:
Сообщение: Re: Linux: more cores = less concurrency.
Следующее
От: Scott Marlowe
Дата:
Сообщение: Re: Linux: more cores = less concurrency.