Thread: plan question - query with order by and limit not choosing index depends on size of limit, table


plan question - query with order by and limit not choosing index depends on size of limit, table

From: Mike Broers
Date:
Hello performance, I need help explaining the performance of a particular query:

select * from messages where ((messages.topic = E'/x') AND (messages.processed = 'f'))  ORDER BY messages.created_at ASC limit 10;


Table Structure:
 
  Column   |            Type             |                             Modifiers                              
------------+-----------------------------+--------------------------------------------------------------------
 id         | integer                     | not null default nextval('landing_page.messages_id_seq'::regclass)
 processed  | boolean                     | 
 topic      | character varying(255)      | 
 body       | text                        | 
 created_at | timestamp without time zone | 
 updated_at | timestamp without time zone | 
Indexes:
    "messages_pkey" PRIMARY KEY, btree (id)
    "idx_landing_page_messages_created_at" btree (created_at)
    "idx_messages_topic_processed" btree (topic, processed)


Table row count ~ 1million

When I run the query with limit 10 it skips the idx_messages_topic_processed index.
When I run the query with no limit, or with a limit above 20, it uses the desired index.
On a different system with a much smaller data set (~200,000 rows) I have to use a limit of about 35 before it uses the desired index.

This is the good plan, with no limit or with a limit at the 'sweet spot':

 Limit  (cost=2050.29..2050.38 rows=35 width=1266)
   ->  Sort  (cost=2050.29..2052.13 rows=737 width=1266)
         Sort Key: created_at
         ->  Bitmap Heap Scan on messages  (cost=25.86..2027.70 rows=737 width=1266)
               Recheck Cond: ((topic)::text = 'x'::text)
               Filter: (NOT processed)
               ->  Bitmap Index Scan on idx_messages_topic_processed  (cost=0.00..25.68 rows=737 width=0)
                     Index Cond: (((topic)::text = '/x'::text) AND (processed = false))

This is the bad plan, with limit 10:
 Limit  (cost=0.00..1844.07 rows=30 width=1266)
   ->  Index Scan using idx_landing_page_messages_created_at on messages  (cost=0.00..45302.70 rows=737 width=1266)
         Filter: ((NOT processed) AND ((topic)::text = 'x'::text))


Not sure if the cost settings have anything to do with it, but this is set in postgresql.conf.  I am hesitant to change it, since I inherited the database from a previous DBA and don't want to adversely affect whatever prompted the non-default setting.

#seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 3.0 # same scale as above



Why does the smaller limit cause it to skip the index?
Is there a way to help the planner choose the better plan?

Much appreciated, 
Mike





Re: plan question - query with order by and limit not choosing index depends on size of limit, table

From: "Kevin Grittner"
Date:
Mike Broers <mbroers@gmail.com> wrote:

> Hello performance, I need help explaining the performance of a
> particular query

You provided some of the information needed, but you should review
this page and post a bit more:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

In particular, post the result of EXPLAIN ANALYZE, not just EXPLAIN.
Also, showing all overrides in your postgresql.conf file is
important, and some information about your hardware.  How big is the
active portion of your database (the frequently read portion)?

> Why does the smaller limit cause it to skip the index?

Because the optimizer thinks the query will return rows sooner that
way.

> Is there a way to help the planner choose the better plan?

You might get there by adjusting your memory settings and/or costing
settings, but we need to see more information to know that.

-Kevin
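As an illustration of the kind of costing experiment Kevin is describing, settings like random_page_cost can be overridden for a single session before re-running EXPLAIN ANALYZE, so nothing in postgresql.conf has to change while investigating (the value below is only an example, not a recommendation):

```sql
-- Lower the random-I/O cost estimate for this session only;
-- other sessions keep the value from postgresql.conf.
SET random_page_cost = 2.0;

-- Re-check which plan the planner now prefers.
EXPLAIN ANALYZE
SELECT * FROM landing_page.messages
WHERE messages.topic = E'x' AND messages.processed = 'f'
ORDER BY messages.created_at ASC
LIMIT 10;

-- Revert to the configured value.
RESET random_page_cost;
```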

Thanks for the assistance.  

Here is an explain analyze of the query with the problem limit:

production=# explain analyze select * from landing_page.messages where ((messages.topic = E'x') AND (messages.processed = 'f'))  ORDER BY messages.created_at ASC limit 10;

                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2891.06 rows=10 width=1340) (actual time=207922.586..207922.586 rows=0 loops=1)
   ->  Index Scan using idx_landing_page_messages_created_at on messages  (cost=0.00..449560.48 rows=1555 width=1340) (actual time=207922.581..207922.581 rows=0 loops=1)
         Filter: ((NOT processed) AND ((topic)::text = 'x'::text))
 Total runtime: 207949.413 ms
(4 rows)


and an explain analyze with a higher limit that hits the index:


production=# explain analyze select * from landing_page.messages where ((messages.topic = E'x') AND (messages.processed = 'f'))  ORDER BY messages.created_at ASC limit 25;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Limit  (cost=5885.47..5885.54 rows=25 width=1340) (actual time=80.931..80.931 rows=0 loops=1)
   ->  Sort  (cost=5885.47..5889.36 rows=1555 width=1340) (actual time=80.926..80.926 rows=0 loops=1)
         Sort Key: created_at
         Sort Method:  quicksort  Memory: 17kB
         ->  Bitmap Heap Scan on messages  (cost=60.45..5841.59 rows=1555 width=1340) (actual time=64.404..64.404 rows=0 loops=1)
               Recheck Cond: ((topic)::text = 'x'::text)
               Filter: (NOT processed)
               ->  Bitmap Index Scan on idx_messages_topic_processed  (cost=0.00..60.06 rows=1550 width=0) (actual time=56.207..56.207 rows=0 loops=1)
                     Index Cond: (((topic)::text = 'x'::text) AND (processed = false))
 Total runtime: 88.051 ms
(10 rows)


overrides in postgresql.conf

shared_buffers = 256MB
work_mem = 8MB
max_fsm_pages = 2000000
max_fsm_relations = 2000
checkpoint_segments = 10
archive_mode = on
random_page_cost = 3.0
effective_cache_size = 6GB
default_statistics_target = 250
logging_collector = on


Forgot to mention this is Postgres 8.3.8 with 6GB memory on the server.

When you ask how big the active portion of the database is, I am not sure how to answer.  The whole database server is about 140GB, but other applications use this database as well; this particular table is about 1.6GB and growing.  Currently there are jobs that query this table every minute.

Thanks again
Mike
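One way to put numbers on the sizing question is to measure the table directly; pg_relation_size, pg_total_relation_size, and pg_size_pretty should be available in 8.3 (a sketch, assuming the schema-qualified name from the earlier queries):

```sql
-- Heap plus all indexes and TOAST data for the table.
SELECT pg_size_pretty(pg_total_relation_size('landing_page.messages'));

-- Heap only, for comparison.
SELECT pg_size_pretty(pg_relation_size('landing_page.messages'));
```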







Re: plan question - query with order by and limit not choosing index depends on size of limit, table

From: pasman pasmański
Date:
Try
order by created_at+0


--
Sent from my mobile device

------------
pasman

Thanks for the suggestion, 

created_at is a timestamp without time zone column.  When I add +0 to created_at I get a cast error.  I am still able to get the query to use the desired index by increasing or removing the limit, and I am still looking for the reason why that happens.  If there is any more information I can supply, please let me know.


ERROR:  operator does not exist: timestamp without time zone + integer
LINE 1: ...es.processed = 'f'))  ORDER BY messages.created_at+0 ASC lim...
                                                             ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
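For a timestamp column, the same trick pasman presumably intended can be written with a no-op interval instead of an integer, which type-checks (timestamp + interval yields timestamp) and still hides the bare column from the planner, so the created_at index can no longer be used to satisfy the ORDER BY:

```sql
SELECT *
FROM landing_page.messages
WHERE messages.topic = E'x'
  AND messages.processed = 'f'
-- The +interval '0' changes nothing about the ordering, but the
-- expression no longer matches idx_landing_page_messages_created_at,
-- so the planner cannot pick that index for the sort.
ORDER BY messages.created_at + interval '0' ASC
LIMIT 10;
```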






Thanks Robert, this is what I was looking for.  I will try these suggestions and follow up if any of them are the silver bullet.

On Fri, Jan 14, 2011 at 7:11 AM, Robert Haas wrote:
On Thu, Jan 6, 2011 at 4:36 PM, Mike Broers <mbroers@gmail.com> wrote:
> Thanks for the assistance.
> Here is an explain analyze of the query with the problem limit:
> production=# explain analyze select * from landing_page.messages where
> ((messages.topic = E'x') AND (messages.processed = 'f'))  ORDER BY
> messages.created_at ASC limit 10;
>
>  Limit  (cost=0.00..2891.06 rows=10 width=1340) (actual time=207922.586..207922.586 rows=0 loops=1)
>    ->  Index Scan using idx_landing_page_messages_created_at on messages  (cost=0.00..449560.48 rows=1555 width=1340) (actual time=207922.581..207922.581 rows=0 loops=1)
>          Filter: ((NOT processed) AND ((topic)::text = 'x'::text))
>  Total runtime: 207949.413 ms
> (4 rows)

You're not the first person to have been bitten by this.  The
optimizer thinks that rows WHERE NOT processed and topic = 'x' are
reasonably common, so it figures that it can just index scan until it
finds 10 of them.  But when it turns out that there are none at all,
it ends up having to scan the entire index, which stinks big-time.

The alternative plan is to use a different index to find ALL the
relevant rows, sort them, and then take the top 10.   That would suck
if there actually were tons of rows like this, but there aren't.

So the root of the problem, in some sense, is that the planner's
estimate of the selectivity of "NOT processed and topic = 'x'" is not
very good.  Some things to try:

- increase the statistics target for the "processed" and "topic"
columns even higher
- put the processed rows in one table and the not processed rows in
another table
- do something like SELECT * FROM (SELECT .. LIMIT 200 OFFSET 0) LIMIT
10 to try to fool the planner into planning based on the higher, inner
limit
- create a partial index on messages (topic) WHERE NOT processed and
see if the planner will use it

...Robert
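Robert's last two suggestions might be sketched like this (the index name is made up, and whether the planner actually chooses either path depends on the statistics):

```sql
-- Partial index on topic covering only the unprocessed rows,
-- as suggested; it stays tiny because processed rows are excluded.
CREATE INDEX idx_messages_unprocessed_topic
    ON landing_page.messages (topic)
    WHERE NOT processed;

-- The OFFSET 0 subquery acts as an optimization fence: the inner
-- query is planned around LIMIT 200, and the outer LIMIT 10 is
-- applied to its result afterward.
SELECT *
FROM (SELECT *
      FROM landing_page.messages
      WHERE topic = E'x' AND NOT processed
      ORDER BY created_at ASC
      LIMIT 200 OFFSET 0) AS candidates
LIMIT 10;
```

A possible variation on the partial index is to include created_at as a second column, so the same index could also satisfy the ORDER BY, though that goes beyond what was suggested above.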

On Fri, Jan 14, 2011 at 11:36 AM, Mike Broers <mbroers@gmail.com> wrote:
> Thanks Robert, this is what I was looking for.  I will try these suggestions
> and follow up if any of them are the silver bullet.

No problem - and sorry for the off-list reply.  I was a little sleepy
when I wrote that; thanks for getting it back on-list.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company