Inefficient plan selected by PostgreSQL 9.0.7

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема Inefficient plan selected by PostgreSQL 9.0.7
Дата
Msg-id CAK-MWwQawCr2RbJ-hvs9JozGt+VSnbZqDssA+zHGeyxvJpQpQQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Inefficient plan selected by PostgreSQL 9.0.7  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Hi,

I got very inefficient plan for a simple query.
PostgreSQL 9.0.7 on FreeBSD, default_statistics_target=1000

Table:
Game2=# \d sb_messages
                                            Table "public.sb_messages"
       Column        |           Type           |                            Modifiers
---------------------+--------------------------+------------------------------------------------------------------
 message_id          | integer                  | not null default nextval('sb_messages_message_id_seq'::regclass)
 date                | timestamp with time zone | default now()
 from_user           | integer                  | not null
 to_user             | integer                  | not null
 type                | smallint                 | not null default 0
 visibility_status   | smallint                 | not null default 0
 not_show_on_air     | boolean                  | default false
 clan_id             | integer                  |
 priority            | numeric                  | default 0
...
Indexes:
    "sb_messages_pkey" PRIMARY KEY, btree (message_id)
...
    "sb_messages_special3_key" btree (date DESC)                WHERE (type = ANY (ARRAY[0, 9])) AND visibility_status = 0 AND not_show_on_air = false AND clan_id IS NULL
    "sb_messages_special4_key" btree (priority DESC NULLS LAST) WHERE (type = ANY (ARRAY[0, 9])) AND visibility_status = 0 AND not_show_on_air = false AND clan_id IS NULL

Now the problem query:

Game2=# EXPLAIN (ANALYZE, COSTS) SELECT * FROM sb_messages messages_tbl WHERE
   (messages_tbl.type IN (0, 9) AND messages_tbl.visibility_status=0 AND messages_tbl.not_show_on_air='f' AND messages_tbl.clan_id IS NULL)
   AND  NOT EXISTS (SELECT 1 FROM users users_tbl WHERE blocked='t' and users_tbl.id = messages_tbl.from_user)
ORDER BY priority DESC NULLS LAST
LIMIT 10;
                                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=24576.83..24576.84 rows=1 width=206) (actual time=514.011..514.052 rows=10 loops=1)
   ->  Sort  (cost=24576.83..24576.84 rows=1 width=206) (actual time=514.008..514.021 rows=10 loops=1)
         Sort Key: messages_tbl.priority
         Sort Method:  top-N heapsort  Memory: 40kB
         ->  Nested Loop Anti Join  (cost=0.00..24576.82 rows=1 width=206) (actual time=0.043..436.386 rows=20761 loops=1)
               ->  Index Scan using sb_messages_special3_key on sb_messages messages_tbl  (cost=0.00..3804.39 rows=35924 width=206) (actual time=0.019..69.801 rows=24938 loops=1)
               ->  Index Scan using sb_users_pkey on users users_tbl  (cost=0.00..0.53 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=24938)
                     Index Cond: (users_tbl.id = messages_tbl.from_user)
                     Filter: users_tbl.blocked
 Total runtime: 514.171 ms

As can be seen, PostgreSQL prefers scan over unrelated index + sort.

Correct plan is used only after sb_messages_special3_key index were dropped:

Game2=# drop index sb_messages_special3_key;
DROP INDEX
Game2=# EXPLAIN (ANALYZE, COSTS) SELECT * FROM sb_messages messages_tbl WHERE
   (messages_tbl.type IN (0, 9) AND messages_tbl.visibility_status=0 AND messages_tbl.not_show_on_air='f' AND messages_tbl.clan_id IS NULL)
   AND  NOT EXISTS (SELECT 1 FROM users users_tbl WHERE blocked='t' and users_tbl.id = messages_tbl.from_user)
ORDER BY priority DESC NULLS LAST
LIMIT 10;
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..24860.54 rows=1 width=206) (actual time=0.047..0.273 rows=10 loops=1)
   ->  Nested Loop Anti Join  (cost=0.00..24860.54 rows=1 width=206) (actual time=0.045..0.245 rows=10 loops=1)
         ->  Index Scan using sb_messages_special4_key on sb_messages messages_tbl  (cost=0.00..4088.11 rows=35924 width=206) (actual time=0.021..0.051 rows=10 loops=1)
         ->  Index Scan using sb_users_pkey on users users_tbl  (cost=0.00..0.53 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=10)
               Index Cond: (users_tbl.id = messages_tbl.from_user)
               Filter: users_tbl.blocked
 Total runtime: 0.391 ms
(7 rows)


No manipulations with the *_cost settings had forced the database select a correct plan in presence of the sb_messages_special3_key index (any changes in *_costs have proportional effects on cost of the both plans and slow plan win always by cost).
As can be seen, an actual rows very close to the predicted amount (
   ->  Index Scan using sb_messages_special3_key on sb_messages messages_tbl  (cost=0.00..3804.39 rows=35924 width=206) (actual time=0.019..69.801 rows=24938 loops=1)
).

What I don't understand is that why planner prefer full scan over unrelated index and sort to the index scan over related index without sort?
Even in the worst possible case index scan over related index (sb_messages_special4_key) will read the exactly same amount of rows from the table as scan over sb_messages_special3_key.
And very likely scan over related index will win.

What I can do to fight that issue (I looking to keep both indexes on that table for fast queries with different ordering).

--
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
"If they can send one man to the moon... why can't they send them all?"

МойКруг: http://mboguk.moikrug.ru/
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."


В списке pgsql-general по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: installation of plpython2.7
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Inefficient plan selected by PostgreSQL 9.0.7