Re: Inefficient plan selected by PostgreSQL 9.0.7

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема Re: Inefficient plan selected by PostgreSQL 9.0.7
Дата
Msg-id CAK-MWwSBpLpELxsNwUA78RSD2mkQFTdnrn1Q07AQ5j6+0n4NRg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Inefficient plan selected by PostgreSQL 9.0.7  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general


On Wed, May 2, 2012 at 2:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> I got very inefficient plan for a simple query.

It looks like the problem is with the estimate of the antijoin size:

>          ->  Nested Loop Anti Join  (cost=0.00..24576.82 rows=1 width=206)
> (actual time=0.043..436.386 rows=20761 loops=1)

that is, only about 20% of the rows in sb_messages are eliminated by the
NOT EXISTS condition, but the planner thinks that nearly all of them
will be (and that causes it to not think that the LIMIT is going to
affect anything, so it doesn't prefer a fast-start plan).

Since you've not told us anything about the statistics of these tables,
it's hard to speculate as to why the estimate is off.

                       regards, tom lane

Most interesting part that NOT EXISTS estimates way off, when LEFT JOIN WHERE ... IS NULL esimated correctly:

good esitmate (estimated rows=20504  vs real rows=20760):
Game2=# EXPLAIN ANALYZE
SELECT
*
FROM sb_messages messages_tbl
LEFT JOIN users users_tbl ON users_tbl.id = messages_tbl.from_user
WHERE
    messages_tbl.type IN (0, 9) AND
    messages_tbl.visibility_status = 0 AND
    messages_tbl.not_show_on_air = 'f' AND
    messages_tbl.clan_id IS NULL AND
    users_tbl.blocked IS DISTINCT FROM 't';
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..24577.74 rows=20504 width=1037) (actual time=0.045..532.012 rows=20760 loops=1)
   Filter: (users_tbl.blocked IS DISTINCT FROM true)
   ->  Index Scan using sb_messages_special3_key on sb_messages messages_tbl  (cost=0.00..3793.75 rows=35784 width=208) (actual time=0.019..67.746 rows=24937 loops=1)
   ->  Index Scan using sb_users_pkey on users users_tbl  (cost=0.00..0.53 rows=1 width=829) (actual time=0.007..0.009 rows=1 loops=24937)
         Index Cond: (users_tbl.id = messages_tbl.from_user)
 Total runtime: 563.944 ms


bad estimate (estimated 1 vs real rows=20760):
Game2=# EXPLAIN (ANALYZE, COSTS) SELECT * FROM sb_messages messages_tbl WHERE
   (messages_tbl.type IN (0, 9) AND messages_tbl.visibility_status=0 AND messages_tbl.not_show_on_air='f' AND messages_tbl.clan_id IS NULL)
   AND  NOT EXISTS (SELECT 1 FROM users users_tbl WHERE blocked='t' and users_tbl.id = messages_tbl.from_user);
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Anti Join  (cost=0.00..24488.28 rows=1 width=208) (actual time=0.044..430.645 rows=20760 loops=1)
   ->  Index Scan using sb_messages_special3_key on sb_messages messages_tbl  (cost=0.00..3793.75 rows=35784 width=208) (actual time=0.020..67.810 rows=24937 loops=1)
   ->  Index Scan using sb_users_pkey on users users_tbl  (cost=0.00..0.53 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=24937)
         Index Cond: (users_tbl.id = messages_tbl.from_user)
         Filter: users_tbl.blocked
 Total runtime: 461.296 ms


What is curious that not exists always perform 20% faster (I performed both explains like 10 times each and each time not exits is close to 20% faster).


--
Maxim Boguk
Senior Postgresql DBA.

В списке pgsql-general по дате отправления:

Предыдущее
От: Evan Martin
Дата:
Сообщение: Re: SQL functions not being inlined
Следующее
От: Chitra Creta
Дата:
Сообщение: Re: PostgreSQL 8.3 data corruption