Indexes on NULL's and order by ... limit N queries

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема Indexes on NULL's and order by ... limit N queries
Дата
Msg-id 4933D94E.2010407@masterhost.ru
обсуждение исходный текст
Ответы Re: Indexes on NULL's and order by ... limit N queries  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
When i read about 8.3 support indexed queries on NULL values (like rubric_id is NULL) i was really happy.

But reality strike again... look like NULL in WHERE don't allow effective using
index on (rubric_id, pos) for queries like:
... WHERE rubric_id IS NULL ORDER BY pos LIMIT 5


Here is some details about my issue (all tests on fresh loaded/analyzed into empty 8.3.5 DB):


mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id=8 order by pos limit 5;
                                                           QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=1337.02..1337.03 rows=5 width=28) (actual time=27.556..27.575 rows=5 loops=1)
    ->  Sort  (cost=1337.02..1340.77 rows=1501 width=28) (actual time=27.552..27.558 rows=5 loops=1)
          Sort Key: pos
          Sort Method:  top-N heapsort  Memory: 25kB
          ->  Seq Scan on cluster_weight  (cost=0.00..1312.09 rows=1501 width=28) (actual time=0.058..25.008 rows=1501
loops=1)
                Filter: (rubric_id = 8)
  Total runtime: 27.638 ms
(7 rows)

ok so we need index on (rubric_id, pos), lets add it:


mboguk_billing=# CREATE INDEX cluster_weight_2 on cluster_weight(rubric_id, pos);
CREATE INDEX


And try again:
mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id=8 order by pos limit 5;
                                                                  QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..1.70 rows=5 width=28) (actual time=0.095..0.122 rows=5 loops=1)
    ->  Index Scan using cluster_weight_2 on cluster_weight  (cost=0.00..509.31 rows=1501 width=28) (actual
time=0.090..0.104rows=5 loops=1) 
          Index Cond: (rubric_id = 8)
  Total runtime: 0.176 ms
(4 rows)

Ok... so now query works as intended...

Lets check are index used on search NULL values:
mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id IS NULL;
                                                                   QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
  Index Scan using cluster_weight_2 on cluster_weight  (cost=0.00..1034.21 rows=26435 width=28) (actual
time=0.053..48.123rows=26435 loops=1) 
    Index Cond: (rubric_id IS NULL)
  Total runtime: 85.210 ms
(3 rows)

Yes it is working...

Now lets try main query over NULL:

mboguk_billing=# EXPLAIN ANALYZE SELECT * from cluster_weight where cluster_weight.rubric_id IS NULL ORDER BY pos LIMIT
5;
                                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=1473.29..1473.30 rows=5 width=28) (actual time=92.220..92.239 rows=5 loops=1)
    ->  Sort  (cost=1473.29..1539.37 rows=26435 width=28) (actual time=92.216..92.223 rows=5 loops=1)
          Sort Key: pos
          Sort Method:  top-N heapsort  Memory: 25kB
          ->  Index Scan using cluster_weight_2 on cluster_weight  (cost=0.00..1034.21 rows=26435 width=28) (actual
time=0.033..47.333rows=26435 loops=1) 
                Index Cond: (rubric_id IS NULL)
  Total runtime: 92.310 ms
(7 rows)

Ooops... that is surprise...
I wasn't ready see that plan here... and performance difference over 1000.

Looks like when indexed search over NULL's added, planner wasn't learned right way use such index on  "where something
isNULL order by ... limit ..."  
queries.


Thanks for any responses and sorry for not so good English.

--
SY, Maxim Boguk

В списке pgsql-general по дате отправления:

Предыдущее
От: "Albe Laurenz"
Дата:
Сообщение: Re: configure options
Следующее
От: "Andrus"
Дата:
Сообщение: Re: db backup script in gentoo