PATCH: index-only scans with partial indexes

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема PATCH: index-only scans with partial indexes
Дата
Msg-id 55A00F17.1020608@2ndquadrant.com
обсуждение исходный текст
Ответы Re: PATCH: index-only scans with partial indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: PATCH: index-only scans with partial indexes  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-hackers
Hi,

currently partial indexes end up not using index only scans in most
cases, because check_index_only() is overly conservative, as explained
in this comment:

  * XXX this is overly conservative for partial indexes, since we will
  * consider attributes involved in the index predicate as required even
  * though the predicate won't need to be checked at runtime. (The same
  * is true for attributes used only in index quals, if we are certain
  * that the index is not lossy.)  However, it would be quite expensive
  * to determine that accurately at this point, so for now we take the
  * easy way out.

In other words, unless you include columns from the index predicate to
the index, the planner will decide index only scans are not possible.
Which is a bit unfortunate, because those columns are not needed at
runtime, and will only increase the index size (and the main benefit of
partial indexes is size reduction).

The attached patch fixes this by only considering clauses that are not
implied by the index predicate. The effect is simple:

     create table t as select i as a, i as b from
                       generate_series(1,10000000) s(i);

     create index tidx_partial on t(b) where a > 1000 and a < 2000;

     vacuum freeze t;
     analyze t;

explain analyze select count(b) from t where a > 1000 and a < 2000;

                                 QUERY PLAN
-----------------------------------------------------------------------
  Aggregate  (cost=39.44..39.45 rows=1 width=4)
             (actual time=8.350..8.354 rows=1 loops=1)
    ->  Index Scan using tidx_partial on t
             (cost=0.28..37.98 rows=585 width=4)
             (actual time=0.034..4.368 rows=999 loops=1)
  Planning time: 0.197 ms
  Execution time: 8.441 ms
(4 rows)

explain analyze select count(b) from t where a > 1000 and a < 2000;

                                 QUERY PLAN
-----------------------------------------------------------------------
  Aggregate  (cost=33.44..33.45 rows=1 width=4)
             (actual time=8.019..8.023 rows=1 loops=1)
    ->  Index Only Scan using tidx_partial on t
             (cost=0.28..31.98 rows=585 width=4)
             (actual time=0.036..4.165 rows=999 loops=1)
          Heap Fetches: 0
  Planning time: 0.188 ms
  Execution time: 8.106 ms
(5 rows)


I've done a bunch of tests, and I do see small (hardly noticeable)
increase in planning time with long list of WHERE clauses, because all
those need to be checked against the index predicate. Not sure if this
is what's meant by 'quite expensive' in the comment. Moreover, this was
more than compensated by the IOS benefits (even with everything in RAM).

But maybe it's possible to fix that somehow? For example, we're
certainly doing those checks elsewhere when deciding which clauses need
to be evaluated at run-time, so maybe we could cache that somehow?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Mike Blackwell
Дата:
Сообщение: Re: pg_upgrade + Ubuntu
Следующее
От: Smitha Pamujula
Дата:
Сообщение: Re: pg_upgrade + Extensions