Re: PATCH: index-only scans with partial indexes

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: PATCH: index-only scans with partial indexes
Дата
Msg-id CAMkU=1y9JGr4=ATZJ7RoyoHWx-eK7fhjECbEmHkeXo7bD37poA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: PATCH: index-only scans with partial indexes  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Ответы Re: PATCH: index-only scans with partial indexes  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Fri, Sep 4, 2015 at 4:28 AM, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:


25.08.2015 20:19, Jeff Janes пишет:
On Fri, Jul 10, 2015 at 11:29 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,

currently partial indexes end up not using index only scans in most cases, because check_index_only() is overly conservative, as explained in this comment:

 * XXX this is overly conservative for partial indexes, since we will
 * consider attributes involved in the index predicate as required even
 * though the predicate won't need to be checked at runtime. (The same
 * is true for attributes used only in index quals, if we are certain
 * that the index is not lossy.)  However, it would be quite expensive
 * to determine that accurately at this point, so for now we take the
 * easy way out.

In other words, unless you include columns from the index predicate to the index, the planner will decide index only scans are not possible. Which is a bit unfortunate, because those columns are not needed at runtime, and will only increase the index size (and the main benefit of partial indexes is size reduction).

The attached patch fixes this by only considering clauses that are not implied by the index predicate. The effect is simple:

    create table t as select i as a, i as b from
                      generate_series(1,10000000) s(i);

    create index tidx_partial on t(b) where a > 1000 and a < 2000;

    vacuum freeze t;
    analyze t;

explain analyze select count(b) from t where a > 1000 and a < 2000;


However, "explain analyze select sum(b) from t where a > 1000 and a < 1999;" still doesn't use the index only 
scan.  Isn't that also implied by the predicate?


In this example it doesn't use IndexOnlyScan correctly. If I understand partial indexes right, if index predicate and search clause are not equal, index scan must recheck values when it's fetching them.
'tidx_partial' in example above has no information about 'a' attribute, beside the index->indpred, so it is impossible to recheck qual without referencing to table.

In example:
create index tidx_partial on t(a) where a > 1000 and a < 2000;
explain analyze select sum(a) from t where a > 1000 and a < 1999;
it can use IndexOnlyScan.

Yes, of course.  Thanks for the explanation, it is obvious now that you have explained it.  I kept slipping into thinking that the predicate-dependent variables are included in the index but only when the predicate is met, but that isn't the case.

How can we evaluate Tom's performance concerns?  I tried turning log_planner_stats on and using the regression test as a load generator, but I don't think that that is very demanding of a test.

Thanks,

Jeff

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Daniel Verite"
Дата:
Сообщение: Re: [patch] Proposal for \rotate in psql
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Freeze avoidance of very large table.