Re: PATCH: index-only scans with partial indexes

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: PATCH: index-only scans with partial indexes
Дата
Msg-id 55A10575.90301@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: PATCH: index-only scans with partial indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

On 07/10/2015 10:43 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> currently partial indexes end up not using index only scans in most
>> cases, because check_index_only() is overly conservative, as explained
>> in this comment:
>> ...
>
>> I've done a bunch of tests, and I do see small (hardly noticeable)
>> increase in planning time with long list of WHERE clauses, because all
>> those need to be checked against the index predicate. Not sure if this
>> is what's meant by 'quite expensive' in the comment. Moreover, this was
>> more than compensated by the IOS benefits (even with everything in RAM).
>
>> But maybe it's possible to fix that somehow? For example, we're
>> certainly doing those checks elsewhere when deciding which clauses need
>> to be evaluated at run-time, so maybe we could cache that somehow?
>
> The key problem here is that you're doing those proofs vastly earlier
> than before, for indexes that might not get used at all in the final
> plan. If you do some tests with multiple partial indexes you will
> probably see a bigger planning-time penalty.

Hmmm. Maybe we could get a bit smarter by looking at the attnums of each 
clause before doing the expensive stuff (which is predicate_implied_by I 
believe), exploiting a few simple observations:
  * if the clause is already covered by attrs_used, we don't need to    process it at all
  * if the clause uses attributes not included in the index predicate,    we know it can't be implied

Of course, those are local optimizations, and can't fix some of the 
problems (e.g. a lot of partial indexes).

> Perhaps we should bite the bullet and do it anyway, but I'm pretty
> suspicious of any claim that the planning cost is minimal.

Perhaps - I'm not claiming the planning cost is minimal. It was in the 
tests I've done, but no doubt it's possible to construct examples where 
the planning time will get much worse. With 30 partial indexes, I got an 
increase from 0.01 ms to ~2.5ms on simple queries.

But maybe we could get at least some of the benefits by planning the 
index scans like today, and then do the IOS check later? Of course, this 
won't help with cases where the index scan is thrown away while the 
index only scan would win, but it does help with cases where we end up 
doing index scan anyway?

That's essentially what I'm struggling right now - I do have a 3TB data 
set, the plan looks like this:
                               QUERY PLAN
------------------------------------------------------------------------ Sort  (cost=1003860164.92..1003860164.92
rows=1width=16)   Sort Key: orders.o_orderpriority   ->  HashAggregate         Group Key: orders.o_orderpriority
->  Merge Semi Join               Merge Cond:               ->  Index Scan using pk_orders on orders
Filter: ((o_orderdate >= '1997-07-01'::date) AND                     (o_orderdate < '1997-10-01 00:00:00'::timestamp))
            ->  Index Scan using lineitem_l_orderkey_idx_part1 on                   lineitem
 

and the visibility checks from Index Scans are killing the I/O. An IOS 
is likely to perform much better here (but haven't ran the query yet).

regards


--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Davis
Дата:
Сообщение: Re: Memory Accounting v11
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Re: Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)