Re: Index Onlys Scan for expressions

Поиск
Список
Период
Сортировка
От Ildar Musin
Тема Re: Index Onlys Scan for expressions
Дата
Msg-id d12e0c0e-1497-5152-d316-c6e914efdd21@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Index Onlys Scan for expressions  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
Ответы Re: Index Onlys Scan for expressions  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
Список pgsql-hackers
Hi Vladimir,

On 23.08.2016 23:35, Vladimir Sitnikov wrote:
> Hi,
>
> I've tried your indexonlypatch5.patch against REL9_6_BETA3.
> Here are some results.
>
> TL;DR:
> 1) <<where type=42 and upper(vc) like '%ABC%'>> does not support
> index-only scan for index (type, upper(vc) varchar_pattern_ops).
> 3) <<(... where type=42 offset 0) where upper_vc like '%ABC%'>> does
> trigger index-only scan. IOS reduces number of buffers from 977 to 17
> and that is impressive.
>
> Can IOS be used for simple query like #1 as well?
>

Thanks for checking out the patch. Sorry for the delayed reply.

> Here are the details.
>
> drop table vlsi;
> create table vlsi(type numeric, vc varchar(500));
> insert into vlsi(type,vc) select round(x/1000),
> md5('||x)||md5('||x+1)||md5(''||x+2) from generate_series(1,1000000) as
> s(x);
> create index type_vc__vlsi on vlsi(type, upper(vc) varchar_pattern_ops);
> vacuum analyze vlsi;
>
> 0) Smoke test (index only scan works when selecting indexed expression):
>
> explain (analyze, buffers) select type, upper(vc) from vlsi where type=42;
>
>  Index Only Scan using type_vc__vlsi on vlsi  (cost=0.55..67.97 rows=971
> width=36) (actual time=0.012..0.212 rows=1000 loops=1)
>    Index Cond: (type = '42'::numeric)
>    Heap Fetches: 0
>    Buffers: shared hit=17
>  Planning time: 0.112 ms
>  Execution time: 0.272 ms
>
> 1) When trying to apply "like condition", index only scan does not work.
> Note: "buffers hit" becomes 977 instead of 17.
>
> explain (analyze, buffers) select type, upper(vc) from vlsi where
> type=42 and upper(vc) like '%ABC%';
>
>  Index Scan using type_vc__vlsi on vlsi  (cost=0.55..1715.13 rows=20
> width=36) (actual time=0.069..1.343 rows=23 loops=1)
>    Index Cond: (type = '42'::numeric)
>    Filter: (upper((vc)::text) ~~ '%ABC%'::text)
>    Rows Removed by Filter: 977
>    Buffers: shared hit=939
>  Planning time: 0.104 ms
>  Execution time: 1.358 ms
>

The reason why this doesn't work is that '~~' operator (which is a 
synonym for 'like') isn't supported by operator class for btree. Since 
the only operators supported by btree are <, <=, =, >=, >, you can use 
it with queries like:

explain (analyze, buffers) select type, upper(vc) from vlsi where 
type=42 and upper(vc) ~~ 'ABC%';                                                         QUERY PLAN 


-----------------------------------------------------------------------------------------------------------------------------
IndexOnly Scan using type_vc__vlsi on vlsi  (cost=0.55..4.58 rows=1 
 
width=36) (actual time=0.021..0.021 rows=0 loops=1)   Index Cond: ((type = '42'::numeric) AND ((upper((vc)::text)) ~>=~

'ABC'::text) AND ((upper((vc)::text)) ~<~ 'ABD'::text))   Filter: ((upper((vc)::text)) ~~ 'ABC%'::text)   Heap Fetches:
0  Buffers: shared hit=4 Planning time: 0.214 ms Execution time: 0.044 ms
 
(7 rows)

In case of fixed prefix postgres implicitly substitutes '~~' operator 
with two range operators:

((upper((vc)::text)) ~>=~ 'ABC'::text) AND ((upper((vc)::text)) ~<~ 
'ABD'::text)

so that you can use these conditions to lookup in btree.

> Mere "subquery" does not help: still no index-only scan
>
> 2) explain (analyze, buffers) select * from (select type, upper(vc)
> upper_vc from vlsi where type=42) as x where upper_vc like '%ABC%';
>
>  Index Scan using type_vc__vlsi on vlsi  (cost=0.55..1715.13 rows=20
> width=36) (actual time=0.068..1.344 rows=23 loops=1)
>    Index Cond: (type = '42'::numeric)
>    Filter: (upper((vc)::text) ~~ '%ABC%'::text)
>    Rows Removed by Filter: 977
>    Buffers: shared hit=939
>  Planning time: 0.114 ms
>  Execution time: 1.357 ms
>
> 3) "offset 0" trick does help:
> explain (analyze, buffers) select * from (select type, upper(vc)
> upper_vc from vlsi where type=42 offset 0) as x where upper_vc like '%ABC%';
>
>  Subquery Scan on x  (cost=0.55..80.11 rows=39 width=36) (actual
> time=0.033..0.488 rows=23 loops=1)
>    Filter: (x.upper_vc ~~ '%ABC%'::text)
>    Rows Removed by Filter: 977
>    Buffers: shared hit=17
>    ->  Index Only Scan using type_vc__vlsi on vlsi  (cost=0.55..67.97
> rows=971 width=36) (actual time=0.015..0.210 rows=1000 loops=1)
>          Index Cond: (type = '42'::numeric)
>          Heap Fetches: 0
>          Buffers: shared hit=17
>  Planning time: 0.086 ms
>  Execution time: 0.503 ms
>
> Vladimir

I debugged the last two queries to figure out the difference between 
them. It turned out that that the query 2) transforms to the same as 
query 1). And in 3rd query 'OFFSET' statement prevents rewriter from 
transforming the query, so it is possible to use index only scan on 
subquery and then filter the result of subquery with '~~' operator.

-- 
Ildar Musin
i.musin@postgrespro.ru



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
Следующее
От: Fabien COELHO
Дата:
Сообщение: Re: pgbench - allow to store select results into variables