intarray planning/exeuction problem with multiple operators
От | Jeff Janes |
---|---|
Тема | intarray planning/exeuction problem with multiple operators |
Дата | |
Msg-id | CAMkU=1yx952u=tc8o1Ed1LZgLceopJ3MR5n0OAP84tPWn_jNtg@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: intarray planning/exeuction problem with multiple operators
(Uriy Zhuravlev <u.zhuravlev@postgrespro.ru>)
Re: intarray planning/exeuction problem with multiple operators (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
I've found an interesting performance problem in the intarray extension module. It doesn't seem to be version dependent, I've verified it in 9.4.4 and 9.6devel.
If I do a query that involves both an = op and a @@ op ANDed together, it gives a high cost estimate, which is justified by the slow runtime. If I omit the @@ it gives a low estimate, also justified. If I trick it into thinking it cannot use the index to satisfy the @@ op, then it gives a low estimate and low runtime, applying the @@ in the filter step and only the fast = in the bitmap index scan.
Now it could use the index for the fast = operation and apply the @@ in the recheck, rather than the filter. But I guess it doesn't think of that, despite knowing that applying the @@ in index operation is slow.
So it seems to be missing a trick someplace, but I don't if it reasonable to expect it to find that trick, or how easy/hard it would be to implement.
The proposed selectivity estimate improvement patch for @@ does not fix this (and evaluating that patch was how I found this issue.)
Set up:
create table foobar as
select (
select
array_agg(floor(sqrt(random()*10000000+g.y/1000000+f.x/10000000))::int)
from generate_series(1,10) g(y)
) foo
from generate_series(1,10000000) f(x);
create index on foobar using gin(foo gin__int_ops);
analyze;
You can knock an order of magnitude off from the table size and should still be able to see the problem.
example:
explain(analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo @@ '!1'::query_int;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Seq Scan on foobar (cost=0.00..263637.00 rows=1 width=61) (actual time=9717.957..9717.957 rows=0 loops=1)
Filter: ((foo @@ '!1'::query_int) AND (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[]))
Rows Removed by Filter: 10000000
Buffers: shared hit=64 read=113573
I/O Timings: read=361.402
Planning time: 0.101 ms
Execution time: 9717.991 ms
(7 rows)
explain(analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on foobar (cost=112.01..116.02 rows=1 width=61) (actual time=0.027..0.027 rows=0 loops=1)
Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Buffers: shared hit=21
-> Bitmap Index Scan on foobar_foo_idx (cost=0.00..112.01 rows=1 width=0) (actual time=0.023..0.023 rows=0 loops=1)
Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Buffers: shared hit=21
Planning time: 0.097 ms
Execution time: 0.061 ms
If I trick it into thinking the @@ operator cannot be used in th eindex, then it gets really fast again:
explain(analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo||'{}' @@ '!1'::query_int;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on foobar (cost=112.01..116.03 rows=1 width=61) (actual time=0.082..0.082 rows=0 loops=1)
Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Filter: ((foo || '{}'::integer[]) @@ '!1'::query_int)
Buffers: shared hit=21
-> Bitmap Index Scan on foobar_foo_idx (cost=0.00..112.01 rows=1 width=0) (actual time=0.080..0.080 rows=0 loops=1)
Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Buffers: shared hit=21
Planning time: 0.139 ms
Execution time: 0.129 ms
Cheers,
Jeff
В списке pgsql-hackers по дате отправления: