intarray planning/exeuction problem with multiple operators

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема intarray planning/exeuction problem with multiple operators
Дата
Msg-id CAMkU=1yx952u=tc8o1Ed1LZgLceopJ3MR5n0OAP84tPWn_jNtg@mail.gmail.com
обсуждение исходный текст
Ответы Re: intarray planning/exeuction problem with multiple operators  (Uriy Zhuravlev <u.zhuravlev@postgrespro.ru>)
Re: intarray planning/exeuction problem with multiple operators  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I've found an interesting performance problem in the intarray extension module.  It doesn't seem to be version dependent, I've verified it in 9.4.4 and 9.6devel.

If I do a query that involves both an = op and a @@ op ANDed together, it gives a high cost estimate, which is justified by the slow runtime.  If I omit the @@ it gives a low estimate, also justified.  If I trick it into thinking it cannot use the index to satisfy the @@ op, then it gives a low estimate and low runtime, applying the @@ in the filter step and only the fast = in the bitmap index scan.  

Now it could use the index for the fast = operation and apply the @@ in the recheck, rather than the filter.  But I guess it doesn't think of that, despite knowing that applying the @@ in index operation is slow.  

So it seems to be missing a trick someplace, but I don't if it reasonable to expect it to find that trick, or how easy/hard it would be to implement.

The proposed selectivity estimate improvement patch for @@ does not fix this (and evaluating that patch was how I found this issue.)


Set up:

create table foobar as 
select (
  select 
    array_agg(floor(sqrt(random()*10000000+g.y/1000000+f.x/10000000))::int) 
    from generate_series(1,10) g(y) 
  ) foo
  from generate_series(1,10000000) f(x);
create index on foobar using gin(foo gin__int_ops);
analyze;

You can knock an order of magnitude off from the table size and should still be able to see the problem.

example:

explain(analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo @@ '!1'::query_int;

                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Seq Scan on foobar  (cost=0.00..263637.00 rows=1 width=61) (actual time=9717.957..9717.957 rows=0 loops=1)
   Filter: ((foo @@ '!1'::query_int) AND (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[]))
   Rows Removed by Filter: 10000000
   Buffers: shared hit=64 read=113573
   I/O Timings: read=361.402
 Planning time: 0.101 ms
 Execution time: 9717.991 ms
(7 rows)


explain(analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' ;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foobar  (cost=112.01..116.02 rows=1 width=61) (actual time=0.027..0.027 rows=0 loops=1)
   Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
   Buffers: shared hit=21
   ->  Bitmap Index Scan on foobar_foo_idx  (cost=0.00..112.01 rows=1 width=0) (actual time=0.023..0.023 rows=0 loops=1)
         Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
         Buffers: shared hit=21
 Planning time: 0.097 ms
 Execution time: 0.061 ms

If I trick it into thinking the @@ operator cannot be used in th eindex, then it gets really fast again:

explain(analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo||'{}' @@ '!1'::query_int;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foobar  (cost=112.01..116.03 rows=1 width=61) (actual time=0.082..0.082 rows=0 loops=1)
   Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
   Filter: ((foo || '{}'::integer[]) @@ '!1'::query_int)
   Buffers: shared hit=21
   ->  Bitmap Index Scan on foobar_foo_idx  (cost=0.00..112.01 rows=1 width=0) (actual time=0.080..0.080 rows=0 loops=1)
         Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
         Buffers: shared hit=21
 Planning time: 0.139 ms
 Execution time: 0.129 ms

Cheers,

Jeff

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: A little RLS oversight?
Следующее
От: Haribabu Kommi
Дата:
Сообщение: Re: optimizing vacuum truncation scans