Re: Fixing GIN for empty/null/full-scan cases

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Fixing GIN for empty/null/full-scan cases
Дата
Msg-id 3234.1295387895@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Fixing GIN for empty/null/full-scan cases  (David E. Wheeler <david@kineticode.com>)
Ответы Re: Fixing GIN for empty/null/full-scan cases  ("David E. Wheeler" <david@kineticode.com>)
Список pgsql-hackers
"David E. Wheeler" <david@kineticode.com> writes:
> These numbers are a bit crazy-making, but the upshot is that Gist is
> slow out of the gate, but with data cached, it's pretty speedy. With
> indexscan and bitmapscan disabled, these queries all took 300-400
> ms. So GIN was never better performing than a table scan.

I could not replicate that here at all --- GIN indexscans were
consistently better than seqscans for me, eg

regression=# set enable_bitmapscan TO 1;
SET
Time: 0.673 ms
regression=# explain analyze SELECT count(*) FROM listings     WHERE features @@ '(1368799&1368800&1369043)'::query_int
     AND deleted_at IS NULL AND status = 1;                                                             QUERY PLAN
                                   
 

--------------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=1159.20..1159.21 rows=1 width=0) (actual time=23.964..23.964 rows=1 loops=1)  ->  Bitmap Heap Scan on listings
(cost=31.15..1158.18rows=406 width=0) (actual time=23.014..23.876 rows=772 loops=1)        Recheck Cond: ((features @@
'1368799& 1368800 & 1369043'::query_int) AND (deleted_at IS NULL) AND (status = 1))        ->  Bitmap Index Scan on
idx_gin_features (cost=0.00..31.05 rows=406 width=0) (actual time=22.913..22.913 rows=772 loops=1)              Index
Cond:(features @@ '1368799 & 1368800 & 1369043'::query_int)Total runtime: 24.040 ms
 
(6 rows)

Time: 24.968 ms
regression=# set enable_bitmapscan TO 0;
SET
Time: 0.458 ms
regression=# explain analyze SELECT count(*) FROM listings     WHERE features @@ '(1368799&1368800&1369043)'::query_int
     AND deleted_at IS NULL AND status = 1;                                                    QUERY PLAN
                                   
 

--------------------------------------------------------------------------------------------------------------------Aggregate
(cost=9158.24..9158.25 rows=1 width=0) (actual time=145.121..145.121 rows=1 loops=1)  ->  Seq Scan on listings
(cost=0.00..9157.22rows=406 width=0) (actual time=0.025..144.982 rows=772 loops=1)        Filter: ((deleted_at IS NULL)
AND(features @@ '1368799 & 1368800 & 1369043'::query_int) AND (status = 1))Total runtime: 145.177 ms
 
(4 rows)

Time: 146.228 ms

I'm noticing also that I get different rowcounts than you do, although
possibly that has something to do with the partial-index conditions,
which I'm not trying to duplicate here (all rows in my table pass those
two tests).

> * Why does it take 3-4x longer to create the GIN than the GiST index
> on tsvector?

Perhaps more maintenance_work_mem would help with that; although the
fine manual says specifically that GIN text search indexes take about
three times longer to build than equivalent GiST indexes, so maybe that
behavior is as designed.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "A.M."
Дата:
Сообщение: Re: test_fsync label adjustments
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql