B-Tree emulation for GIN - some numbers

Поиск
Список
Период
Сортировка
От Oleg Bartunov
Тема B-Tree emulation for GIN - some numbers
Дата
Msg-id Pine.LNX.4.64.0811191828050.7862@sn.sai.msu.ru
обсуждение исходный текст
Список pgsql-hackers
Hi there,

we got no comments on our proposal about changing GIN interface to 
support multicolumn feature of GIN, see 
http://archives.postgresql.org/message-id/490B3752.3010800@sigaev.ru
for details, so here are some numbers demonstrating benefits and some
issues. Any help and feedback are welcome.

We used geonames database with 5793013 rows, btree_gin patch (v.0.5) 
for CVS HEAD and contrib/btree_gin, which adds GIN support for all
data types, like contrib/btree_gist does.

postgres=# \d spots     Table "public.spots" Column |   Type   | Modifiers
--------+----------+----------- lat    | real     | long   | real     | name   | text     | name1  | text     | fts
|tsvector |
 

We used 3 indexes - btree index on lat column (lat_idx),
multicolumn GIN index on lat,fts columns (lat_fts_idx) and
GIN index on fts column (fts_idx). 
The test query returns 7653 rows.
explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10;

T1. lat_idx + fts_idx        : 152.612 ms
T2. lat_fts_idx              :  34.051 ms
T3. lat_idx + lat_fts_idx    : 152.430 ms
T4. Disabled bitmapscan      :  30.005 ms

T2 vs T1 show >4 times better performance for multicolumn GIN index (T2) than
BitmapAnd  scan over two separate indexes (T1).

Several issues:
1. in T3 we see than planner prefer to use Btree index on lat_idx instead of 
using lat_fts_idx as in T2, so we got performance of T1 instead of T2.

2. Disabling bitmapscan produces the best results in all cases and 
the same plan. Changing statistics for lat column (we tried 100,1000)
doesn't change plan.


==============================



T1. btree index + GIN index
Indexes:    "fts_idx" gin (fts)    "lat_idx" btree (lat)

Q1.
postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10;
                                                   QUERY PLAN
 

----------------------------------------------------------------------------------------------------------------------------------------
BitmapHeap Scan on spots  (cost=12838.77..23940.34 rows=3391 width=12) (actual time=143.451..149.288 rows=7653 loops=1)
 Recheck Cond: ((fts @@ to_tsquery('river'::text)) AND (lat < 20::double precision) AND (lat > 10::double precision))
-> BitmapAnd  (cost=12838.77..12838.77 rows=3391 width=0) (actual time=143.097..143.097 rows=0 loops=1)         ->
BitmapIndex Scan on fts_idx  (cost=0.00..893.21 rows=30896 width=0) (actual time=8.543..8.543 rows=32820 loops=1)
       Index Cond: (fts @@ to_tsquery('river'::text))         ->  Bitmap Index Scan on lat_idx  (cost=0.00..11943.62
rows=635886width=0) (actual time=132.686..132.686 rows=618548 loops=1)               Index Cond: ((lat < 20::double
precision)AND (lat > 10::double precision)) Total runtime: 152.040 ms
 
(8 rows)

Time: 152.612 ms

T2. One multicolumn GIN index
Indexes:    "lat_fts_idx" gin (lat, fts)

postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10;
                                              QUERY PLAN
 

-------------------------------------------------------------------------------------------------------------------------------
BitmapHeap Scan on spots  (cost=1011.83..54427.00 rows=3043 width=12) (actual time=10.374..31.349 rows=7653 loops=1)
RecheckCond: (fts @@ to_tsquery('river'::text))   Filter: ((lat < 20::double precision) AND (lat > 10::double
precision))  ->  Bitmap Index Scan on lat_fts_idx  (cost=0.00..1011.07 rows=28965 width=0) (actual time=8.439..8.439
rows=32820loops=1)         Index Cond: (fts @@ to_tsquery('river'::text)) Total runtime: 34.051 ms
 
(6 rows)

T3. btree index + multicolumn GIN index
    "lat_fts_idx" gin (lat, fts)    "lat_idx" btree (lat)

postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10;
                                                   QUERY PLAN
 

----------------------------------------------------------------------------------------------------------------------------------------
BitmapHeap Scan on spots  (cost=12442.97..22532.06 rows=3043 width=12) (actual time=142.979..149.162 rows=7653 loops=1)
 Recheck Cond: ((fts @@ to_tsquery('river'::text)) AND (lat < 20::double precision) AND (lat > 10::double precision))
-> BitmapAnd  (cost=12442.97..12442.97 rows=3043 width=0) (actual time=142.608..142.608 rows=0 loops=1)         ->
BitmapIndex Scan on lat_fts_idx  (cost=0.00..1011.07 rows=28965 width=0) (actual time=8.493..8.493 rows=32820 loops=1)
            Index Cond: (fts @@ to_tsquery('river'::text))         ->  Bitmap Index Scan on lat_idx
(cost=0.00..11430.13rows=608537 width=0) (actual time=132.225..132.225 rows=618548 loops=1)               Index Cond:
((lat< 20::double precision) AND (lat > 10::double precision)) Total runtime: 151.880 ms
 
(8 rows)

Time: 152.430 ms


T4. Disabled bitmapscan

postgres=# set enable_bitmapscan to off;
SET
Time: 0.101 ms

postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10;
                                              QUERY PLAN
 

-------------------------------------------------------------------------------------------------------------------------------
IndexScan using lat_fts_idx on spots  (cost=0.00..96821.54 rows=3043 width=12) (actual time=0.040..26.847 rows=7653
loops=1)  Index Cond: (fts @@ to_tsquery('river'::text))   Filter: ((lat < 20::double precision) AND (lat > 10::double
precision))Total runtime: 29.543 ms
 
(4 rows)

Time: 30.005 ms


    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Emmanuel Cecchet
Дата:
Сообщение: Re: Transactions and temp tables
Следующее
От: "Alex Hunsaker"
Дата:
Сообщение: Re: Client certificate authentication