B-Tree emulation for GIN - some numbers
От | Oleg Bartunov |
---|---|
Тема | B-Tree emulation for GIN - some numbers |
Дата | |
Msg-id | Pine.LNX.4.64.0811191828050.7862@sn.sai.msu.ru обсуждение исходный текст |
Список | pgsql-hackers |
Hi there, we got no comments on our proposal about changing GIN interface to support multicolumn feature of GIN, see http://archives.postgresql.org/message-id/490B3752.3010800@sigaev.ru for details, so here are some numbers demonstrating benefits and some issues. Any help and feedback are welcome. We used geonames database with 5793013 rows, btree_gin patch (v.0.5) for CVS HEAD and contrib/btree_gin, which adds GIN support for all data types, like contrib/btree_gist does. postgres=# \d spots Table "public.spots" Column | Type | Modifiers --------+----------+----------- lat | real | long | real | name | text | name1 | text | fts |tsvector | We used 3 indexes - btree index on lat column (lat_idx), multicolumn GIN index on lat,fts columns (lat_fts_idx) and GIN index on fts column (fts_idx). The test query returns 7653 rows. explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10; T1. lat_idx + fts_idx : 152.612 ms T2. lat_fts_idx : 34.051 ms T3. lat_idx + lat_fts_idx : 152.430 ms T4. Disabled bitmapscan : 30.005 ms T2 vs T1 show >4 times better performance for multicolumn GIN index (T2) than BitmapAnd scan over two separate indexes (T1). Several issues: 1. in T3 we see than planner prefer to use Btree index on lat_idx instead of using lat_fts_idx as in T2, so we got performance of T1 instead of T2. 2. Disabling bitmapscan produces the best results in all cases and the same plan. Changing statistics for lat column (we tried 100,1000) doesn't change plan. ============================== T1. btree index + GIN index Indexes: "fts_idx" gin (fts) "lat_idx" btree (lat) Q1. postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- BitmapHeap Scan on spots (cost=12838.77..23940.34 rows=3391 width=12) (actual time=143.451..149.288 rows=7653 loops=1) Recheck Cond: ((fts @@ to_tsquery('river'::text)) AND (lat < 20::double precision) AND (lat > 10::double precision)) -> BitmapAnd (cost=12838.77..12838.77 rows=3391 width=0) (actual time=143.097..143.097 rows=0 loops=1) -> BitmapIndex Scan on fts_idx (cost=0.00..893.21 rows=30896 width=0) (actual time=8.543..8.543 rows=32820 loops=1) Index Cond: (fts @@ to_tsquery('river'::text)) -> Bitmap Index Scan on lat_idx (cost=0.00..11943.62 rows=635886width=0) (actual time=132.686..132.686 rows=618548 loops=1) Index Cond: ((lat < 20::double precision)AND (lat > 10::double precision)) Total runtime: 152.040 ms (8 rows) Time: 152.612 ms T2. One multicolumn GIN index Indexes: "lat_fts_idx" gin (lat, fts) postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- BitmapHeap Scan on spots (cost=1011.83..54427.00 rows=3043 width=12) (actual time=10.374..31.349 rows=7653 loops=1) RecheckCond: (fts @@ to_tsquery('river'::text)) Filter: ((lat < 20::double precision) AND (lat > 10::double precision)) -> Bitmap Index Scan on lat_fts_idx (cost=0.00..1011.07 rows=28965 width=0) (actual time=8.439..8.439 rows=32820loops=1) Index Cond: (fts @@ to_tsquery('river'::text)) Total runtime: 34.051 ms (6 rows) T3. btree index + multicolumn GIN index "lat_fts_idx" gin (lat, fts) "lat_idx" btree (lat) postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- BitmapHeap Scan on spots (cost=12442.97..22532.06 rows=3043 width=12) (actual time=142.979..149.162 rows=7653 loops=1) Recheck Cond: ((fts @@ to_tsquery('river'::text)) AND (lat < 20::double precision) AND (lat > 10::double precision)) -> BitmapAnd (cost=12442.97..12442.97 rows=3043 width=0) (actual time=142.608..142.608 rows=0 loops=1) -> BitmapIndex Scan on lat_fts_idx (cost=0.00..1011.07 rows=28965 width=0) (actual time=8.493..8.493 rows=32820 loops=1) Index Cond: (fts @@ to_tsquery('river'::text)) -> Bitmap Index Scan on lat_idx (cost=0.00..11430.13rows=608537 width=0) (actual time=132.225..132.225 rows=618548 loops=1) Index Cond: ((lat< 20::double precision) AND (lat > 10::double precision)) Total runtime: 151.880 ms (8 rows) Time: 152.430 ms T4. Disabled bitmapscan postgres=# set enable_bitmapscan to off; SET Time: 0.101 ms postgres=# explain analyze select name from spots where fts @@ to_tsquery('river') and lat < 20 and lat > 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- IndexScan using lat_fts_idx on spots (cost=0.00..96821.54 rows=3043 width=12) (actual time=0.040..26.847 rows=7653 loops=1) Index Cond: (fts @@ to_tsquery('river'::text)) Filter: ((lat < 20::double precision) AND (lat > 10::double precision))Total runtime: 29.543 ms (4 rows) Time: 30.005 ms Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
В списке pgsql-hackers по дате отправления: