Hi Hackers.
I have a feeling that GIN is "cheating" on the visibillity checks:
test=# set enable_seqscan = off;
SET
Time: 0.129 ms
test=# select count(id) from fts_test where fts @@ to_tsquery('core'); count
-------- 158827
(1 row)
Time: 95.530 ms
test=# explain select count(id) from fts_test where fts @@
to_tsquery('core'); QUERY PLAN
---------------------------------------------------------------------------------------------- Aggregate
(cost=211571.52..211571.53rows=1 width=4) -> Bitmap Heap Scan on fts_test (cost=134925.95..211174.01
rows=159004 width=4) Recheck Cond: (fts @@ to_tsquery('core'::text)) -> Bitmap Index Scan on fts_idx
(cost=0.00..134886.20
rows=159004 width=0) Index Cond: (fts @@ to_tsquery('core'::text))
(5 rows)
Time: 0.609 ms
test=# select count(id) from fts_test; count
-------- 168556
(1 row)
Time: 164.655 ms
test=# explain select count(id) from fts_test; QUERY PLAN
------------------------------------------------------------------------------------------------ Aggregate
(cost=10000075969.95..10000075969.96rows=1 width=4) -> Seq Scan on fts_test (cost=10000000000.00..10000075548.56
rows=168556 width=4)
(2 rows)
Time: 0.338 ms
This is run multiple times for both queries and the seqscan of the table
is consistently about 1.8 times more expensive than the fts-scan.
This is all on a fully memory cached dataset.
The first query should have the cost of the GIN-search +
visibillity-test of 158K tuples,
the latter should have the cost of visibillity-testing 168K tuples. If
we set the cost
of actually searching GIN to 0 then the gin-search - visibillity costs:
95/158000 0.000373ms/tuple
where the seq-scan case costs close to 0.001ms/tuple (close to 3 times
as much).
So I have a strong feeling that GIN is cheating on the visibillity tests
otherwise I have problems imagining how it ever can become faster to execute
than the seq_scan of the table.
Or is a Bitmap Heap Scan simply 3 times faster than a Seq-scan for
visibillity-testing?
What have I missed in the logic?
Thanks.
--
Jesper