bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search).

Поиск
Список
Период
Сортировка
Hi.

I'm currently trying to figure out why the tsearch performance seems to
vary a lot between different queryplans. I have created a sample dataset
that sort of resembles the data I have to work on.

The script that builds the dataset is at:
http://krogh.cc/~jesper/build-test.pl
and http://krogh.cc/~jesper/words.txt is needed for it to run.

Test system.. average desktop, 1 SATA drive and 1.5GB memory with pg 8.4.1.

The dataset consists of words randomized, but .. all records contains
"commonterm", around 80% contains commonterm80 and so on..

    my $rand = rand();
    push @doc,"commonterm" if $commonpos == $j;
    push @doc,"commonterm80" if $commonpos == $j && $rand < 0.8;

Results are run multiple times after each other so they should be
reproducible:

ftstest=# explain analyze select id from ftstest where body_fts @@
to_tsquery('commonterm80');
                                                   QUERY PLAN

----------------------------------------------------------------------------------------------------------------
 Seq Scan on ftstest  (cost=0.00..10750.00 rows=40188 width=4) (actual
time=0.102..1792.215 rows=40082 loops=1)
   Filter: (body_fts @@ to_tsquery('commonterm80'::text))
 Total runtime: 1809.437 ms
(3 rows)

ftstest=# set enable_seqscan=off;
SET
ftstest=# explain analyze select id from ftstest where body_fts @@
to_tsquery('commonterm80');
                                                              QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on ftstest  (cost=115389.14..125991.96 rows=40188
width=4) (actual time=17.445..197.356 rows=40082 loops=1)
   Recheck Cond: (body_fts @@ to_tsquery('commonterm80'::text))
   ->  Bitmap Index Scan on ftstest_gin_idx  (cost=0.00..115379.09
rows=40188 width=0) (actual time=13.370..13.370 rows=40082 loops=1)
         Index Cond: (body_fts @@ to_tsquery('commonterm80'::text))
 Total runtime: 204.201 ms
(5 rows)

Given that the seq-scan have to visit 50K row to create the result and
the bitmap heap scan only have to visit 40K (but search the index) we
would expect the seq-scan to be at most 25% more expensive than the
bitmap-heap scan.. e.g. less than 300ms.

Jesper
--
Jesper

В списке pgsql-performance по дате отправления:

Предыдущее
От: Waldomiro
Дата:
Сообщение: Re: query planning different in plpgsql?
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search).