sequential scan unduly favored over text search gin index

Поиск
Список
Период
Сортировка
От Sushant Sinha
Тема sequential scan unduly favored over text search gin index
Дата
Msg-id 1308584339.2488.12.camel@dragflick
обсуждение исходный текст
Ответы Re: sequential scan unduly favored over text search gin index  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: sequential scan unduly favored over text search gin index  (Jesper Krogh <jesper@krogh.cc>)
Re: sequential scan unduly favored over text search gin index  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-performance
I have a tsvector column docvector and a gin index on it
docmeta1_docvector_idx

I have a simple query "select * from docmeta1 where docvector @@
plainto_tsquery('english', 'free');"

I find that the planner chooses a sequential scan of the table even when
the index performs orders of magnitude. I set random_page_cost = 1.0 for
the database to favor index use. However, I still see that the cost
estimate for sequential scan of the entire table (23000) is cheaper than
the cost of using the index (33000). The time taken for sequential
access is 5200 ms and for index usage is only 85 ms.

Details here:

postgres version 9.0.2
statistics on docvector is set to 10000 and as you can see the row
estimates are fine.

lawdb=# \d docmeta1
       Table "public.docmeta1"
   Column    |   Type    | Modifiers
-------------+-----------+-----------
 tid         | integer   | not null
 docweight   | integer   |
 doctype     | integer   |
 publishdate | date      |
 covertids   | integer[] |
 titlevector | tsvector  |
 docvector   | tsvector  |
Indexes:
    "docmeta1_pkey" PRIMARY KEY, btree (tid)
    "docmeta1_date_idx" btree (publishdate)
    "docmeta1_docvector_idx" gin (docvector)
    "docmeta1_title_idx" gin (titlevector)

lawdb=# SELECT relpages, reltuples FROM pg_class WHERE relname
='docmeta1';
relpages | reltuples
----------+-----------
    18951 |    329940


lawdb=# explain analyze select * from docmeta1 where docvector @@
plainto_tsquery('english', 'free');
                                                    QUERY
PLAN

--------------------------------------------------------------------------------
-----------------------------------
 Seq Scan on docmeta1  (cost=0.00..23075.25 rows=35966 width=427)
(actual time=0
.145..5189.556 rows=35966 loops=1)
   Filter: (docvector @@ '''free'''::tsquery)
 Total runtime: 5196.231 ms
(3 rows)

lawdb=# set enable_seqscan = off;
SET
lawdb=# explain analyze select * from docmeta1 where docvector @@
plainto_tsquery('english', 'free');
                                                                QUERY
PLAN

--------------------------------------------------------------------------------
-----------------------------------------------------------
 Bitmap Heap Scan on docmeta1  (cost=14096.25..33000.83 rows=35966
width=427) (a
ctual time=9.543..82.754 rows=35966 loops=1)
   Recheck Cond: (docvector @@ '''free'''::tsquery)
   ->  Bitmap Index Scan on docmeta1_docvector_idx  (cost=0.00..14087.26
rows=35
966 width=0) (actual time=8.059..8.059 rows=35967 loops=1)
         Index Cond: (docvector @@ '''free'''::tsquery)
 Total runtime: 85.304 ms
(5 rows)


-Sushant.


В списке pgsql-performance по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: Re: Inoptimal query plan for max() and multicolumn index
Следующее
От: "Kevin Grittner"
Дата:
Сообщение: Re: sequential scan unduly favored over text search gin index