Re: Understanding tsearch2 performance

От: Oleg Bartunov
Тема: Re: Understanding tsearch2 performance
Дата: ,
Msg-id: Pine.LNX.4.64.1007141834390.32129@sn.sai.msu.ru
(см: обсуждение, исходный текст)
Ответ на: Re: Understanding tsearch2 performance  (Ivan Voras)
Список: pgsql-performance

Скрыть дерево обсуждения

Understanding tsearch2 performance  (Ivan Voras, )
 Re: Understanding tsearch2 performance  (Oleg Bartunov, )
  Re: Understanding tsearch2 performance  (Ivan Voras, )
   Re: Understanding tsearch2 performance  (Oleg Bartunov, )
    Re: Understanding tsearch2 performance  (Ivan Voras, )
     Re: Understanding tsearch2 performance  (Stephen Frost, )
      Re: Understanding tsearch2 performance  (Ivan Voras, )
 Re: Understanding tsearch2 performance  ("Kevin Grittner", )
  Re: Understanding tsearch2 performance  (Ivan Voras, )
   Re: Understanding tsearch2 performance  (Oleg Bartunov, )
   Re: Understanding tsearch2 performance  ("Kevin Grittner", )
    Re: Understanding tsearch2 performance  (Ivan Voras, )

Ivan,

here is explain analyze output - 7122 out of 528155 docs
tseval=# select count(*) from document;
  count
--------
  528155
(1 row)

Time: 345,562 ms

tseval=# explain analyze select docno, title from document where vector @@ to_tsquery('english','mars');
  Bitmap Heap Scan on document  (cost=1655.97..10518.34 rows=2641 width=13) (actual time=3.127..11.556 rows=7122
loops=1)
    Recheck Cond: (vector @@ '''mar'''::tsquery)
    ->  Bitmap Index Scan on idx_vector  (cost=0.00..1655.31 rows=2641 width=0) (actual time=1.899..1.899 rows=7122
loops=1)
          Index Cond: (vector @@ '''mar'''::tsquery)
  Total runtime: 12.303 ms
(5 rows)

This is PostgreSQL 8.4.4 on Ubuntu machine.


Oleg

On Wed, 14 Jul 2010, Ivan Voras wrote:

> On 07/14/10 16:03, Kevin Grittner wrote:
>> Ivan Voras <    > wrote:
>>> On 07/14/10 15:49, Stephen Frost wrote:
>>
>>>> Regarding the statistics, it's entirely possible that the index
>>>> is *not* the fastest way to pull this data (it's nearly 10% of
>>>> the table..)
>>>
>>> I think that what I'm asking here is: is it reasonable for
>>> tsearch2 to extract 8,500 rows from an index of 90,000 rows in 118
>>> ms, given that the approximately same task can be done with an
>>> unindexed "LIKE" operator in nearly the same time?
>>
>> The answer is "yes."  When it's 10% of the table, a sequential scan
>> can be more efficient than an index, as Stephen indicated.
>
> Ok, to verify this I've tried increasing statistics on the field and
> running vacumm analyze full, which didn't help. Next, I've tried setting
> enable_indexscan to off, which also didn't do it:
>
> cms=> set enable_indexscan=off;
> SET
> cms=> explain analyze select id,title from forum where _fts_ @@
> 'fer'::tsquery order by id limit 10;
>                                                             QUERY PLAN
>
>
-------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=363.18..363.20 rows=10 width=35) (actual
> time=192.243..192.406 rows=10 loops=1)
>   ->  Sort  (cost=363.18..363.40 rows=91 width=35) (actual
> time=192.229..192.283 rows=10 loops=1)
>         Sort Key: id
>         Sort Method:  top-N heapsort  Memory: 25kB
>         ->  Bitmap Heap Scan on forum  (cost=29.21..361.21 rows=91
> width=35) (actual time=12.071..136.130 rows=8449 loops=1)
>               Recheck Cond: (_fts_ @@ '''fer'''::tsquery)
>               ->  Bitmap Index Scan on forum_fts  (cost=0.00..29.19
> rows=91 width=0) (actual time=11.169..11.169 rows=8449 loops=1)
>                     Index Cond: (_fts_ @@ '''fer'''::tsquery)
> Total runtime: 192.686 ms
> (9 rows)
>
> Any ideas on how to verify this?
>
>
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: , http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


В списке pgsql-performance по дате сообщения:

От: Ivan Voras
Дата:
Сообщение: Re: Understanding tsearch2 performance
От: Scott Carey
Дата:
Сообщение: Re: performance on new linux box