Re: [GENERAL] Slow TSearch2 performance for table with 1 million documents.

Поиск
Список
Период
Сортировка
От Benjamin Arai
Тема Re: [GENERAL] Slow TSearch2 performance for table with 1 million documents.
Дата
Msg-id 6936504E-3560-4BE9-86BE-BEDAB3CD355A@araisoft.com
обсуждение исходный текст
Ответ на Re: [GENERAL] Slow TSearch2 performance for table with 1 million documents.  (Oleg Bartunov <oleg@sai.msu.su>)
Ответы Re: [GENERAL] Slow TSearch2 performance for table with 1 million documents.  (Benjamin Arai <me@benjaminarai.com>)
Список pgsql-performance
On Oct 5, 2007, at 8:32 AM, Oleg Bartunov wrote:

> On Fri, 5 Oct 2007, Tom Lane wrote:
>
>> Benjamin Arai <benjamin@araisoft.com> writes:
>>> # explain analyze select * FROM fulltext_article, to_tsquery
>>> ('simple','dog') AS q  WHERE idxfti @@ q ORDER BY rank(idxfti, q)
>>> DESC;
>>
>>>    QUERY PLAN
>>> --------------------------------------------------------------------
>>> ----
>>> --------------------------------------------------------------------
>>> ----
>>> ------------
>>> Sort  (cost=6576.74..6579.07 rows=933 width=774) (actual
>>> time=12969.237..12970.490 rows=5119 loops=1)
>>>     Sort Key: rank(fulltext_article.idxfti, q.q)
>>>     ->  Nested Loop  (cost=3069.79..6530.71 rows=933 width=774)
>>> (actual time=209.513..12955.498 rows=5119 loops=1)
>>>           ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32)
>>> (actual time=0.005..0.006 rows=1 loops=1)
>>>           ->  Bitmap Heap Scan on fulltext_article
>>> (cost=3069.79..6516.70 rows=933 width=742) (actual
>>> time=209.322..234.390 rows=5119 loops=1)
>>>                 Recheck Cond: (fulltext_article.idxfti @@ q.q)
>>>                 ->  Bitmap Index Scan on fulltext_article_idxfti_idx
>>> (cost=0.00..3069.56 rows=933 width=0) (actual time=208.373..208.373
>>> rows=5119 loops=1)
>>>                       Index Cond: (fulltext_article.idxfti @@ q.q)
>>> Total runtime: 12973.035 ms
>>> (9 rows)
>>
>> The time seems all spent at the join step, which is odd because it
>> really hasn't got much to do.  AFAICS all it has to do is compute the
>> rank() values that the sort step will use.  Is it possible that
>> rank() is really slow?
>
> can you try rank_cd() instead ?
>
Using Rank:

-# ('simple','dog') AS q  WHERE idxfti @@ q ORDER BY rank(idxfti, q)
DESC;

   QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------------
Sort  (cost=6576.74..6579.07 rows=933 width=774) (actual
time=98083.081..98084.351 rows=5119 loops=1)
    Sort Key: rank(fulltext_article.idxfti, q.q)
    ->  Nested Loop  (cost=3069.79..6530.71 rows=933 width=774)
(actual time=479.122..98067.594 rows=5119 loops=1)
          ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32)
(actual time=0.003..0.004 rows=1 loops=1)
          ->  Bitmap Heap Scan on fulltext_article
(cost=3069.79..6516.70 rows=933 width=742) (actual
time=341.739..37112.110 rows=5119 loops=1)
                Recheck Cond: (fulltext_article.idxfti @@ q.q)
                ->  Bitmap Index Scan on fulltext_article_idxfti_idx
(cost=0.00..3069.56 rows=933 width=0) (actual time=321.443..321.443
rows=5119 loops=1)
                      Index Cond: (fulltext_article.idxfti @@ q.q)
Total runtime: 98087.575 ms
(9 rows)

Using Rank_cd:

# explain analyze select * FROM fulltext_article, to_tsquery
('simple','cat') AS q  WHERE idxfti @@ q ORDER BY rank_cd(idxfti, q)
DESC;

   QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Sort  (cost=6576.74..6579.07 rows=933 width=774) (actual
time=199316.648..199324.631 rows=26054 loops=1)
    Sort Key: rank_cd(fulltext_article.idxfti, q.q)
    ->  Nested Loop  (cost=3069.79..6530.71 rows=933 width=774)
(actual time=871.428..199244.330 rows=26054 loops=1)
          ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32)
(actual time=0.006..0.007 rows=1 loops=1)
          ->  Bitmap Heap Scan on fulltext_article
(cost=3069.79..6516.70 rows=933 width=742) (actual
time=850.674..50146.477 rows=26054 loops=1)
                Recheck Cond: (fulltext_article.idxfti @@ q.q)
                ->  Bitmap Index Scan on fulltext_article_idxfti_idx
(cost=0.00..3069.56 rows=933 width=0) (actual time=838.120..838.120
rows=26054 loops=1)
                      Index Cond: (fulltext_article.idxfti @@ q.q)
Total runtime: 199338.297 ms
(9 rows)

>
>>
>>             regards, tom lane
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>>
>
>     Regards,
>         Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>


В списке pgsql-performance по дате отправления:

Предыдущее
От: "Joshua D. Drake"
Дата:
Сообщение: Re: Problems with + 1 million record table
Следующее
От: Shane Ambler
Дата:
Сообщение: Re: Problems with + 1 million record table