Re: tsearch2 headline and postgresql.conf

From: Oleg Bartunov
Subject: Re: tsearch2 headline and postgresql.conf
Msg-id: Pine.GSO.4.63.0601221110190.14417@ra.sai.msu.su
In reply to: tsearch2 headline and postgresql.conf  (pgsql-performance@nullmx.com)
Responses: Re: tsearch2 headline and postgresql.conf  (pgsql-performance@nullmx.com)
List: pgsql-performance
You didn't provide us with any query or its EXPLAIN ANALYZE output.
Just to make sure you're fine.

     Oleg
On Sun, 22 Jan 2006, pgsql-performance@nullmx.com wrote:

> Hi folks,
>
> I'm not sure if this is the right place for this but thought I'd ask.  I'm
> relatively new to postgres having only used it on 3 projects and am just
> delving into the setup and admin for the second time.
>
> I decided to try tsearch2 for this project's search requirements but am
> having trouble attaining adequate performance.  I think I've nailed it down
> to trouble with the headline() function in tsearch2.
> In short, there is a crawler that grabs HTML docs and places them in a
> database.  The search is done using tsearch2 pretty much installed according
> to instructions.  I have read a couple online guides suggested by this list
> for tuning the postgresql.conf file.  I only made modest adjustments because
> I'm not working with top-end hardware and am still uncertain of the actual
> impact of the different parameters.
>
> I've been learning 'explain' and over the course of reading I have done
> enough query tweaking to discover the source of my headache seems to be
> headline().
>
> On a query of 429 documents, of which the avg size of the stripped down
> document as stored is 21KB, and the max is 518KB (an anomaly), tsearch2
> performs exceptionally well returning most queries in about 100ms.
>
> On the other hand, following the tsearch2 guide which suggests returning that
> first portion as a subquery and then generating the headline() from those
> results, I see the query increase to 4 seconds!
>
> This seems to be directly related to document size.  If I filter out that
> 518KB doc along with some 100KB docs by returning "substring( stripped_text
> FROM 0 FOR 50000) AS stripped_text" I decrease the time to 1.4 seconds, but
> increase the risk of not getting a headline.
>
> Seeing as how this problem is directly tied to document size, I'm wondering
> if there are any specific settings in postgresql.conf that may help, or is
> this just a fact of life for the headline() function?  Or, does anyone know
> what the problem is and how to overcome it?

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
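
For reference, the pattern described in the quoted message (match and rank in a subquery first, then call headline() only on the rows actually returned) might look roughly like the sketch below, wrapped in EXPLAIN ANALYZE so that the output can be posted to the list. This is only an illustration under assumptions: the table name "documents", the columns "id" and "idxfti", the search terms, and the LIMIT are hypothetical and do not come from the original post; only "stripped_text" is named there.

```sql
-- Sketch only: "documents", "id" and "idxfti" are hypothetical names;
-- substitute your own table, key and tsvector column.
-- to_tsquery(), rank() and headline() are the tsearch2 contrib functions
-- (PostgreSQL 8.3 and later ship built-in ts_rank() and ts_headline()).
EXPLAIN ANALYZE
SELECT hits.id,
       headline(hits.stripped_text, hits.q) AS snippet
FROM (
    -- Inner query: match and rank against the tsvector only,
    -- keeping just the rows that will be displayed.
    SELECT d.id, d.stripped_text, q, rank(d.idxfti, q) AS r
    FROM documents AS d, to_tsquery('search & terms') AS q
    WHERE d.idxfti @@ q
    ORDER BY r DESC
    LIMIT 10   -- assumed page size
) AS hits
ORDER BY hits.r DESC;
```

Note that EXPLAIN ANALYZE executes the query, so the reported total time includes the cost of headline() on every row the outer query returns. Running the inner query on its own under EXPLAIN ANALYZE gives a baseline to compare against.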