Re: procost for to_tsvector

From: Andres Freund
Subject: Re: procost for to_tsvector
Msg-id: 20150311162604.GL12445@alap3.anarazel.de
In reply to: Re: procost for to_tsvector (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On 2015-03-11 12:07:20 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2015-03-11 14:40:16 +0000, Andrew Gierth wrote:
> >> , but even without doing that, there's a strong
> >> argument that it should be increased to at least the order of 100.
> 
> Nyet ... at least not without you actually making that argument, with
> numbers, rather than just handwaving.  We use 100 for plpgsql and suchlike
> functions.  I'd be OK with making it 10 just on general principles, but
> claiming that it's as expensive as a plpgsql function requires
> evidence.

I'll note that you proposed a cost higher than 10 some years back ;):
http://www.postgresql.org/message-id/8971.1255891843@sss.pgh.pa.us

What you said back then makes sense to me:

On 2009-10-18 14:50:43 -0400, Tom Lane wrote:
> In another case I was looking at just now, it seems that to_tsquery()
> and to_tsvector() are noticeably slower than most other built-in
> functions, which is not surprising given the amount of mechanism that
> gets invoked inside them.  It would be useful to tell the planner
> about that to discourage it from picking seqscan plans that involve
> repeated execution of these functions.

A trivial comparison with a simple plpgsql function:
CREATE FUNCTION a_simple_plpgsql_function(a text) RETURNS text
LANGUAGE plpgsql AS $$BEGIN RETURN repeat(a, 3); END;$$;

SELECT a_simple_plpgsql_function('This is a long sentence in english. Or maybe not so long after all. But it includes a Metal Ümlaut. And parens: ()! Also a number: ' || g.i)
FROM generate_series(1, 10000) g(i);
Time: 32.898 ms

and
SELECT to_tsvector('english', 'This is a long sentence in english. Or maybe not so long after all. But it includes a Metal Ümlaut. And parens: ()! Also a number: ' || g.i)
FROM generate_series(1, 10000) g(i);
Time: 450.996 ms

Given that this is a short sentence and a simple text search
configuration, a factor of 10 between them doesn't sound wrong. This is
obviously completely unscientific, but ...
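
For context, a minimal sketch of how the cost under discussion could be inspected and overridden on a scratch database. The value 100 is only the figure floated upthread, used here as an assumption rather than a settled choice, and changing a built-in function's cost needs sufficient privileges:

```sql
-- Show the planner cost currently recorded for the text search functions.
SELECT p.oid::regprocedure AS func, p.procost
FROM pg_proc p
WHERE p.proname IN ('to_tsvector', 'to_tsquery')
ORDER BY p.oid;

-- Assumption: try the value suggested upthread for the two-argument variant.
ALTER FUNCTION to_tsvector(regconfig, text) COST 100;
```

Re-running the timing queries above will not change, since COST only affects the planner's estimates; the effect would show up in plan choice (for example seqscan versus index scan) under EXPLAIN.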

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


