Re: tsearch parser inefficiency if text includes urls or emails - new version

From Kevin Grittner
Subject Re: tsearch parser inefficiency if text includes urls or emails - new version
Date
Msg-id 4B20D4F1020000250002D2F1@gw.wicourts.gov
In reply to Re: tsearch parser inefficiency if text includes urls or emails - new version  (Andres Freund <andres@anarazel.de>)
Responses Re: tsearch parser inefficiency if text includes urls or emails - new version  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> wrote:
> I think you see no real benefit, because your strings are rather
> short - the documents I scanned when noticing the issue were
> rather long.
The document I used in the test which showed the regression was
672,585 characters, containing 10,000 URLs.
> A rather extreme/contrived example:
> postgres=# SELECT 1 FROM to_tsvector(array_to_string(ARRAY(
>   SELECT 'andres@anarazel.de http://www.postgresql.org/'::text
>   FROM generate_series(1, 20000) g(i)), ' -  '));
The most extreme of your examples uses a 979,996 character string,
which is less than 50% larger than my test.  I am, however, able to
see the performance difference for this particular example, so I now
have something to work with.  I'm seeing some odd behavior in when a
difference shows up and how large it is.  Once I can categorize it
better, I'll follow up.
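As a cross-check on the sizes quoted above, the string built by the sample query can be reconstructed outside the database. The following is a minimal Python sketch, not part of the original exchange. The variable names are illustrative, and it assumes the separator is exactly the four characters ' -  ' as quoted in the query.

```python
# Rebuild the string from the quoted query without touching the database.
# Assumption: the literal and separator match the quoted SQL character for character.
unit = "andres@anarazel.de http://www.postgresql.org/"  # one email plus one URL
separator = " -  "                                       # space, dash, two spaces
copies = 20000                                           # generate_series(1, 20000)

# array_to_string(ARRAY(...), sep) joins the elements with sep between them.
document = separator.join([unit] * copies)

# Size of the earlier test document, as stated in the message.
earlier_test_length = 672_585

ratio = len(document) / earlier_test_length
print(len(document))   # 979996
print(f"{ratio:.3f}")  # 1.457, i.e. less than 50% larger
```

The same length can be checked in psql by wrapping the array_to_string(...) expression from the query in length().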
Thanks for the sample which shows the difference.
-Kevin

