Re: dot to be considered as a word delimiter?

From Kenneth Marshall
Subject Re: dot to be considered as a word delimiter?
Date
Msg-id 20090602124725.GD18879@it.is.rice.edu
In response to Re: dot to be considered as a word delimiter?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: dot to be considered as a word delimiter?  (Sushant Sinha <sushant354@gmail.com>)
List pgsql-hackers
On Mon, Jun 01, 2009 at 08:22:23PM -0500, Kevin Grittner wrote:
> Sushant Sinha <sushant354@gmail.com> wrote: 
>  
> > I think that dot should be considered as a word delimiter because
> > when a dot is not followed by a space, most of the time it is a
> > typing error. Besides, there are not many valid English words that
> > have a dot in between.
>  
> It's not treating it as an English word, but as a host name.
>  
> select ts_debug('english', 'Mr.J.Sai Deepak');
>                                  ts_debug
> ---------------------------------------------------------------------------
>  (host,Host,Mr.J.Sai,{simple},simple,{mr.j.sai})
>  (blank,"Space symbols"," ",{},,)
>  (asciiword,"Word, all ASCII",Deepak,{english_stem},english_stem,{deepak})
> (3 rows)
>  
> You could run it through a dictionary which would deal with host
> tokens differently.  Just be aware of what you'll be doing to
> www.google.com if you run into it.
>  
> I hope this helps.
>  
> -Kevin
> 

In our use of full text indexing, it is much more important to
be able to find host names and URLs than to find mistyped names.
My two cents.

Cheers,
Ken
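
The trade-off discussed in this thread can be tried out directly. The following is a minimal sketch, not something proposed by anyone in the thread: it assumes the stock 'english' configuration and, instead of the dictionary approach Kevin mentions, simply replaces dots with spaces before the text reaches the parser. The regexp_replace preprocessing step is an illustrative assumption.

```sql
-- Show how the default parser classifies both kinds of input:
-- a mistyped name and a real host name.
SELECT alias, token, lexemes
FROM ts_debug('english', 'Mr.J.Sai Deepak www.google.com');

-- Assumed workaround: turn every dot into a space before parsing,
-- so the parts of the mistyped name become separate tokens.
SELECT to_tsvector('english',
                   regexp_replace('Mr.J.Sai Deepak', '[.]', ' ', 'g'));

-- The same preprocessing applied to a host name: it is split into
-- pieces as well, so it is no longer indexed as a single host token.
SELECT to_tsvector('english',
                   regexp_replace('www.google.com', '[.]', ' ', 'g'));
```

Comparing the last two queries shows why the choice depends on the data: the preprocessing that makes the mistyped name searchable also splits host names into their parts.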

