Re: making tsearch2 dictionaries
From | Ben |
---|---|
Subject | Re: making tsearch2 dictionaries |
Date | |
Msg-id | Pine.LNX.4.44.0402171000290.32605-100000@localhost.localdomain |
In reply to | Re: making tsearch2 dictionaries (Oleg Bartunov <oleg@sai.msu.su>) |
List | pgsql-general |
On Tue, 17 Feb 2004, Oleg Bartunov wrote:

> it's unpredictable and I still don't get your idea of pipilining, but
> in general, I have nothing agains it.

Oh, well, the idea is that instead of the dictionary search stopping at the first dictionary in the chain that returns a lexeme, it would take each of the lexemes returned and pass them on to the next dictionary in the chain. So if I specified that numbers were to be handled by my num2english dictionary, followed by en_stem, and then tried to get a vector for "100", num2english would return "one" and "hundred". Then both "one" and "hundred" would each be looked up in en_stem, and the union of these lexemes would be the final result. Similarly, if a Latin word gets piped through an ispell dictionary before being sent to en_stem, each possible spelling would be stemmed.

> Aha, the same way as we handle complex words with hyphen - we return
> the whole word and its parts. So you need to introduce new type of token
> in parser and use synonym dictionary which in one's turn will returns
> the symbol token and human readable word.

Okay, that makes sense. I'll look more into how hyphenated words are being handled now.
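For what it's worth, the pipelining idea above can be sketched in a few lines of Python. This is only an illustration of the proposed behavior, not tsearch2's actual C API: `num2english`, `en_stem`, and `pipeline` here are toy stand-ins (the "stemmer" just strips a trailing "s"), and a real num2english would handle more than the single value shown.

```python
def num2english(token):
    # Toy number-to-words dictionary: recognizes only "100" for illustration.
    if token == "100":
        return ["one", "hundred"]
    return None  # token not recognized; let it pass through unchanged


def en_stem(token):
    # Toy stemmer: strips a trailing "s" as a stand-in for real stemming.
    return [token[:-1]] if token.endswith("s") else [token]


def pipeline(token, dicts):
    """Pass a token through a chain of dictionaries.

    Instead of stopping at the first dictionary that returns lexemes,
    every lexeme a dictionary emits is fed to the next dictionary in
    the chain; the final result is the union of what survives.
    """
    tokens = [token]
    for d in dicts:
        out = []
        for t in tokens:
            result = d(t)
            out.extend(result if result is not None else [t])
        tokens = out
    return sorted(set(tokens))


# "100" -> num2english -> ["one", "hundred"] -> en_stem on each -> union
print(pipeline("100", [num2english, en_stem]))   # ['hundred', 'one']
print(pipeline("cats", [num2english, en_stem]))  # ['cat']
```

The key difference from the current behavior is in the inner loop: a dictionary that recognizes a token fans it out into several lexemes, and all of them continue down the chain rather than terminating the lookup.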