tsearch2 and hyphenated terms

From: Reece Hart
Subject: tsearch2 and hyphenated terms
Msg-id: 1207891045.6903.14.camel@snafu
List: pgsql-general

I'd like to use tsearch2 to index protein and gene names. Unfortunately,
such names are written inconsistently, sometimes with hyphens and
sometimes without. For example, MCL-1 and MCL1 are semantically
equivalent, but with the default parser and to_tsvector I see this:

        unison@u8.3=> select to_tsvector('MCL1 MCL-1');
               to_tsvector
        -------------------------
         '-1':3 'mcl':2 'mcl1':1

For the purposes of indexing these names, I suspect I'd get the majority
of cases by removing a hyphen when it's followed by 1 or 2 chars from
[a-zA-Z0-9]. Does that require a custom parser?
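
One way to try the rule described above without a custom parser is to
normalize the text before it ever reaches to_tsvector. The following is
only a sketch of that pre-processing step in Python; the function name
normalize_names and the exact reading of the rule (the hyphen must follow
an alphanumeric character, and the 1 or 2 characters after it must end the
token) are assumptions, not something tsearch2 provides.

```python
import re

# Assumed reading of the rule: drop a hyphen that sits between an
# alphanumeric character and a trailing run of exactly 1 or 2
# alphanumeric characters (i.e. the run must end the token).
_SHORT_SUFFIX_HYPHEN = re.compile(
    r"(?<=[A-Za-z0-9])-(?=[A-Za-z0-9]{1,2}(?![A-Za-z0-9]))"
)


def normalize_names(text: str) -> str:
    """Hypothetical helper: remove hyphens before short alphanumeric suffixes."""
    return _SHORT_SUFFIX_HYPHEN.sub("", text)


if __name__ == "__main__":
    for sample in ("MCL1 MCL-1", "IL-2R", "well-known"):
        print(repr(sample), "->", repr(normalize_names(sample)))
```

With this reading, "MCL-1" becomes "MCL1" and "IL-2R" becomes "IL2R",
while "well-known" is left alone because more than two characters follow
the hyphen. The same pattern could presumably be applied inside the
database with regexp_replace before calling to_tsvector, though string
escaping rules there depend on the server settings.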

Thanks,
Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0

