Re: using Tsearch2 for chemical text

From Oleg Bartunov
Subject Re: using Tsearch2 for chemical text
Date
Msg-id Pine.LNX.4.64.0707260950280.18739@sn.sai.msu.ru
In response to using Tsearch2 for chemical text (Rajarshi Guha <rguha@indiana.edu>)
List pgsql-general
On Wed, 25 Jul 2007, Rajarshi Guha wrote:

> Hi, I have a table with about 9M entries. The table has 2 fields: id and name
> which are of serial and text types respectively. I have an ordinary index on
> the text field which allows me to do searches in reasonable time. Most of my
> searches are of the form
>
> select * from mytable where name ~ 'some text query'
>
> I know that the Tsearch2 module will let me have very efficient text
> searches. But if I understand correctly, it's based on a language specific
> dictionary.

Wrong! It comes with dictionaries for some written human languages, but you
can write your very own dictionaries. A dictionary is just a C program.
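
To give a rough idea of what such a dictionary does, here is a minimal
stand-alone sketch in plain C. It is not the Tsearch2 dictionary interface:
the function name chem_normalize and its signature are hypothetical, chosen
only for illustration. It assumes that a chemical name should be kept whole
(digits, commas, hyphens and parentheses included) and only lowercased.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Tsearch2.
 * Writes a lowercased copy of token into out and returns 1.
 * Returns 0 if the token is empty or does not fit into out
 * (outlen bytes, including the terminating NUL).
 */
int chem_normalize(const char *token, char *out, size_t outlen)
{
    size_t len = strlen(token);

    if (len == 0 || len >= outlen)
        return 0;

    /* Keep every character; only fold letters to lower case. */
    for (size_t i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)token[i]);
    out[len] = '\0';
    return 1;
}
```

A real dictionary would be compiled against the PostgreSQL headers and
registered with Tsearch2; the point here is only that the normalization
rule is ordinary C code that you control.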

>
> My problem is that the name column contains names of chemicals. Now for many
> cases this may simply be a number (1674-56-2) and in other cases it may be an
> alphanumeric string (such as (-)O-acetylcarnitine or
> 1,2-cis-dihydroxybenzoate). In some cases it is a well-known word (say viagra
> or calcium chloride or pentathol).
>
> My question is: will Tsearch2 be able to handle this type of text? Or will it
> be hampered by the fact that the bulk of the rows do not correspond to
> ordinary English?

Oh, sure. See, for example, our dict_regex dictionary, which we use for
astronomical search.
http://lynx.sao.ru/~karpov/software/postgres_dict_regex.html

This is a work in progress, but it works.
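
The general idea of recognizing tokens by a regular expression can be
sketched in stand-alone C with the POSIX <regex.h> functions. This is not
how dict_regex itself is implemented or configured; the pattern and the
function name is_numeric_id are assumptions made for illustration, and the
pattern only describes tokens shaped like the 1674-56-2 example from the
question.

```c
#include <regex.h>
#include <stddef.h>

/* Assumed pattern: three groups of digits joined by hyphens. */
static const char *ID_PATTERN = "^[0-9]+-[0-9]+-[0-9]+$";

/*
 * Hypothetical helper, not part of dict_regex.
 * Returns 1 if token matches ID_PATTERN, 0 if it does not,
 * and -1 if the pattern could not be compiled.
 */
int is_numeric_id(const char *token)
{
    regex_t re;
    int matched;

    if (regcomp(&re, ID_PATTERN, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    matched = (regexec(&re, token, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

A token that matches could then be indexed as a single lexeme instead of
being split on the hyphens; names such as 1,2-cis-dihydroxybenzoate do not
match this pattern and would need rules of their own.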

>
> -------------------------------------------------------------------
> Rajarshi Guha  <rguha@indiana.edu>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> My Ethicator machine must have had a built-in moral
> compromise spectral phantasmatron! I'm a genius."
>               -Calvin

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
