Re: Comparing tsvector results

From David G. Johnston
Subject Re: Comparing tsvector results
Date
Msg-id CAKFQuwZktEw+5a1bYSX=7izT=h59Qr24P6MUzziAPz13=v+sXg@mail.gmail.com
In reply to Re: Comparing tsvector results  (Liza Sazonova <liza@getpitstop.io>)
List pgsql-general
Please do not top-post.

On Thu, Jun 16, 2016 at 2:38 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thu, Jun 16, 2016 at 2:21 PM, Liza Sazonova <liza@getpitstop.io> wrote:
Hello,

I'm trying to implement a keyword-based search on my database. 

I have some text input (Say, "The A or B and C with D") and a list of keywords ("B","X","Y"). I want to identify which keywords are present in the text input. I also want to use a thesaurus to process the text input, since the keywords have synonyms (for example, "The A : X"). 

As far as I understand, the only way for me to process text input with a thesaurus would be to use tsvector / tsquery; or at least this is the fastest way. 
Is it possible to retrieve the tsvector from the text input, then retrieve the tsvector from the keyword list, and identify which words are present in both?

 
From its description the tsvector_to_array function sounds like it would be helpful here.

tsvector_to_array(input) -> unnest
intersect
tsvector_to_array(keywords) -> unnest
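
Spelled out as a query, the idea above might look like the following. This is only a sketch: it needs PostgreSQL 9.6 or later, the 'english' configuration is an assumption standing in for whatever thesaurus-enabled configuration is actually in use, and the two literals are the sample input and keyword list from the question.

```sql
-- Requires PostgreSQL 9.6+ (tsvector_to_array does not exist in 9.5).
-- 'english' is a placeholder; substitute your own thesaurus-enabled
-- text search configuration.
SELECT unnest(tsvector_to_array(
           to_tsvector('english', 'The A or B and C with D')))
INTERSECT
SELECT unnest(tsvector_to_array(
           to_tsvector('english', 'B X Y')));
```

Each row returned is a lexeme that occurs in both the text input and the keyword list. Because both sides go through the same configuration, synonyms mapped by the thesaurus compare as equal.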

David J.


On Thu, Jun 16, 2016 at 2:51 PM, Liza Sazonova <liza@getpitstop.io> wrote:
Thank you!

It seems this function is not available in PostgreSQL 9.5, which I'm using.
I'm getting: "ERROR:  function tsvector_to_array(unknown) does not exist".

Do you know of any equivalent function in earlier releases?

I was thinking of casting the tsvector to text and then splitting the text on a delimiter, but this seems like a slow solution and I was wondering if there's anything better.

Not that I can see, nor would I expect one, given that someone took the time to write it for 9.6 (sorry for not pointing that out; blindly suggesting solutions that are only available in beta is not something I usually do).

Your best bet, I'd say, is to use "strip(tsvector)" to clean the inputs, then "string_to_array(...)" to get your array, and then unnest/intersect.
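
A rough 9.5-compatible version of that approach could look like this. It is a sketch under a few assumptions: 'english' again stands in for the real thesaurus-enabled configuration, and splitting on a single space only works while no lexeme itself contains a space or a quote character.

```sql
-- strip() drops positions and weights; casting the result to text gives a
-- space-separated list of single-quoted lexemes, which is split on ' ' and
-- then has the surrounding quotes trimmed off.
SELECT trim(both '''' from lexeme) AS lexeme
FROM unnest(string_to_array(
         strip(to_tsvector('english', 'The A or B and C with D'))::text,
         ' ')) AS lexeme
INTERSECT
SELECT trim(both '''' from lexeme)
FROM unnest(string_to_array(
         strip(to_tsvector('english', 'B X Y'))::text,
         ' ')) AS lexeme;
```

Trimming the quotes is not strictly needed for the comparison, since both sides are quoted the same way, but it makes the result readable.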


There may be other solutions available in third-party extensions, or ones that I'm just missing, but using only standard 9.5 features, that's what I see.

I'd solve your problem using a custom function and, if you indeed find it too slow, then explore alternatives.
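
One possible shape for such a custom function is sketched below. The name tsvector_lexemes is hypothetical (it is not a built-in), the 'english' configuration is a placeholder for your own, and the same caveat about lexemes containing spaces or quotes applies.

```sql
-- Hypothetical helper that approximates 9.6's tsvector_to_array on 9.5.
CREATE OR REPLACE FUNCTION tsvector_lexemes(v tsvector)
RETURNS text[]
LANGUAGE sql IMMUTABLE STRICT AS $$
    SELECT ARRAY(
        SELECT trim(both '''' from lexeme)
        FROM unnest(string_to_array(strip(v)::text, ' ')) AS lexeme
    );
$$;

-- Usage: list the original keywords that share at least one lexeme
-- with the text input (&& is the array overlap operator).
SELECT kw
FROM unnest(ARRAY['B', 'X', 'Y']) AS kw
WHERE tsvector_lexemes(to_tsvector('english', 'The A or B and C with D'))
      && tsvector_lexemes(to_tsvector('english', kw));
```

Keeping the string handling inside one function means it can later be swapped for tsvector_to_array after an upgrade to 9.6 without touching the calling queries.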

David J.

