ts_rank and ts_rank_cd with multiple search terms

From: Robert Nikander
Subject: ts_rank and ts_rank_cd with multiple search terms
Msg-id: CE80FBBF-AB65-4F76-8024-177F33C0B517@gmail.com
List: pgsql-general

Hi,

I'm reading about the ranking functions [1], and I have a couple of questions.

1. Does ts_rank take the proximity of terms into account? It seems to, but the docs suggest that only ts_rank_cd
does that.
2. Is there a way to search for multiple terms, like 'a | b | c ...', but score higher when several of them match, AND
take into account the distance between words? Basic use of ts_rank or ts_rank_cd doesn't seem to do this. Do you
recommend a custom ranking function here?

For example, I want to search for "black bear" and get results ordered so that documents with both words close
together score highest, and the document with only "bear" comes last.

    create table search_test ( title text, body text, vec tsvector );
    -- These 3 have "black" and "bear" at different distances from each other
    insert into search_test values ('close', 'The black bear sat on a brown rock and ate a barrel of red berries.');
    insert into search_test values ('medium', 'The brown bear sat on a black rock and ate a barrel of red berries.');
    insert into search_test values ('far', 'The brown bear sat on a red rock and ate a barrel of black berries.');
    -- This one has the word "bear", but not "black"
    insert into search_test values ('only bear', 'The brown bear sat on a red rock and ate a barrel of orange berries.');
    -- Fill in the tsvector column from the body text
    update search_test set vec = to_tsvector(body);

Now a query:

    select title, ts_rank(vec, q) as rank
    from search_test, to_tsquery('black & bear') q
    where vec @@ q order by rank desc;

To my surprise, that ranks close > medium > far, hence my question #1. Substituting ts_rank_cd also orders them that
way, as expected.
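
A side-by-side query makes the comparison easier to see. This is only a sketch built from the table and functions above; the column aliases rank and rank_cd are illustrative names, and no particular output values are claimed.

```sql
-- Show both ranking functions for the same AND query.
select title,
       ts_rank(vec, q)    as rank,     -- documented as frequency-based
       ts_rank_cd(vec, q) as rank_cd   -- documented as cover density (proximity-aware)
from search_test, to_tsquery('black & bear') q
where vec @@ q
order by rank_cd desc, rank desc;
```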

If I change the query to `black | bear`, to try to match "only bear" as well, then both ts_rank and ts_rank_cd return
equal rankings for "close", "medium" and "far".
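
One possible direction, offered as an untested sketch rather than a confirmed answer: filter with the OR query so that "only bear" still matches, but order by the rank of the AND query. The aliases q_and, q_or, both_rank and any_rank are illustrative names, and the assumption that ts_rank_cd yields 0 for a row that does not satisfy the AND query has not been verified here.

```sql
-- Hypothetical workaround: match on the OR query, order by the AND query's rank.
select title,
       ts_rank_cd(vec, q_and) as both_rank,  -- assumed to be 0 when "black" is missing
       ts_rank(vec, q_or)     as any_rank    -- tiebreaker for rows matching only one term
from search_test,
     to_tsquery('black & bear') q_and,
     to_tsquery('black | bear') q_or
where vec @@ q_or
order by both_rank desc, any_rank desc;
```

If that assumption holds, "close", "medium" and "far" would be ordered by proximity and "only bear" would come last. For longer term lists this does not generalize directly, since a single AND query only rewards documents that contain every term.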

Any recommendations?

thanks,
Rob



[1] http://www.postgresql.org/docs/9.4/static/textsearch-controls.html#TEXTSEARCH-RANKING
