Re: Phrase search vs. multi-lexeme tokens

From: Tom Lane
Subject: Re: Phrase search vs. multi-lexeme tokens
Msg-id: 10026.1609953512@sss.pgh.pa.us
In reply to: Phrase search vs. multi-lexeme tokens  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses: Re: Phrase search vs. multi-lexeme tokens  (Alexander Korotkov <aekorotkov@gmail.com>)
List: pgsql-hackers

Alexander Korotkov <aekorotkov@gmail.com> writes:
> # select to_tsvector('pg_class foo') @@ websearch_to_tsquery('"pg_class foo"');
>  ?column?
> ----------
>  f

Yeah, surely this is wrong.

> # select to_tsquery('pg_class <-> foo');
>           to_tsquery
> ------------------------------
>  ( 'pg' & 'class' ) <-> 'foo'

> I think if a user writes 'pg_class <-> foo', then it's expected to
> match 'pg_class foo' independently on which lexemes 'pg_class' is
> split into.

Indeed.  It seems to me that this:

regression=# select to_tsquery('pg_class');
   to_tsquery
----------------
 'pg' & 'class'
(1 row)

is wrong all by itself.  Now that we have phrase search, a much
saner translation would be "'pg' <-> 'class'".  If we fixed that
then it seems like the more complex case would just work.
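
For illustration only, here is a hand-written check of what that
translation would buy us.  It is a sketch, not output from the patch: it
assumes the parser puts 'pg' and 'class' at adjacent positions, names the
'simple' configuration explicitly so the result does not depend on
default_text_search_config, and uses column aliases that are just labels
for this example.

```sql
-- Assumption: 'simple' is used so no stemming or stop-word removal
-- interferes with the comparison.

-- Proposed translation, written out by hand: each part of pg_class
-- becomes its own step in the phrase.
SELECT to_tsvector('simple', 'pg_class foo')
       @@ to_tsquery('simple', 'pg <-> class <-> foo') AS phrase_form;

-- Current translation, also written out by hand, for comparison.
SELECT to_tsvector('simple', 'pg_class foo')
       @@ to_tsquery('simple', '(pg & class) <-> foo') AS and_form;
```

If the positions are adjacent as assumed, the first query should match,
while the second spells out the same query shape as the failing case
quoted above.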

I read your patch over quickly and it seems like a reasonable
approach (but sadly underdocumented).  Can we extend the idea
to fix the to_tsquery case?

            regards, tom lane


