Thread: BUG #17125: Operator precedence bug in websearch_to_tsquery function

BUG #17125: Operator precedence bug in websearch_to_tsquery function

From:
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17125
Logged by:          Tim Connolly
Email address:      tim.connolly@oovvuu.com
PostgreSQL version: 11.12
Operating system:   Alpine Linux
Description:

Expectation: A web-search query of 'foo bar or baz' should match documents
that contain 'foo' and 'bar', and documents that contain 'foo' and 'baz'.


postgres=# select to_tsvector('english', 'baz') @@
websearch_to_tsquery('english', 'foo bar or baz ');
 ?column?
----------
 t
(1 row)

Expected: f

postgres=# select websearch_to_tsquery('english', 'foo bar or baz');
 websearch_to_tsquery
-----------------------
 'foo' & 'bar' | 'baz'
(1 row)

Expected: 'foo' & ('bar' | 'baz')
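The difference between the two groupings can be checked with a small model. The following is only an illustrative Python sketch, not PostgreSQL's implementation: it assumes a tsquery made of quoted lexemes, `&`, `|` and parentheses, with `&` binding tighter than `|` as the text search documentation describes, and the helper names `tokenize` and `matches` are hypothetical.

```python
import re

# One token is either a quoted lexeme ('foo') or one of the symbols & | ( )
TOKEN = re.compile(r"\s*(?:'([^']*)'|([&|()]))")


def tokenize(query):
    """Split a simplified tsquery string into (kind, value) tokens."""
    query = query.rstrip()
    tokens, pos = [], 0
    while pos < len(query):
        m = TOKEN.match(query, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        if m.group(1) is not None:
            tokens.append(("term", m.group(1)))
        else:
            tokens.append(("op", m.group(2)))
        pos = m.end()
    return tokens


def matches(query, lexemes):
    """Evaluate the query against a set of lexemes; '&' binds tighter than '|'."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():  # lowest precedence
        value = parse_and()
        while peek() == ("op", "|"):
            advance()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():  # higher precedence than '|'
        value = parse_primary()
        while peek() == ("op", "&"):
            advance()
            rhs = parse_primary()
            value = value and rhs
        return value

    def parse_primary():
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of query")
        advance()
        if tok == ("op", "("):
            value = parse_or()
            if peek() != ("op", ")"):
                raise ValueError("missing ')'")
            advance()
            return value
        if tok[0] == "term":
            return tok[1] in lexemes
        raise ValueError(f"unexpected token {tok[1]!r}")

    result = parse_or()
    if peek() is not None:
        raise ValueError("trailing input")
    return result


doc = {"baz"}  # lexemes of the document 'baz'
print(matches("'foo' & 'bar' | 'baz'", doc))    # actual output of the function
print(matches("'foo' & ('bar' | 'baz')", doc))  # grouping the reporter expected
```

Under these assumptions the first query is read as `('foo' & 'bar') | 'baz'`, so a document containing only 'baz' satisfies it, while the parenthesized form requires 'foo' as well.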


Re: BUG #17125: Operator precedence bug in websearch_to_tsquery function

From:
"David G. Johnston"
Date:
On Tuesday, July 27, 2021, PG Bug reporting form <noreply@postgresql.org> wrote:

> postgres=# select websearch_to_tsquery('english', 'foo bar or baz');
>  websearch_to_tsquery
> -----------------------
>  'foo' & 'bar' | 'baz'
> (1 row)
>
> Expected: 'foo' & ('bar' | 'baz')


The documentation describes the operator precedence and it isn’t what you expect.



David J.

Re: BUG #17125: Operator precedence bug in websearch_to_tsquery function

From:
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Tuesday, July 27, 2021, PG Bug reporting form <noreply@postgresql.org>
> wrote:
>> postgres=# select websearch_to_tsquery('english', 'foo bar or baz');
>> websearch_to_tsquery
>> -----------------------
>> 'foo' & 'bar' | 'baz'
>> (1 row)
>> 
>> Expected: 'foo' & ('bar' | 'baz')

> The documentation describes the operator precedence and it isn’t what you
> expect.

It does appear from what I could find on the web that Google does it
the other way.  Whether that's enough reason to change a behavior
that's stood since v11 is hard to say.  We're not trying to be
entirely bug-compatible with Google here ... and even if we were,
who's to say whether they might not change this tomorrow?

Perhaps a more useful way to think about it is whether it's possible
to get the behavior opposite to the default.  AFAICS there isn't any
way to get 'a & (b | c)' out of websearch_to_tsquery.  However, if
we changed the default precedence, then there'd be no way to get the
old behavior, which is not nice at all.  I first thought that maybe
you could write '"a b" or c', but that produces 'a <-> b | c' which
isn't the same.
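To illustrate why 'a <-> b | c' is not a substitute for 'a & b | c', here is a rough Python sketch. It assumes a document is just an ordered list of lexemes, models `<->` as "immediately followed by", and ignores details such as stop-word positions; the helper names `has_both` and `has_phrase` are hypothetical.

```python
def has_both(doc, first, second):
    """Model of first & second: both lexemes occur somewhere in the document."""
    return first in doc and second in doc


def has_phrase(doc, first, second):
    """Model of first <-> second: second occurs immediately after first."""
    return any(x == first and y == second for x, y in zip(doc, doc[1:]))


doc = ["a", "x", "b"]  # 'a' and 'b' are present but not adjacent

print(has_both(doc, "a", "b"))    # the AND form accepts this document
print(has_phrase(doc, "a", "b"))  # the phrase form rejects it
```

In this model the phrase form is strictly narrower than the AND form, which is why quoting the first two words changes which documents match.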

Anyway, given that most people probably have no idea about this fine
point, I doubt that the benefits of changing it would outweigh the
costs.

            regards, tom lane