Re: BUG #14245: Segfault on weird to_tsquery

From: David Kellum
Subject: Re: BUG #14245: Segfault on weird to_tsquery
Msg-id: 1468356893.2574.7@smtp.gmail.com
In response to: Re: BUG #14245: Segfault on weird to_tsquery  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #14245: Segfault on weird to_tsquery  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs

On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> david@gravitext.com writes:
>>  I am doing some (fuzz) testing of full text queries and managed to
>>  generate the following case which causes a SEGFAULT on PostgreSQL 9.6
>>  beta1 and beta2:
>>  select to_tsquery('!(a & !b) & c') as tsquery
>>  This weird query outputs the following on 9.5.2, instead of crashing:
>>  "!( !'b' ) & 'c'"
>
> Note that while crashing is certainly not good, the pre-9.6 behavior
> can hardly be called correct either.  What happened to 'a'?

'a' is a stopword, dropped by to_tsquery() as described here:

https://www.postgresql.org/docs/9.6/static/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES
> The difference is that while basic tsquery input takes the tokens at
> face value, to_tsquery normalizes each token into a lexeme using the
> specified or default configuration, and discards any tokens that are
> stop words according to the configuration.

...and I believe I want this behavior.  Otherwise queries with a stopword
in an '&' condition would not match anything.  In truth I have no reason to
want to support this kind of weird double negative, on any version, and
will also look at filtering it out in my code before calling
to_tsquery().
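
A minimal sketch of such a pre-filter, written in Python purely for illustration (the thread does not say what language the calling code uses). The function name has_nested_negation is hypothetical, and the scan assumes plain unquoted tokens: it ignores quoted lexemes, weights, and other tsquery syntax. It only reports whether a '!' appears inside the scope of another '!', so the caller can reject or rewrite the query before it reaches to_tsquery():

```python
def has_nested_negation(query: str) -> bool:
    """Return True if a '!' occurs inside the scope of another '!'.

    Assumption: the query uses only bare tokens, '!', '&', '|' and
    parentheses; quoted lexemes are not handled.
    """
    group_negated = []   # one flag per open parenthesis: was it preceded by '!'?
    pending_not = False  # a '!' was seen and still awaits its operand

    for ch in query:
        if ch.isspace():
            continue
        if ch == "!":
            # Either '!!x' or a '!' somewhere inside a negated group.
            if pending_not or any(group_negated):
                return True
            pending_not = True
        elif ch == "(":
            group_negated.append(pending_not)
            pending_not = False
        elif ch == ")":
            if group_negated:
                group_negated.pop()
            pending_not = False
        else:
            # Token character or binary operator: the pending '!' is consumed.
            pending_not = False
    return False
```

Note that this flags every nested negation, including the '!(apple & !b) & c' form that 9.6 handles correctly; it is deliberately stricter than the crash condition.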

It might be worth noting that these other slightly different cases are
fine on 9.6:

select to_tsquery('!(apple & !b) & c'); ---> !( 'appl' & !'b' ) & 'c'
select to_tsquery('!(apple & !a) & c'); ---> !'appl' & 'c'

Clearly a pretty obscure case, but a crash nonetheless.

> Also, it looks like this is specific to to_tsquery; if you just feed
> the same thing to tsqueryin, it seems fine with it:
>
> # select '!(a & !b) & c'::tsquery;
>         tsquery
> -----------------------
>  !( 'a' & !'b' ) & 'c'
> (1 row)

Against another test table, using the English search configuration, I
confirmed that 'a & ball'::tsquery doesn't match anything, but
to_tsquery('a & ball') does.

Thanks,
David
