Re: [HACKERS] new function for tsquery creartion

From: Victor Drobny
Subject: Re: [HACKERS] new function for tsquery creartion
Msg-id: 3e1b0851d3a6a2da42f78d31cc241d0b@postgrespro.ru
In reply to: Re: [HACKERS] new function for tsquery creartion (Alexey Chernyshov <a.chernyshov@postgrespro.ru>)
List: pgsql-hackers

On 2017-10-13 16:37, Alexey Chernyshov wrote:
> Hi all,
> I am extending the phrase operator <n> in such a way that it will have
> <n,m> syntax, meaning from n to m words, so I will use that syntax
> (<n,m>) below. I found that a AROUND(N) b is exactly the same as
> a <-N,N> b and can be replaced while parsing. So, what do you think of
> this idea? In this patch I have noticed some non-obvious behavior.

Thank you for the interest and review!

> # select to_tsvector('Hello, cat world!') @@ queryto_tsquery('cat
> AROUND(1) cat') as match;
> match
> -------
>  t
> 
> cat AROUND(1) cat is the same as "cat <1> cat || cat <0> cat" and:
> 
> # select to_tsvector('Hello, cat world!') @@ to_tsquery('cat <0> cat');
>  ?column?
> -------
>  t
> 
> It seems to be logically correct behavior, but it is a possible
> pitfall; maybe it should be documented?

It is a tricky question. I think this interpretation is confusing, so it
would be better to define AROUND(N) as <-N,-1> and <1,N>, which leaves
out the zero distance.
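
To make the pitfall concrete, here is a minimal Python sketch (not
PostgreSQL code). It assumes a document can be modelled as a map from
lexeme to word positions, and that a <n,m> b means "some occurrence of b
lies between n and m positions after some occurrence of a". All function
names are hypothetical and exist only for illustration.

```python
def positions(words):
    """Map each lexeme to the list of its 1-based positions."""
    index = {}
    for pos, word in enumerate(words, start=1):
        index.setdefault(word, []).append(pos)
    return index


def range_match(index, a, b, lo, hi):
    """True if some b is between lo and hi positions after some a."""
    return any(
        lo <= pb - pa <= hi
        for pa in index.get(a, [])
        for pb in index.get(b, [])
    )


def around_inclusive(index, a, b, n):
    # Assumed reading: AROUND(N) as <-N,N>, so distance 0 is allowed.
    return range_match(index, a, b, -n, n)


def around_exclusive(index, a, b, n):
    # Assumed reading: AROUND(N) as <-N,-1> OR <1,N>, so distance 0 is excluded.
    return range_match(index, a, b, -n, -1) or range_match(index, a, b, 1, n)


doc = positions(["hello", "cat", "world"])
print(around_inclusive(doc, "cat", "cat", 1))  # True: the single cat matches itself
print(around_exclusive(doc, "cat", "cat", 1))  # False: a second cat is required
```

Under these assumptions, a single "cat" satisfies cat AROUND(1) cat only
when the range includes zero.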

> But a more important question is how the AROUND() operator should
> handle stop words. Now it works as:
> 
> # select queryto_tsquery('cat <2> (a AROUND(10) rat)');
>  queryto_tsquery
> ------------------
>  'cat' <12> 'rat'
> (1 row)
> 
> # select queryto_tsquery('cat <2> a AROUND(10) rat');
>     queryto_tsquery
> ------------------------
>  'cat' AROUND(12) 'rat'
> (1 row)
> 
> In my opinion it should be like:
> cat <2> (a AROUND(10) rat) == cat <2,2> (a <-10,10> rat) == cat <-8,12>
> rat

I think the correct version is:
cat <2> (a AROUND(10) rat) == cat <2,2> (a <-10,10> rat) == cat <-2,12> rat.

> cat <2> a AROUND(10) rat == cat <2,2> a <-10,10> rat = cat <-8, 12>
> rat

It is a problem indeed. I did not catch it during implementation. Thank
you for pointing it out.
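
The <-8,12> figure in the quoted text can be read as adding the two
distance ranges endpoint by endpoint once the stop word "a" is dropped.
Below is a minimal Python sketch of that arithmetic only. The helper
`compose` is hypothetical and is not taken from the patch.

```python
def compose(outer, inner):
    """Add two (lo, hi) distance ranges endpoint by endpoint."""
    return (outer[0] + inner[0], outer[1] + inner[1])


cat_to_a = (2, 2)      # cat <2> a, written as <2,2>
a_to_rat = (-10, 10)   # a AROUND(10) rat, written as <-10,10>

# Dropping the stop word "a" leaves a direct cat-to-rat range.
print(compose(cat_to_a, a_to_rat))  # (-8, 12)
```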

> Now the <n,m> operator can be replaced with a combination of the phrase
> operator <n>, AROUND(), and logical operators, but with the <n,m>
> operator it will be much less painful. Correct me, please, if I am wrong.

I think the <n,m> operator is more general than AROUND(N), so the latter
should be built on top of yours. However, I think that accepting negative
parameters is not the best idea, because it is confusing. On top of that,
it is not really necessary, and I think it won't be popular among users.
It seems to me that the AROUND operator can easily be implemented with
<n,m>, which also helps to avoid the problems you showed above.
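
One possible way to express AROUND(N) with only non-negative bounds,
sketched in Python rather than in the patch's C code: it assumes that
a <-N,-1> b is equivalent to b <1,N> a, so swapping the operands removes
the need for negative parameters. The position model and the names
`range_match` and `around` are assumptions made for illustration.

```python
def range_match(index, a, b, lo, hi):
    """True if some b is between lo and hi positions after some a."""
    return any(
        lo <= pb - pa <= hi
        for pa in index.get(a, [])
        for pb in index.get(b, [])
    )


def around(index, a, b, n):
    # Assumed rewrite: (a <1,N> b) OR (b <1,N> a); no negative bounds.
    return range_match(index, a, b, 1, n) or range_match(index, b, a, 1, n)


# Lexeme -> 1-based positions for the text "rat big cat".
doc = {"rat": [1], "big": [2], "cat": [3]}
print(around(doc, "cat", "rat", 2))  # True: rat is 2 words before cat
print(around(doc, "cat", "rat", 1))  # False: too far apart
```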

-- 
Victor Drobny
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

