Re: 9.5.3: substring: regex greedy operator not picking up chars as expected

From: David G. Johnston
Subject: Re: 9.5.3: substring: regex greedy operator not picking up chars as expected
Date:
Msg-id: CAKFQuwaAt6wYJQjKM9i-jm7hmfbi0ptiEt4SN8_vGQ43V+z-5Q@mail.gmail.com
In response to: 9.5.3: substring: regex greedy operator not picking up chars as expected  ("Foster, Russell" <Russell.Foster@crl.com>)
Responses: Re: 9.5.3: substring: regex greedy operator not picking up chars as expected
List: pgsql-bugs
Working as documented.

https://www.postgresql.org/docs/9.5/static/functions-matching.html#POSIX-MATCHING-RULES

Specifically, this implementation considers greediness at a level higher
than just the atom/expression. Per the linked rules, a branch takes the
greediness of the first quantified atom in it that has a greediness
attribute, so a leading non-greedy quantifier makes the entire branch
non-greedy and can in many situations cause greedy atoms to behave
non-greedily.

It might help to consider that there aren't really any explicit "greedy"
operators like other engines have (e.g., ??, ?, ?+) but rather non-greedy
(lazy) and default.  The default inherits the non-greedy trait from its
parent if applicable; otherwise it behaves greedily.

On Mon, Aug 15, 2016 at 7:53 AM, Foster, Russell <Russell.Foster@crl.com>
wrote:

> Hello,
>
>
>
> For the following query:
>
>
>
> select substring('>772' from '.*?[0-9]+')
>

The pattern itself is non-greedy because there is only a single branch
and it has a non-greedy quantifier within it.

.*? matches ">" and [0-9]+ needs only a single character to produce a
conforming non-greedy match, so the result is ">7".
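
For contrast, the result the reporter expected is what a Perl-style
backtracking engine produces, since there each quantifier is greedy or lazy
on its own. A minimal sketch using Python's re module, chosen here only as a
stand-in for such engines (it does not reproduce PostgreSQL's matching
rules):

```python
import re

subject = ">772"

# In Python's re, laziness applies per quantifier: .*? takes as little as it
# can, while the following [0-9]+ is still greedy and consumes every digit.
lazy_prefix = re.search(r".*?[0-9]+", subject).group(0)

# Anchoring both ends forces the match to span the whole string.
anchored = re.search(r"^.*?[0-9]+$", subject).group(0)

# With no lazy quantifier at all, the match simply starts at the first digit.
digits_only = re.search(r"[0-9]+", subject).group(0)

print(lazy_prefix, anchored, digits_only)  # prints: >772 >772 772
```

The first pattern yields ">772" in this engine, whereas PostgreSQL returns
">7" for the same pattern because the whole branch is non-greedy.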


>
> I would expect the output to be ‘>772’, but it is ‘>7’.  You can also see
> the expected result on https://regex101.com/, although I am aware not all
> regex processors work the same.
>
>
>
> The following queries:
>
>
>
> select substring('>772' from '^.*?[0-9]+$')
>

This is treated exactly the same as the above, but because of the ^ and $
anchors the shortest possible output string is the entire string.


>
> and:
>
>
>
> select substring('>772' from '[0-9]+')
>
>
>
> both return ‘>772’, which is expected.  Could the less greedy operator on
> the left (.*?) be affecting the more greedy right one (+)?
>
>
>

Typo here? I'm not fluent with substring(regex).

Anyway, the entire RE (single branch) is now greedy, so the greedy [0-9]+
atom matches as many digits as possible.

David J.
