Re: Our regex vs. POSIX on "longest match"

From Brendan Jurd
Subject Re: Our regex vs. POSIX on "longest match"
Msg-id CADxJZo1fbE9FA+pW89dNqqiPpLstSxYKug9TLQcS_q+J7wF+_A@mail.gmail.com
In response to Re: Our regex vs. POSIX on "longest match"  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 5 March 2012 17:23, Robert Haas <robertmhaas@gmail.com> wrote:
> This is different from what Perl does, but I think Perl's behavior
> here is batty: given a+|a+b+ and the string aaabbb, it picks the first
> branch and matches only aaa.

Yeah, this is sometimes referred to as "ordered alternation": the
branches of the alternation are prioritised in the same order in
which they are written.  It is fairly commonplace among regex
implementations.
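
To make the behaviour concrete, here is a minimal sketch using Python's
re module, which is a backtracking engine that orders alternatives the
same way Perl does.  Python is only a stand-in here; it is not the
engine under discussion, and the variable names are made up for the
example.

```python
import re

# Python's re tries the branches of an alternation left to right and
# keeps the first one that lets the overall match succeed.
pattern = re.compile(r"a+|a+b+")

# The first branch (a+) can match at the start of the string, so the
# second branch is never tried and the trailing "bbb" is left unmatched.
ordered_match = pattern.match("aaabbb").group()
print(ordered_match)
```

The result is the shorter match, aaa, which is the behaviour Robert
describes for Perl.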

> apparently, it selects the syntactically first
> branch that can match, regardless of the length of the match, which
> strikes me as nearly pure evil.

As long as it's documented that alternation prioritises in this way, I
don't feel upset about it.  At least it still provides you with a
sensible way to get whatever you want from your RE; if you want a
shorter alternative to be preferred, put it first.  Ordered
alternation also gives you a way to specify which of several
same-length alternatives you would prefer to be matched, which can
come in handy.  It also means you can specify less-complex
alternatives before more-complex ones, which can have performance
advantages.
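
A rough illustration of the first two points, again using Python's re
module as a stand-in for any engine with ordered alternation.  The
group names keyword and identifier are hypothetical, chosen only to
show which branch was taken.

```python
import re

# Listing the longer branch first makes it the preferred one.
longer_first = re.match(r"a+b+|a+", "aaabbb").group()

# Both branches below match the same two characters of "if", so only
# the order decides which branch is reported as the one that matched.
keyword_first = re.match(
    r"(?P<keyword>if)|(?P<identifier>[a-z]+)", "if"
).lastgroup
identifier_first = re.match(
    r"(?P<identifier>[a-z]+)|(?P<keyword>if)", "if"
).lastgroup

print(longer_first, keyword_first, identifier_first)
```

With the longer branch first the whole of aaabbb is matched, and for
the same-length case the reported group simply follows the order in
which the branches were written.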

I do agree with you that if you *don't* do ordered alternation, then
it is right to treat alternation as greedy by default.
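
For contrast, the non-ordered, greedy treatment can be approximated by
trying every branch and keeping the longest result.  This is only a
sketch under stated assumptions: longest_alternative is a hypothetical
helper, it handles just a top-level alternation anchored at the start
of the string, and it is not how a POSIX engine is actually
implemented.

```python
import re


def longest_alternative(branches, text):
    """Return the longest match among the branches, tried at the start of text.

    Hypothetical helper: each branch is matched on its own with Python's
    re, so nested alternations inside a branch still follow Python's
    ordered rules.  Ties keep the earlier branch.
    """
    best = None
    for branch in branches:
        m = re.match(branch, text)
        if m is not None and (best is None or len(m.group()) > len(best)):
            best = m.group()
    return best


# Same branches and subject string as in the quoted example.
greedy_result = longest_alternative([r"a+", r"a+b+"], "aaabbb")
print(greedy_result)
```

Here the second branch wins because it consumes the whole string, no
matter which order the branches are listed in.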

Cheers,
BJ

