Thread: Unnecessary use of .* in examples

Unnecessary use of .* in examples

From:
PG Doc comments form
Date:
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/13/functions-matching.html
Description:

In the table for the ~ (and friends) operator, every example has a pointless
set of '.*' surrounding the text to be matched. These unnecessary operators
add visual clutter making the examples both harder to read and understand,
and since they're official examples, they teach bad habits.

That is to say, 'thomas' ~ 'thom' is the exact same regex as 'thomas' ~
'.*thom.*' but the first is shorter, easier to read and easier to
understand, and, presumably, faster as well.

Operator                | Description                                                  | Example(s)
text ~ text → boolean   | String matches regular expression, case sensitively          | 'thomas' ~ '.*thom.*' → t
text ~* text → boolean  | String matches regular expression, case insensitively        | 'thomas' ~* '.*Thom.*' → t
text !~ text → boolean  | String does not match regular expression, case sensitively   | 'thomas' !~ '.*thomas.*' → f
text !~* text → boolean | String does not match regular expression, case insensitively | 'thomas' !~* '.*vadim.*' → t
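
The equivalence claimed in the comment above can be checked with a small sketch. This is only an illustration in Python, not PostgreSQL itself: it assumes that Python's re.search is an acceptable stand-in for the ~ operator, since both look for a match anywhere in the string unless the pattern is anchored. The helper pg_match is a hypothetical name introduced here.

```python
import re


def pg_match(string: str, pattern: str) -> bool:
    """Hypothetical stand-in for PostgreSQL's `string ~ pattern`.

    Assumption: re.search, like ~, succeeds if the pattern matches
    anywhere in the string.
    """
    return re.search(pattern, string) is not None


# The bare pattern and the '.*'-wrapped pattern give the same answer.
bare = pg_match("thomas", "thom")
wrapped = pg_match("thomas", ".*thom.*")
print(bare, wrapped)  # True True

# They also agree on a string that does not contain 'thom'.
bare_miss = pg_match("vadim", "thom")
wrapped_miss = pg_match("vadim", ".*thom.*")
print(bare_miss, wrapped_miss)  # False False
```

The sketch only compares match results; it says nothing about relative speed in PostgreSQL.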

Re: Unnecessary use of .* in examples

From:
Laurenz Albe
Date:
On Mon, 2021-02-01 at 05:46 +0000, PG Doc comments form wrote:
> Page: https://www.postgresql.org/docs/13/functions-matching.html
> Description:
> 
> In the table for the ~ (and friends) operator, every example has a pointless
> set of '.*' surrounding the text to be matched. These unnecessary operators
> add visual clutter making the examples both harder to read and understand,
> and since they're official examples, they teach bad habits.
> 
> That is to say, 'thomas' ~ 'thom' is the exact same regex as 'thomas' ~
> '.*thom.*' but the first is shorter, easier to read and easier to
> understand, and, presumably, faster as well.
> 
> Operator                | Description                                                  | Example(s)
> text ~ text → boolean   | String matches regular expression, case sensitively          | 'thomas' ~ '.*thom.*' → t
> text ~* text → boolean  | String matches regular expression, case insensitively        | 'thomas' ~* '.*Thom.*' → t
> text !~ text → boolean  | String does not match regular expression, case sensitively   | 'thomas' !~ '.*thomas.*' → f
> text !~* text → boolean | String does not match regular expression, case insensitively | 'thomas' !~* '.*vadim.*' → t

I agree that that is somewhat confusing for people who understand
regular expressions.  On the other hand, the example should show some
special characters, so that people who don't know regular expressions
understand that this is more than substring matching.

Perhaps 'thomas' ~ '^thom' and so on?
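
To illustrate why an anchor such as ^ shows that this is more than substring matching, here is a minimal sketch. It is written in Python on the assumption that re.search behaves like ~ for these simple patterns; pg_match is a hypothetical helper, not a PostgreSQL function.

```python
import re


def pg_match(string: str, pattern: str) -> bool:
    """Hypothetical stand-in for PostgreSQL's `string ~ pattern`."""
    return re.search(pattern, string) is not None


# '^thom' matches because 'thomas' starts with 'thom'.
anchored_prefix = pg_match("thomas", "^thom")

# 'mas' is a substring of 'thomas', so the unanchored pattern matches...
unanchored_tail = pg_match("thomas", "mas")

# ...but '^mas' does not, because 'thomas' does not start with 'mas'.
anchored_tail = pg_match("thomas", "^mas")

print(anchored_prefix, unanchored_tail, anchored_tail)  # True True False
```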

Yours,
Laurenz Albe




Re: Unnecessary use of .* in examples

From:
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Mon, 2021-02-01 at 05:46 +0000, PG Doc comments form wrote:
>> In the table for the ~ (and friends) operator, every example has a pointless
>> set of '.*' surrounding the text to be matched. These unnecessary operators
>> add visual clutter making the examples both harder to read and understand,
>> and since they're official examples, they teach bad habits.

> I agree that that is somewhat confusing for people who understand
> regular expressions.  On the other hand, the example should show some
> special characters, so that people who don't know regular expressions
> understand that this is more than substring matching.

> Perhaps 'thomas' ~ '^thom' and so on?

There are examples just a bit further down that include special
characters.  I agree with the OP that the useless ".*"s add nothing
except confusion as to the semantics; but I don't think we need these
very first examples to use a lot of bells and whistles.

Maybe what would be better is to have an example with embedded .*
such as 'thomas' ~ 't.*m'.
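
As a rough illustration of an embedded .*, the sketch below uses Python's re module as a stand-in for the ~ operator (an assumption; PostgreSQL's regex engine is a different implementation, but it should agree on a pattern this simple). Here .* has to bridge the characters between 't' and 'm', so it does real work instead of padding the pattern.

```python
import re

# 't.*m' needs a 't', then any run of characters, then an 'm'.
match = re.search("t.*m", "thomas")
matched_text = match.group() if match else None
print(matched_text)  # thom

# Without '.*' the pattern 'tm' is a plain substring test and fails,
# because 't' and 'm' are not adjacent in 'thomas'.
plain = re.search("tm", "thomas")
print(plain is None)  # True
```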

            regards, tom lane



Re: Unnecessary use of .* in examples

From:
Laurenz Albe
Date:
On Mon, 2021-02-01 at 10:08 -0500, Tom Lane wrote:
> > On Mon, 2021-02-01 at 05:46 +0000, PG Doc comments form wrote:
> > > In the table for the ~ (and friends) operator, every example has a pointless
> > > set of '.*' surrounding the text to be matched. These unnecessary operators
> > > add visual clutter making the examples both harder to read and understand,
> > > and since they're official examples, they teach bad habits.
> 
> > I agree that that is somewhat confusing for people who understand
> > regular expressions.  On the other hand, the example should show some
> > special characters, so that people who don't know regular expressions
> > understand that this is more than substring matching.
> 
> > Perhaps 'thomas' ~ '^thom' and so on?
> 
> There are examples just a bit further down that include special
> characters.  I agree with the OP that the useless ".*"s add nothing
> except confusion as to the semantics; but I don't think we need these
> very first examples to use a lot of bells and whistles.
> 
> Maybe what would be better is to have an example with embedded .*
> such as 'thomas' ~ 't.*m'.

I agree, and thanks for improving it in
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9522085ac917af66dba29939af328ae67300f10a

Yours,
Laurenz Albe