Обсуждение: BUG #5257: wrong results of SUBSTRING with SQL regular expressions
The following bug has been logged online:
Bug reference: 5257
Logged by: Roman Kononov
Email address: kononov@ftml.net
PostgreSQL version: 8.4.2
Operating system: GNU/Linux x86_64
Description: wrong results of SUBSTRING with SQL regular expressions
Details:
test=# select substring('34' from '(2|3)#"4#"' for '#');
substring
-----------
3
(1 row)
test=# select substring('^' from '#"^#"' for '#');
substring
-----------
(1 row)
test=# select substring('$' from '#"$#"' for '#');
substring
-----------
(1 row)
These look wrong according to the PG documentation.
"Roman Kononov" <kononov@ftml.net> writes:
> test=# select substring('34' from '(2|3)#"4#"' for '#');
> substring
> -----------
> 3
> (1 row)
Hmm. I guess we need to translate ( and ) to non-capturing parens.
> test=# select substring('^' from '#"^#"' for '#');
> substring
> -----------
>
> (1 row)
> test=# select substring('$' from '#"$#"' for '#');
> substring
> -----------
>
> (1 row)
These cases are already fixed in HEAD.
http://archives.postgresql.org/pgsql-committers/2009-10/msg00048.php
regards, tom lane
I don't know what SQL2008 says about SUBSTRING, but SQL2003 says in
ISO/IEC 9075-2:2003 (E), 6.29 <string value function>, General Rules,
5), page 261:
d) If R [the regular expression] does not contain exactly two
occurrences of the two-character sequence consisting of E [the escape
character], each immediately followed by <double quote>, then an
exception condition is raised: data exception - invalid use of escape
character.
This means that the following is wrong:
test-std=# select substring('a' from 'a' for '#');
substring
-----------
a
(1 row)