Re: Scadinavian characters in regular expressions

From: Tom Lane
Subject: Re: Scadinavian characters in regular expressions
Msg-id: 28561.1018359235@sss.pgh.pa.us
In reply to: Re: Scadinavian characters in regular expressions  (Søren Vainio <sva@Netpointers.com>)
List: pgsql-sql
Søren Vainio <sva@Netpointers.com> writes:
> Using \s does produce FALSE for SELECT 'oneå two three' ~
> '^[^\s]+[\s][^\s]+$';
> But it also produces FALSE for any two-word string ex:
> SELECT 'one two' ~ '^[^\s]+[\s][^\s]+$'; where I would expect TRUE???
> (I am using PostgreSQL 7.1.3)

I do not believe that Postgres' regular expression engine recognizes \s
as meaning anything except "s".  See
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/functions-matching.html

In the above, it's even worse: the backslashes were eaten by the
string-literal parser, so what arrived at the RE engine was just
^[^s]+[s][^s]+$ ... not likely to produce what you wanted.
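
One way to sidestep both problems, sketched here under the assumption that your version's RE engine supports the POSIX bracket-expression class [[:space:]] (check the page linked above for your release), is to write the pattern without any backslashes at all, so the string-literal parser has nothing to strip:

```sql
-- Assumption: the RE engine accepts POSIX character classes such as [:space:].
-- The pattern contains no backslashes, so it reaches the RE engine unchanged.

-- Exactly two whitespace-separated words: expected to match.
SELECT 'one two' ~ '^[^[:space:]]+[[:space:]][^[:space:]]+$';

-- Three words: expected not to match, because of the trailing " three".
SELECT 'one two three' ~ '^[^[:space:]]+[[:space:]][^[:space:]]+$';
```

If the two-word case matches here but the original \s version does not, that points at the backslash handling rather than at the data.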

As for the original issue, I wonder whether you are storing the string
as UTF-8 or Latin1 encoding.  I have a suspicion that the å (a-ring)
is actually a multibyte sequence inside the database
and for some reason Postgres isn't configured to recognize it as a
single logical character.
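
A rough way to check that suspicion, assuming length() and octet_length() are both available in your installation, is to compare the character count with the byte count for the letter in question. This is only a diagnostic sketch, not a fix:

```sql
-- length() counts characters as the server understands them;
-- octet_length() counts the bytes actually stored.
SELECT length('å') AS chars, octet_length('å') AS bytes;
```

If both come back as 1, the character is stored as a single byte (for example Latin1). If chars is 1 and bytes is 2, it is a multibyte character that the server recognizes as one. If both come back as 2, the two-byte sequence is being treated as two separate characters, which is the situation described above.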
        regards, tom lane

