Re: Notes about fixing regexes and UTF-8 (yet again)

From Tom Lane
Subject Re: Notes about fixing regexes and UTF-8 (yet again)
Msg-id 13378.1329626515@sss.pgh.pa.us
In response to Re: Notes about fixing regexes and UTF-8 (yet again)  (Vik Reykja <vikreykja@gmail.com>)
List pgsql-hackers
Vik Reykja <vikreykja@gmail.com> writes:
> On Sun, Feb 19, 2012 at 05:03, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sat, Feb 18, 2012 at 10:38 PM, Vik Reykja <vikreykja@gmail.com> wrote:
>>> Does it make sense for regexps to have collations?

>> As I understand it, collations determine the sort-ordering of strings.
>> Regular expressions don't care about that.  Why do you ask?

> Perhaps I used the wrong term, but I was thinking the locale could tell us
> what alphabet we're dealing with. So a regexp using en_US would give
> different word-boundary results from one using zh_CN.

Our interpretation of a "collation" is that it sets both LC_COLLATE and
LC_CTYPE.  Regexps may not care about the first but they definitely care
about the second.  This is why the stuff in regc_pg_locale.c pays
attention to collation.
        regards, tom lane
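
For illustration only: a minimal C sketch of why LC_CTYPE matters for
character classification, which is what a regexp bracket class such as
[[:alpha:]] depends on. This is not code from regc_pg_locale.c. The helper
name is_alpha_in_locale is hypothetical, and it uses the plain single-byte
isalpha() from the C library rather than anything PostgreSQL-specific.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper: report whether `byte` is alphabetic under the
 * LC_CTYPE setting named by `locale_name`.
 * Returns 1 if alphabetic, 0 if not, -1 if the locale cannot be used.
 * The previous LC_CTYPE setting is restored before returning.
 */
static int is_alpha_in_locale(const char *locale_name, unsigned char byte)
{
    char saved[256];
    const char *current;
    int result;

    assert(locale_name != NULL);

    /* Copy the current setting; later setlocale() calls may overwrite it. */
    current = setlocale(LC_CTYPE, NULL);
    if (current == NULL || strlen(current) >= sizeof(saved))
        return -1;
    strcpy(saved, current);

    /* Switch only the character-classification category. */
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    result = isalpha(byte) != 0;

    setlocale(LC_CTYPE, saved);
    return result;
}
```

In the "C" locale, is_alpha_in_locale("C", 0xE9) returns 0, because only the
ASCII letters count as alphabetic there. Under a Latin-1 locale where 0xE9 is
e-acute, the same call would typically return 1, provided such a locale is
installed; locale names (for example "fr_FR.ISO-8859-1") vary by platform, so
treat that name as an assumption. Sort order (LC_COLLATE) plays no part in
either answer.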

