BUG #5743: Regexp engine fails to case-insensitively match multi-byte codepoints

From: Vlad Romascanu
Subject: BUG #5743: Regexp engine fails to case-insensitively match multi-byte codepoints
Msg-id: 201011040048.oA40md61095262@wwwmaster.postgresql.org
Responses: Re: BUG #5743: Regexp engine fails to case-insensitively match multi-byte codepoints  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs

The following bug has been logged online:

Bug reference:      5743
Logged by:          Vlad Romascanu
Email address:      vromascanu@accurev.com
PostgreSQL version: 8.4.3
Operating system:   Windows, Linux
Description:        Regexp engine fails to case-insensitively match
multi-byte codepoints
Details:

Already reported in 2006 but seems to have fallen through the cracks (I can
find no followup.)  Problem still exists in v8.4.3.

The problem still appears to be that pg_wc_tolower downcasts to char before
calling tolower() (instead of calling towlower()).
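
A minimal sketch of the difference, for illustration only. This is not the
PostgreSQL source: fold_via_char, fold_via_wchar and fold_demo are
hypothetical names, and the narrowing logic is an assumption based on the
description above. Code points that do not fit in a char are returned
unchanged by the narrow path, so they can never match their other-case form.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <wctype.h>

/* Hypothetical narrowing fold: only code points that fit in an unsigned
 * char reach tolower(); anything larger is returned unchanged. */
static unsigned int fold_via_char(unsigned int c)
{
    if (c <= UCHAR_MAX)
        return (unsigned int) tolower((int) (unsigned char) c);
    return c;
}

/* Hypothetical wide fold: towlower() consults the current LC_CTYPE locale,
 * so the result for non-ASCII input depends on the locale in effect. */
static unsigned int fold_via_wchar(unsigned int c)
{
    return (unsigned int) towlower((wint_t) c);
}

/* Self-check using U+0416 (CYRILLIC CAPITAL LETTER ZHE) as sample input. */
static void fold_demo(void)
{
    /* ASCII is folded by both paths. */
    assert(fold_via_char('A') == 'a');
    assert(fold_via_wchar('A') == 'a');
    /* Above UCHAR_MAX the narrow path is an identity function. */
    assert(fold_via_char(0x0416u) == 0x0416u);
}
```

Whether fold_via_wchar maps U+0416 to U+0436 depends on the LC_CTYPE locale
selected with setlocale(), so the sketch does not assert it.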

This is one of several inconsistencies unfortunately still present in
case-insensitive regexp vs. LOWER(str) [str_lower] treatment (including char
to wchar conversion using MultiByteToWideChar/mbstowcs vs. char2wchar, or
towlower vs. pg_wc_tolower).
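
As a rough illustration of the mbstowcs/towlower style of lowercasing
mentioned above, the sketch below converts a multibyte string to wide
characters and folds each one. It is an assumption-laden example, not
str_lower itself: lower_to_wide is a hypothetical helper, and both the
conversion and the folding depend on the current LC_CTYPE locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: convert src to wide characters with mbstowcs() and
 * lowercase the result in place with towlower(). Returns the number of wide
 * characters written (excluding the terminator), or (size_t) -1 if src is
 * not valid in the current locale or dst is too small. */
static size_t lower_to_wide(const char *src, wchar_t *dst, size_t dst_len)
{
    size_t n;
    size_t i;

    assert(src != NULL && dst != NULL);

    n = mbstowcs(dst, src, dst_len);
    /* n == dst_len means the output was truncated without a terminator. */
    if (n == (size_t) -1 || n >= dst_len)
        return (size_t) -1;

    for (i = 0; i < n; i++)
        dst[i] = (wchar_t) towlower((wint_t) dst[i]);

    return n;
}
```

A caller would normally select a locale with setlocale(LC_CTYPE, ...) first;
in the default "C" locale only ASCII input is reliably converted and folded.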

Current workaround is to use LOWER(str) ~ LOWER('regexp').
