Discussion: BUG #12542: Incorrect behaviour of lower and upper on accented vocals in UTF8

BUG #12542: Incorrect behaviour of lower and upper on accented vocals in UTF8

From
orsini@unive.it
Date:
The following bug has been logged on the website:

Bug reference:      12542
Logged by:          Renzo Orsini
Email address:      orsini@unive.it
PostgreSQL version: 9.3.5
Operating system:   Mac OS X
Description:

When lower and upper are applied to UTF8 strings with accented letters, they
have an incorrect behaviour, for instance, upper('Autorità') returns
'AUTORITà' and not 'AUTORITÀ' as it should. Similarly, lower('AUTORITÀ')
returns 'autoritÀ'.

Re: BUG #12542: Incorrect behaviour of lower and upper on accented vocals in UTF8

From
Tom Lane
Date:
orsini@unive.it writes:
> The following bug has been logged on the website:
> Bug reference:      12542
> Logged by:          Renzo Orsini
> Email address:      orsini@unive.it
> PostgreSQL version: 9.3.5
> Operating system:   Mac OS X
> Description:

> When lower and upper are applied to UTF8 strings with accented letters, they
> have an incorrect behaviour, for instance, upper('Autorità') returns
> 'AUTORITà' and not 'AUTORITÀ' as it should. Similarly, lower('AUTORITÀ')
> returns 'autoritÀ'.

Yeah, unfortunately, this is a bug in Mac OS X itself: the UTF8 locales
don't really work right.  You might have better luck if you can adopt an
ISO8859 encoding.

There has been some discussion of working around OS X's deficiencies
in this area, but it's a significant bit of work and hasn't been
done yet.

            regards, tom lane
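
The symptom in the report can be illustrated outside the database. The following is only a minimal Python sketch, not PostgreSQL's or OS X's actual code. It assumes the faulty UTF8 locale behaves as if only the ASCII letters had case mappings, which matches the output in the report but is not confirmed in this thread. The helper names ascii_only_upper and ascii_only_lower are hypothetical.

```python
import string


def ascii_only_upper(text: str) -> str:
    """Uppercase a-z only; every non-ASCII character is left untouched.

    Assumption: this mimics a locale whose case tables cover ASCII only.
    """
    return "".join(
        ch.upper() if ch in string.ascii_lowercase else ch for ch in text
    )


def ascii_only_lower(text: str) -> str:
    """Lowercase A-Z only; every non-ASCII character is left untouched."""
    return "".join(
        ch.lower() if ch in string.ascii_uppercase else ch for ch in text
    )


word = "Autorit\u00e0"          # 'Autorità'
shouted = "AUTORIT\u00c0"       # 'AUTORITÀ'

# Unicode-aware mapping: the result the reporter expected.
expected_upper = word.upper()
expected_lower = shouted.lower()

# ASCII-only mapping: reproduces the strings quoted in the report.
reported_upper = ascii_only_upper(word)
reported_lower = ascii_only_lower(shouted)

print(expected_upper, reported_upper)
print(expected_lower, reported_lower)
```

The first pair shows the accented letter converted by the Unicode-aware mapping and left unchanged by the ASCII-only one, which is the difference described in the report.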