Re: Collate order on Mac OS X, text with diacritics in UTF-8

From Martijn van Oosterhout
Subject Re: Collate order on Mac OS X, text with diacritics in UTF-8
Msg-id 20100113220218.GB23892@svana.org
In reply to Re: Collate order on Mac OS X, text with diacritics in UTF-8  (Martin Flahault <martin@billjobs.com>)
Responses Re: Collate order on Mac OS X, text with diacritics in UTF-8  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-general
On Wed, Jan 13, 2010 at 04:15:06PM +0100, Martin Flahault wrote:

[postgres]
> newbase=# select * from t1 order by contenu;
>  contenu
> ---------
>  A
>  E
>  a
>  e

PostgreSQL sorts text with the collation routines of the C library on the
underlying system, so the order you get is whatever that library
produces. The quality of this varies wildly between platforms.
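To see what the C library itself does with a given locale, independent of PostgreSQL, something like the following rough sketch can help. It uses Python's standard `locale` module, which calls the platform's own collation functions. The helper name `sort_with_c_library` and the locale name `fr_FR.UTF-8` are only illustrative assumptions; the locale may not exist on every system, in which case the helper returns `None`.

```python
import locale


def sort_with_c_library(words, locale_name):
    """Sort words with the C library's collation for locale_name.

    Hypothetical helper for illustration. Returns None when the
    platform does not provide the requested locale.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm maps each string to a key whose ordering matches strcoll
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore the old setting


if __name__ == "__main__":
    words = ["e", "\u00e0", "A", "a", "E"]
    # The result depends entirely on the platform's locale data.
    print(sort_with_c_library(words, "C"))
    print(sort_with_c_library(words, "fr_FR.UTF-8"))
```

Running this on different systems and comparing the second line of output is a quick way to check whether a surprising order comes from the operating system rather than from the database.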
>  à
> As with other DBMSs (MySQL for example), diacritics should be ignored when determining the sort order. Here is the expected output:

MySQL implements the Unicode Collation Algorithm itself, which means it
essentially does what you want.
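As a minimal sketch of what "ignoring diacritics" means for sorting, the snippet below builds a sort key from the base letters first and only uses the original string to break ties. This is not the Unicode Collation Algorithm and not what MySQL actually runs; it is a crude approximation using Python's standard `unicodedata` module, and the function names `strip_diacritics` and `collation_key` are made up for the example.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks (accents) removed."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collation_key(text):
    """Compare base letters case-insensitively; break ties by code point.

    A rough stand-in for diacritic-insensitive ordering, not real UCA.
    """
    return (strip_diacritics(text).casefold(), text)


words = ["A", "E", "a", "e", "\u00e0"]  # "\u00e0" is a with grave accent
print(sorted(words))                     # plain code point order
print(sorted(words, key=collation_key))  # accented a sorts next to a and A
```

With the second key, the accented letter is grouped with the other forms of "a" instead of landing after every unaccented letter, which is the behaviour the original poster was asking for. The tie-breaking order within a group here is arbitrary and will differ from a real collation.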
>
> It seems there is a problem with the collating order on BSD systems with diacritics using UTF8.

Last I checked, BSD did not support useful sorting on UTF-8 at all, so
it's not surprising that it doesn't work.

> in a UTF8 text file and use the "sort" command on it, you will have the same wrong output as with PostgreSQL :

Yes, that's the basic idea. Mac OS X apparently provides ICU underneath
for programs that want true Unicode collation, but there is
little chance that PostgreSQL will ever use this.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
