Re: Vague idea for allowing per-column locale


From	Tim Allen
Subject	Re: Vague idea for allowing per-column locale
Date	14 August 2001, 01:36:36
Msg-id	Pine.LNX.4.21.0108141214570.22874-100000@bee.proximity.com.au
In reply to	Re: Vague idea for allowing per-column locale (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses	Re: Vague idea for allowing per-column locale (Tatsuo Ishii <t-ishii@sra.co.jp>)
List	pgsql-hackers

On Tue, 14 Aug 2001, Tatsuo Ishii wrote:

> Storing everything as Unicode is not a good idea, actually. First,
> Unicode tends to consume more storage space than other character
> sets. For example, UTF-8, one of the most commonly used encodings for
> Unicode, consumes 3 bytes for Japanese characters, while SJIS only
> consumes 2 bytes. Second, a round trip conversion between Unicode and
> other character sets is not always possible. Third, sorting
> issue. There is no convenient way to sort Unicode correctly.

UTF-16 can handle most Japanese characters in two bytes, afaict. Generally
it seems that UTF-8 encodes European text more efficiently on average,
whereas UTF-16 is better for most Asian languages. I may be mistaken, but I
was under the impression that sorting of Unicode characters was a solved
problem. The IBM ICU class library (which does have a C interface), for
example, claims to provide everything you need to sort Unicode text in
various locales, and uses UTF-16 internally:

http://oss.software.ibm.com/developerworks/opensource/icu/project/index.html
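
To make the storage comparison concrete, here is a minimal sketch that
counts encoded bytes for a short Japanese string and a short ASCII string.
It assumes Python 3 and its built-in "utf-8", "utf-16-le" and "shift_jis"
codecs; the helper name encoded_sizes and the sample strings are
illustrative only and are not taken from the thread.

```python
# Compare how many bytes the same text occupies in different encodings.
# "utf-16-le" is used so that no byte-order mark is added to the count.

def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes for each encoding of interest."""
    encodings = ("utf-8", "utf-16-le", "shift_jis")
    return {name: len(text.encode(name)) for name in encodings}


japanese = "日本語"   # three kanji
english = "hello"     # five ASCII letters

for label, sample in (("japanese", japanese), ("english", english)):
    print(label, encoded_sizes(sample))
```

For these samples the Japanese text takes 3 bytes per character in UTF-8
and 2 in both UTF-16 and Shift_JIS, while the ASCII text takes 1 byte per
character in UTF-8 and 2 in UTF-16.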

The licence is, I gather, the X licence, which presumably is compatible
enough with BSD; not that I would necessarily advocate building this into
postgres at a fundamental level, but it demonstrates that it can be done.
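
As a rough illustration of why sorting Unicode text needs more than a
byte or code point comparison, the sketch below contrasts a plain code
point sort with a crude accent-stripping sort key. This is not ICU and not
a real collation algorithm; the helper crude_sort_key and the sample words
are hypothetical, and a proper locale-aware collator handles far more
cases than this.

```python
import unicodedata


def crude_sort_key(text: str) -> str:
    """Crude stand-in for a collation key: drop combining marks, fold case.

    Assumption for illustration only; real collation is locale-specific.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


words = ["zebra", "Äpfel", "apple"]

# Plain sort compares code points, so "Ä" (U+00C4) lands after "z" (U+007A).
by_code_point = sorted(words)

# With the crude key, "Äpfel" is compared as "apfel" and sorts among the a's.
by_crude_key = sorted(words, key=crude_sort_key)

print(by_code_point)
print(by_crude_key)
```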

Note that I'm not speaking from experience here. I've just read the docs
and a book on Unicode, and have never actually performed a Japanese-language
(or any other non-English language) sort, so no need to take me too
seriously :).

> Tatsuo Ishii

Tim

-- 
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/ http://www4.tpg.com.au/users/rita_tim/
