Re: Unicode + LC_COLLATE

From	Peter Eisentraut
Subject	Re: Unicode + LC_COLLATE
Date	22 April 2004, 10:41:34
Msg-id	200404221539.05444.peter_e@gmx.net
In reply to	Unicode + LC_COLLATE ("John Sidney-Woollett" <johnsw@wardbrook.com>)
Responses	Re: Unicode + LC_COLLATE
List	pgsql-general

On Thursday, 22 April 2004 13:17, John Sidney-Woollett wrote:
> Does anyone know what the effect of --lc-collate=C --encoding=UNICODE will
> be for sorts (and indexes?) when a multibyte unicode character is
> encountered?

You get your strings sorted in binary order of the UTF-8 encoding, which is
probably not very interesting, but it's possible.
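
As a rough illustration of what "binary order of the UTF-8 encoding" means, here is a minimal Python sketch that sorts a few strings by their encoded bytes. The sample words and the helper name `c_locale_sort` are made up for this example; it only imitates a byte-wise comparison and does not call PostgreSQL.

```python
# Sketch: imitate a byte-wise (C locale style) sort by comparing UTF-8 bytes.
# The word list and function name are hypothetical, chosen for illustration.
words = ["zebra", "Zebra", "apple", "Äpfel", "éclair", "日本"]


def c_locale_sort(strings):
    """Sort strings by the bytes of their UTF-8 encoding."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


byte_order = c_locale_sort(words)

# Python compares str values by code point; UTF-8 preserves that order.
code_point_order = sorted(words)

for w in byte_order:
    # Show each word next to the bytes that decide its position.
    print(w, w.encode("utf-8").hex(" "))
```

Uppercase ASCII sorts before lowercase, and every non-ASCII character sorts after all ASCII ones, regardless of what a dictionary would say. That is the sense in which the order is valid but rarely what users expect.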

> Is it also true that if LC_COLLATE != 'C' that indexes cannot be used for
> LIKE comparisons (and is this also true for en_US.iso885915)?

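No.  An index created with one of the pattern operator classes (for example
text_pattern_ops) can be used for LIKE in a non-C locale; see
<http://www.postgresql.org/docs/7.4/static/indexes-opclass.html>.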

> Our database is UNICODE with LC_COLLATE=en_US.iso885915. Does anyone know
> what the effect of someone storing a cyrillic/chinese or korean character
> is?

This setup will result in UTF-8 characters being sorted by the system thinking
they are actually ISO-8859-15 characters.  So the result will be random at
best.
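
To make the mismatch concrete, the following Python sketch shows what the UTF-8 bytes of a few characters look like when they are read as ISO-8859-15. The helper `misread_as_latin9` is a hypothetical name for this example. It shows only the misinterpretation of the bytes; the actual sort order then depends on how the platform's locale implementation collates those misread characters.

```python
# Sketch: view UTF-8 encoded text the way an ISO-8859-15 locale would see it.
# The helper name is hypothetical; nothing here talks to a database.
def misread_as_latin9(text):
    """Decode the UTF-8 bytes of `text` as if they were ISO-8859-15."""
    return text.encode("utf-8").decode("iso-8859-15")


for s in ["abc", "é", "я", "中"]:
    # Each non-ASCII character turns into two or three unrelated characters.
    print(repr(s), "->", repr(misread_as_latin9(s)))
```

Plain ASCII survives unchanged, which is why such a setup can appear to work until the first accented, Cyrillic, or CJK character is stored.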

> (We are using JDBC with a webapp so all the unicode concerns are
> handled transparently, apparently). When the data is extracted from the DB
> will it render correctly in the browser provided we send all responses
> encoded in UTF-8?

If your database is in UNICODE and you're using JDBC then you should be all
set as far as PostgreSQL is concerned.  Of course, your HTML pages need to
declare the encoding correctly as well.
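
One way to keep the declared encoding and the actual bytes in agreement is sketched below in Python. The function `build_response` and the sample text are assumptions for illustration only; a real webapp would set the same header through its own framework, and this sketch does not escape HTML special characters.

```python
# Sketch: produce an HTML page whose bytes, HTTP header, and meta tag all
# agree on UTF-8. Function name and sample text are hypothetical.
def build_response(body_text):
    """Return (headers, body_bytes) for an HTML page encoded as UTF-8."""
    html = (
        "<!DOCTYPE html>\n"
        '<html><head><meta http-equiv="Content-Type" '
        'content="text/html; charset=UTF-8"></head>\n'
        f"<body>{body_text}</body></html>\n"
    )
    body = html.encode("utf-8")  # the bytes really are UTF-8
    headers = {
        "Content-Type": "text/html; charset=UTF-8",  # and are declared as such
        "Content-Length": str(len(body)),  # length in bytes, not characters
    }
    return headers, body


headers, body = build_response("Привет, 世界")
```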
