Re: Unicode + LC_COLLATE

From: Peter Eisentraut
Subject: Re: Unicode + LC_COLLATE
Msg-id: 200404221539.05444.peter_e@gmx.net
In reply to: Unicode + LC_COLLATE  ("John Sidney-Woollett" <johnsw@wardbrook.com>)
Responses: Re: Unicode + LC_COLLATE  ("John Sidney-Woollett" <johnsw@wardbrook.com>)
List: pgsql-general

On Thursday, 22 April 2004 13:17, John Sidney-Woollett wrote:
> Does anyone know what the effect of --lc-collate=C --encoding=UNICODE will
> be for sorts (and indexes?) when a multibyte unicode character is
> encountered?

Your strings get sorted in the binary order of their UTF-8 encoding (that is,
by code point), which is probably not very useful, but it works.
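
To make "binary order of the UTF-8 encoding" concrete, here is a small Python
sketch. It is not PostgreSQL code, and the sample words are made up purely for
illustration. It sorts strings by their raw UTF-8 bytes, which is roughly what
a byte-wise C-locale comparison amounts to:

```python
# Sample words are hypothetical, chosen only to illustrate the ordering.
words = ["zebra", "Apple", "éclair", "apple", "Ärger", "яблоко", "中文"]

def utf8_binary_key(s: str) -> bytes:
    """Sort key: the raw UTF-8 bytes, compared byte by byte (unsigned)."""
    return s.encode("utf-8")

by_bytes = sorted(words, key=utf8_binary_key)
by_code_point = sorted(words)  # Python compares str values by code point

print(by_bytes)
print(by_bytes == by_code_point)
```

Every ASCII word, "zebra" included, lands before "Ärger" and "éclair", and
upper case sorts before lower case. The two orderings agree because UTF-8 byte
order preserves code point order.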

> Is it also true that if LC_COLLATE != 'C' that indexes cannot be used for
> LIKE comparisons (and is this also true for en_US.iso885915)?

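No.  In a non-C locale you can still get index support for LIKE by creating the
index with one of the pattern operator classes (for example text_pattern_ops);
see <http://www.postgresql.org/docs/7.4/static/indexes-opclass.html>.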

> Our database is UNICODE with LC_COLLATE=en_US.iso885915. Does anyone know
> what the effect of someone storing a cyrillic/chinese or korean character
> is?

With this setup, the system sorts UTF-8 byte sequences as if each byte were an
ISO-8859-15 character.  So the resulting order will be arbitrary at best.
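
A rough way to see the mismatch, as a Python sketch rather than anything
PostgreSQL itself runs: take the UTF-8 bytes of a character and decode them as
ISO-8859-15, which approximates what the locale believes it is comparing. The
helper name and the sample characters are arbitrary choices for illustration.

```python
def as_seen_by_latin9(text: str) -> str:
    """Reinterpret the UTF-8 bytes of `text` as ISO-8859-15 characters."""
    return text.encode("utf-8").decode("iso8859_15")

# One Latin, one Cyrillic, and one Chinese character (arbitrary samples).
for ch in ["é", "я", "中"]:
    misread = as_seen_by_latin9(ch)
    print(repr(ch), "->", len(misread), "characters:", repr(misread))
```

Each multibyte character turns into two or three unrelated Latin-9 characters,
so whatever collation rules the locale applies are applied to the wrong things.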

> (We are using JDBC with a webapp so all the unicode concerns are
> handled transparently, apparently). When the data is extracted from the DB
> will it render correctly in the browser provided we send all responses
> encoded in UTF-8?

If your database is in UNICODE and you're using JDBC then you should be all
set as far as PostgreSQL is concerned.  Of course, your HTML pages need to
declare the encoding correctly as well.
