Re: [WIP] collation support revisited (phase 1)

From: Zdenek Kotala
Subject: Re: [WIP] collation support revisited (phase 1)
Msg-id: 4885EF87.4020608@sun.com
In reply to: Re: [WIP] collation support revisited (phase 1)  (Martijn van Oosterhout <kleptog@svana.org>)
Responses: Re: [WIP] collation support revisited (phase 1)  (Martijn van Oosterhout <kleptog@svana.org>)
List: pgsql-hackers
Martijn van Oosterhout wrote:
> On Sat, Jul 12, 2008 at 10:02:24AM +0200, Zdenek Kotala wrote:
>> Background:
>> We specify the encoding in the initdb phase. ANSI specifies repertoire, charset,
>> encoding and collation. If I understand it correctly, a charset is a
>> subset of a repertoire and specifies the list of allowed characters for a
>> language->collation. An encoding is a mapping of a character set to a binary format.
>> For example, for the Czech alphabet (charset) we have 6 different encodings for
>> 8bit ASCII, but on the other side, multiple charsets are specified for UTF8.
> 
> Oh, so you're thinking of a charset as a sort of check constraint. If
> your locale is Turkish and you have a column marked charset ASCII then
> storing lower('HI') results in an error.

Yes. If you use the strcoll() function, it may fail (setting errno to EINVAL)
when a string contains characters outside the domain of the collating sequence.
See
http://www.opengroup.org/onlinepubs/009695399/functions/strcoll.html
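
To illustrate the "charset as a check constraint" reading, here is a minimal sketch in Python. It is not PostgreSQL code; check_repertoire is a hypothetical helper, and Python codec names stand in for charsets. Because Python's str.lower() is not locale-aware, the Turkish result of lower('HI') is written out by hand.

```python
def check_repertoire(value: str, encoding: str = "ascii") -> str:
    """Return value unchanged if every character can be encoded in
    `encoding`; raise ValueError for the first character that cannot."""
    try:
        value.encode(encoding)
    except UnicodeEncodeError as exc:
        bad = value[exc.start:exc.end]
        raise ValueError(
            f"character {bad!r} is outside charset {encoding}"
        ) from exc
    return value


# In a Turkish locale, lowercasing 'I' yields dotless 'ı' (U+0131).
# This is an assumption encoded by hand, not computed from the locale.
turkish_lower_hi = "h\u0131"

if __name__ == "__main__":
    print(check_repertoire("hi"))  # accepted: plain ASCII
    try:
        check_repertoire(turkish_lower_hi)
    except ValueError as err:
        print("rejected:", err)
```

The same value passes when the column's charset is one that contains the character, for example check_repertoire(turkish_lower_hi, "iso8859_9").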

> A collation must be defined over all possible characters, it can't
> depend on the character set. That doesn't mean sorting in en_US must do
> something meaningful with Japanese characters, it does mean it can't
> throw an error (the usual procedure is to sort on Unicode code point).

A collation cannot be defined over arbitrary characters. There is no ordering
relation between Latin and Chinese characters. A collation only makes sense when
you are able to define the <, = and > operators.

If you need to compare Japanese and Latin characters, then ANSI specifies a
default collation for each repertoire. I think it is usually a bitwise comparison.
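
As a rough sketch of what a bitwise default collation gives you, the Python snippet below sorts a mixed Latin/Japanese list by its raw UTF-8 bytes. The word list and the bitwise_key helper are made up for illustration. For UTF-8, byte order coincides with Unicode code point order, so the result is a total order across scripts, but not a linguistically meaningful one.

```python
def bitwise_key(text: str) -> bytes:
    """Sort key for a bitwise collation: the raw UTF-8 bytes of the string."""
    return text.encode("utf-8")


words = ["あ", "z", "č", "a", "Z", "á"]

by_bytes = sorted(words, key=bitwise_key)  # byte-wise comparison
by_code_point = sorted(words)              # Python compares str by code point

if __name__ == "__main__":
    print(by_bytes)                   # ['Z', 'a', 'z', 'á', 'č', 'あ']
    print(by_bytes == by_code_point)  # True
```

Note that 'Z' sorts before 'a' and 'á' sorts after 'z', which is not what a Czech collation would produce; the comparison never fails, it just carries no language rules.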

    Zdenek

-- 
Zdenek Kotala              Sun Microsystems
Prague, Czech Republic     http://sun.com/postgresql


