Re: [HACKERS] UNICODE characters above 0x10000

From	John Hansen
Subject	Re: [HACKERS] UNICODE characters above 0x10000
Date	7 August 2004, 10:56:20
Msg-id	5066E5A966339E42AA04BA10BA706AE5608C@rodrick.geeknet.com.au
List	pgsql-patches

4 actually,
10FFFF needs four bytes:

11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
10FFFF = 00010000 11111111 11111111

Fill in the x's, starting from the low-order bits, and you get:
11110100 10001111 10111111 10111111
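
The bit-filling step can be checked mechanically. Below is a minimal Python sketch that packs a code point into the 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx layout (and the shorter layouts) by hand. The names utf8_encode and as_bits are hypothetical helpers written for this illustration; they are not PostgreSQL functions.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand (illustrative only)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points are not valid in UTF-8")
    if cp < 0x80:                      # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                     # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                   # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def as_bits(data: bytes) -> str:
    """Render bytes as space-separated 8-bit binary groups."""
    return " ".join(f"{b:08b}" for b in data)


if __name__ == "__main__":
    encoded = utf8_encode(0x10FFFF)
    print(as_bits(encoded))            # 11110100 10001111 10111111 10111111
    # Cross-check against Python's built-in encoder.
    print(encoded == chr(0x10FFFF).encode("utf-8"))
```

Comparing against the built-in encoder is a cheap way to confirm a hand-worked bit pattern before relying on it.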

Regards,

John Hansen

-----Original Message-----
From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au]
Sent: Saturday, August 07, 2004 8:47 PM
To: Tom Lane
Cc: John Hansen; Hackers; Patches
Subject: Re: [HACKERS] UNICODE characters above 0x10000

> Now it's entirely possible that the underlying support is a few bricks
> shy of a load --- for instance I see that pg_utf_mblen thinks there
> are no UTF8 codes longer than 3 bytes whereas your code goes to 4.
> I'm not an expert on this stuff, so I don't know what the UTF8 spec
> actually says.  But I do think you are fixing the code at the wrong
> level.

Surely there are UTF-8 codes that are at least 3 bytes.  I have a
_vague_ recollection that you have to keep escaping and escaping to get
up to like 4 bytes for some Asian code points?
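
For reference, the length of a UTF-8 sequence can be read off its first byte. The sketch below assumes the ranges defined in RFC 3629, where sequences run from 1 to 4 bytes. The function name utf8_seq_len is hypothetical and this is not the pg_utf_mblen source, only an illustration of the rule.

```python
def utf8_seq_len(lead: int) -> int:
    """Return the UTF-8 sequence length implied by a lead byte (RFC 3629)."""
    if not 0 <= lead <= 0xFF:
        raise ValueError("not a byte value")
    if lead < 0x80:                # 0xxxxxxx: ASCII
        return 1
    if 0xC2 <= lead <= 0xDF:       # 110xxxxx (0xC0 and 0xC1 would be overlong)
        return 2
    if 0xE0 <= lead <= 0xEF:       # 1110xxxx
        return 3
    if 0xF0 <= lead <= 0xF4:       # 11110xxx, capped so that cp <= 0x10FFFF
        return 4
    raise ValueError("continuation byte or invalid lead byte")


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u20ac", "\U00010000", "\U0010FFFF"):
        raw = ch.encode("utf-8")
        print(f"U+{ord(ch):06X}: lead byte 0x{raw[0]:02X}, {utf8_seq_len(raw[0])} byte(s)")
```

Code points from 0x10000 upward are the ones that need the 4-byte form.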

Chris
