Re: Multibyte still broken

From Michael Robinson
Subject Re: Multibyte still broken
Date
Msg-id 200005111756.BAA10220@netrinsics.com
In reply to Re: Multibyte still broken  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>I am supprised to hear that you have so poor quality tools that
>produce illegal code sequences of Simplified Chinese. In Japan, as far
>as I know, we never have such a low quality tools which generate
>illegal Japanese charaters just because they are not accepted in the
>market, even in the case of email attachments, or cut-and-past or
>whatever.

The problem is not that the tools produce "illegal characters".  The problem
is that, as an EUC code, GB permits the coexistence of standard ASCII
characters with double-byte hanzi characters.  Furthermore, most Chinese 
software is an operating-system "hack" on top of English-language software
based on a Latin-1 character set (the Chinese software market is underserved
compared to Japan, so we have to cope as best we can).

The result is that it is possible to, for example, insert a carriage return
or ASCII comma into the middle of a hanzi, which breaks the alignment for all 
the hanzi on the rest of the line.  It's also possible, in non-native Chinese
applications, to select one byte of a hanzi character in a cut or copy 
operation.
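
As a rough illustration of the misalignment described above, here is a
minimal Python sketch. It is not taken from any of the tools under
discussion: the helper name scan_euc and the sample string are
hypothetical, and it assumes the simplified rule that a GB2312 (EUC-CN)
hanzi is a pair of bytes that are each 0xA1 or higher.

```python
def scan_euc(data: bytes):
    """Split a GB2312 (EUC-CN) byte string into (kind, raw_bytes) tokens.

    Simplified assumption: a hanzi is two consecutive bytes >= 0xA1.
    A high byte without a valid partner is reported as an "orphan".
    """
    tokens = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte (comma, carriage return, letters, ...).
            tokens.append(("ascii", data[i:i + 1]))
            i += 1
        elif b >= 0xA1 and i + 1 < len(data) and data[i + 1] >= 0xA1:
            # Lead byte followed by a valid trail byte: one hanzi.
            tokens.append(("hanzi", data[i:i + 2]))
            i += 2
        else:
            # Half of a hanzi with no matching partner.
            tokens.append(("orphan", data[i:i + 1]))
            i += 1
    return tokens


# Two hanzi, four bytes in total.
clean = "中文".encode("gb2312")

# An ASCII comma inserted between the two bytes of the first hanzi,
# as a byte-oriented Latin-1 editor would allow.
broken = clean[:1] + b"," + clean[1:]

for kind, raw in scan_euc(broken):
    print(kind, raw.hex())
```

After the inserted comma, the scanner pairs the second byte of the first
hanzi with the first byte of the second hanzi, so every following
character on the line is read one byte out of step.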

So the problem is that the tools do not uniformly respect the integrity of
a double-byte hanzi character, but rather treat it as two individual Latin-1
characters.

The important point, though, is that all tools, whether native Chinese or
"hacked" English, accept the resulting invalid code sequences consistently,
robustly, and without complaint.
-Michael


