Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8

From	Kyotaro Horiguchi
Subject	Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8
Date	30 October 2020, 06:28:51
Msg-id	20201030.122851.538415294986124838.horikyota.ntt@gmail.com
In reply to	Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8 (Amit Langote <amitlangote09@gmail.com>)
List	pgsql-hackers

At Fri, 30 Oct 2020 12:08:51 +0900, Amit Langote <amitlangote09@gmail.com> wrote in 
> I noticed that the commit a8bd7e1c6e02 from ages ago removed
> conversions from and to utf-8's e28892, in favor of efbc8d, and that
> change has stuck.  (Note though that these maps looked pretty
> different back then.)
> 
> --- a/src/backend/utils/mb/Unicode/euc_jp_to_utf8.map
> +++ b/src/backend/utils/mb/Unicode/euc_jp_to_utf8.map
> -  {0xa1dd, 0xe28892},
> +  {0xa1dd, 0xefbc8d},
> 
> --- a/src/backend/utils/mb/Unicode/utf8_to_euc_jp.map
> +++ b/src/backend/utils/mb/Unicode/utf8_to_euc_jp.map
> -  {0xe28892, 0xa1dd},
> +  {0xefbc8d, 0xa1dd},
> 
> Can't tell what reason there was to do that, but there must have been
> some.  Maybe the Japanese character sets prefer full-width hyphen
> minus (unicode U+FF0D) over mathematical minus sign (U+2212)?

It's a decision made by Microsoft.  Several other characters have
similar issues. I remember many people complained, but in the end that
wasn't "fixed", and it led to the well-known mess of Japanese
character conversion involving Unicode in Java.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
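
For readers who want to check the byte values in the quoted diff: the map entries appear to store the UTF-8 encoding of each character as a single hexadecimal number. The following is a minimal Python sketch, not part of the original message, that prints both code points in that form. The helper name `utf8_as_map_value` is made up for illustration and is not a PostgreSQL function.

```python
import unicodedata


def utf8_as_map_value(ch: str) -> str:
    """Return the UTF-8 bytes of one character as a hex literal,
    written the same way as the values in the quoted .map diff."""
    return "0x" + ch.encode("utf-8").hex()


# The two characters discussed in the thread.
for code_point in (0x2212, 0xFF0D):
    ch = chr(code_point)
    print(f"U+{code_point:04X} {unicodedata.name(ch)} -> {utf8_as_map_value(ch)}")
```

Under that reading, 0xe28892 in the removed lines is MINUS SIGN (U+2212) and 0xefbc8d in the added lines is FULLWIDTH HYPHEN-MINUS (U+FF0D), which matches the code points named in the subject line.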
