MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8

From Ashutosh Sharma
Subject MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8
Date
Msg-id CAE9k0Pmcq3cveGQ0de8nwj4nroamoNYLUY0tYs+oNOZ9ZRrv3A@mail.gmail.com
Responses Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8  (Amit Langote <amitlangote09@gmail.com>)
Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8  (Tatsuo Ishii <ishii@sraoss.co.jp>)
List pgsql-hackers
Hi All,

Today while working on some other task related to database encoding, I
noticed that the MINUS SIGN (with byte sequence a1-dd) in EUC-JP is
mapped to FULLWIDTH HYPHEN-MINUS (with byte sequence ef-bc-8d) in
UTF-8. See below:

postgres=# select convert('\xa1dd', 'euc_jp', 'utf8');
 convert
----------
 \xefbc8d
(1 row)

Isn't this a bug? Shouldn't this have been converted to the MINUS SIGN
(with byte sequence e2-88-92) in UTF-8 instead of FULLWIDTH
HYPHEN-MINUS?
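
For readers who want to check the two UTF-8 byte sequences quoted above without a PostgreSQL instance, here is a minimal Python sketch. It only decodes the bytes and looks up the Unicode names; it does not reproduce PostgreSQL's conversion tables, and the helper name describe_utf8 is made up for this illustration.

```python
import unicodedata


def describe_utf8(raw: bytes) -> str:
    """Decode a UTF-8 byte sequence holding one character and describe it."""
    ch = raw.decode("utf-8")
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Byte sequence that convert() returned for EUC-JP a1-dd.
print(describe_utf8(bytes.fromhex("efbc8d")))
# Byte sequence the message expected instead.
print(describe_utf8(bytes.fromhex("e28892")))
```

The first line should report U+FF0D FULLWIDTH HYPHEN-MINUS and the second U+2212 MINUS SIGN, matching the code points named in the subject.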

When the MINUS SIGN (with byte sequence e2-88-92) in UTF-8 is
converted to EUC-JP, the convert function fails with an error saying:
"character with byte sequence 0xe2 0x88 0x92 in encoding UTF8 has no
equivalent in encoding EUC_JP". See below:

postgres=# select convert('\xe28892', 'utf-8', 'euc_jp');
ERROR:  character with byte sequence 0xe2 0x88 0x92 in encoding "UTF8"
has no equivalent in encoding "EUC_JP"

However, when the same MINUS SIGN in UTF-8 is converted to SJIS
encoding, the convert function returns the correct result. See below:

postgres=# select convert('\xe28892', 'utf-8', 'sjis');
 convert
---------
 \x817c
(1 row)

Please note that the byte sequence 81-7c represents the MINUS SIGN in
SJIS, which means the MINUS SIGN in UTF-8 was converted to the MINUS
SIGN in SJIS. That is what we expect, isn't it?
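
As a cross-check, EUC-JP a1-dd and SJIS 81-7c appear to be two byte-level encodings of the same JIS X 0208 position (0x215D), so the two conversions above disagree about a single source character. The sketch below uses the commonly described arithmetic for the JIS X 0208 two-byte range; the function names are hypothetical, and this is not how PostgreSQL implements its conversions.

```python
def euc_jp_to_jis(euc: bytes) -> tuple[int, int]:
    """Strip the high bit from each byte of a two-byte EUC-JP (JIS X 0208) code."""
    return euc[0] - 0x80, euc[1] - 0x80


def jis_to_sjis(j1: int, j2: int) -> bytes:
    """Convert a JIS X 0208 code (two bytes in 0x21..0x7E) to Shift_JIS bytes."""
    if j1 % 2 == 1:
        # Odd rows land in the lower half of the trail-byte range, skipping 0x7F.
        s2 = j2 + 0x1F
        if s2 >= 0x7F:
            s2 += 1
    else:
        # Even rows land in the upper half of the trail-byte range.
        s2 = j2 + 0x7E
    s1 = (j1 + 1) // 2 + 0x70
    if s1 >= 0xA0:
        # Skip the single-byte katakana range 0xA0..0xDF.
        s1 += 0x40
    return bytes([s1, s2])


j1, j2 = euc_jp_to_jis(bytes.fromhex("a1dd"))
print(f"JIS X 0208: {j1:02x}{j2:02x}")
print(f"SJIS: {jis_to_sjis(j1, j2).hex()}")
```

Under these assumptions the EUC-JP bytes a1-dd come out as JIS 215d and then as SJIS 817c, the same bytes the SJIS conversion returned for the UTF-8 MINUS SIGN.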

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


