Microsoft harmful extensions to 8859-X charsets (was: Continuing encoding fun....)

From Marc Herbert
Subject Microsoft harmful extensions to 8859-X charsets (was: Continuing encoding fun....)
Date
Msg-id 878xveyw4w.fsf@meije.emic.fr
In reply to Re: Continuing encoding fun....  ("Dave Page" <dpage@vale-housing.co.uk>)
List pgsql-odbc
"Dave Page" <dpage@vale-housing.co.uk> writes:

>> By the way 0x8A is not in the range of latin4
>> <http://czyborra.com/charsets/iso8859.html#ISO-8859-4>
>
> http://www.gar.no/home/mats/8859-4.htm says differently, however, I
> can't claim to know enough about encoding issues to refute
> either. I've been forced to learn what I can about the subject to help
> maintain this driver and certainly may have got the wrong end of the
> stick on one or more points!

The page from gar.no is just a dump of the *Microsoft-extended* latin4
charset.

The standards committee deliberately left a gap between 0x80 and 0x9F in
all LATIN-X charsets, because bytes in that range turn into (harmful)
control characters if their 8th bit is accidentally stripped.
You can see that very clearly in this table, for instance:
 <http://en.wikipedia.org/wiki/ISO_8859-4>
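
To make the 8th-bit point concrete, here is a minimal Python sketch. The
helper name strip_eighth_bit is made up for this illustration; it only
mimics what a 7-bit-only channel would do to a byte, and is not taken
from any standard or driver.

```python
def strip_eighth_bit(byte: int) -> int:
    """Clear bit 7 of a byte, as a 7-bit channel would (hypothetical helper)."""
    return byte & 0x7F


# The gap left empty by the ISO 8859 charsets: 0x80 through 0x9F.
gap = range(0x80, 0xA0)
stripped = [strip_eighth_bit(b) for b in gap]

# Every byte of the gap collapses onto a C0 control code (0x00..0x1F).
assert stripped == list(range(0x00, 0x20))

# 0x8A, the byte discussed in this thread, would turn into a line feed.
assert strip_eighth_bit(0x8A) == 0x0A
```

A printable character placed at 0x8A would therefore silently become a
newline on any path that drops the high bit.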

If you follow the links from gar.no itself, you can land here:
<http://en.wikipedia.org/wiki/ISO_8859> with tons of links (like the
ECMA standards for instance) showing this gap.

Microsoft, being Microsoft, jumped into that gap. Those non-standard
Microsoft characters now plague the web, as clearly explained here:

<http://home.earthlink.net/~bobbau/platforms/specialchars/#windows>
or here:
<http://www.cs.tut.fi/~jkorpela/www/windows-chars.html>
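
A quick way to see the difference for yourself, assuming Python's
built-in "iso8859_4" and "cp1252" codecs. cp1252 is used here only as a
familiar example of a Microsoft code page that fills the gap; I am not
claiming it is the exact table behind the gar.no page.

```python
import unicodedata

raw = bytes([0x8A])  # the byte discussed in this thread

# Standard latin4: 0x8A falls in the gap and decodes to a control code.
standard = raw.decode("iso8859_4")
assert standard == "\x8a"
assert unicodedata.category(standard) == "Cc"  # "Other, control"

# A Microsoft code page: the same byte is a printable letter.
microsoft = raw.decode("cp1252")
assert microsoft == "\u0160"
assert unicodedata.category(microsoft) == "Lu"  # uppercase letter
```

The same byte is a control code under the standard charset and a
printable letter under the Microsoft one, which is exactly the kind of
disagreement between the two pages quoted above.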


