Re: FW: Character set equivalent for AL32UTF8

From Mridul Mathew
Subject Re: FW: Character set equivalent for AL32UTF8
Msg-id CAFm5QJwk3N5+5g2whZBUee2jHmNssxO98ybBreDLLUDQHh7zuA@mail.gmail.com
In response to Character set equivalent for AL32UTF8  (RBharathi <rajeshwarbharathi@gmail.com>)
Responses Re: FW: Character set equivalent for AL32UTF8  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-admin
Hello Craig,

Thanks for the response. You are correct that the difference between AL32UTF8 and UTF8 lies in AL32UTF8's better support for supplementary characters.

If supplementary characters are inserted into an Oracle UTF8 database, each one is treated as 2 separate undefined characters and occupies 6 bytes in storage. Oracle recommends using AL32UTF8 for any newly defined supplementary characters.
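To make the byte counts concrete, here is a minimal Python sketch. It assumes Oracle's UTF8 uses the surrogate-pair scheme described in the article quoted further down this thread; the function names are illustrative only and are not part of Oracle or PostgreSQL.

```python
import struct

def standard_utf8(ch: str) -> bytes:
    """Encode one character as standard UTF-8 (4 bytes beyond U+FFFF)."""
    return ch.encode("utf-8")

def surrogate_pair_utf8(ch: str) -> bytes:
    """Encode one supplementary character the way the quoted article
    describes Oracle's UTF8: split it into two UTF-16 surrogates, then
    encode each surrogate as a 3-byte sequence (6 bytes in total).
    Illustrative helper, only meaningful for characters above U+FFFF."""
    if ord(ch) <= 0xFFFF:
        raise ValueError("not a supplementary character")
    high, low = struct.unpack(">HH", ch.encode("utf-16-be"))
    # 'surrogatepass' lets Python emit bytes for lone surrogates,
    # which the strict UTF-8 codec would otherwise refuse.
    return b"".join(
        chr(unit).encode("utf-8", "surrogatepass") for unit in (high, low)
    )

ch = "\U00010400"  # a character outside the Basic Multilingual Plane
print(len(standard_utf8(ch)), standard_utf8(ch).hex())
print(len(surrogate_pair_utf8(ch)), surrogate_pair_utf8(ch).hex())
```

The first line printed shows the 4-byte form and the second the 6-byte form, which is the storage difference being discussed.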

Does PostgreSQL make a similar distinction within Unicode? We have not tested our Oracle AL32UTF8 databases on PostgreSQL, but when creating databases in PostgreSQL we see UTF8 as an option, not AL32UTF8.

Thanks,
Mridul.

On Wed, Aug 10, 2011 at 1:26 PM, Mridul Mathew <mmathew@fiberlink.com> wrote:

From: Rajeshwar Bharathi [mailto:rajeshwarbharathi@gmail.com]
Sent: Wednesday, August 10, 2011 1:14 PM
To: Mridul Mathew
Subject: Fwd: [ADMIN] Character set equivalent for AL32UTF8

---------- Forwarded message ----------
From: Craig Ringer <ringerc@ringerc.id.au>
Date: Wed, Aug 10, 2011 at 11:49 AM
Subject: Re: [ADMIN] Character set equivalent for AL32UTF8
To: pgsql.admin@googlegroups.com
Cc: RBharathi <rajeshwarbharathi@gmail.com>, pgsql-admin@postgresql.org


On 2/08/2011 8:52 PM, RBharathi wrote:

Hi,
We plan to migrate data from Oracle 11g with characterset AL32UTF8 to a Postgres db.

What is the equivalent character set to use in Postgres? We see only the UTF-8 option.


What's AL32UTF8? That's not a standard charset name or a widely recognised charset. Is it some Oracle-specific feature? If so, what makes it different from UTF-8 and why do you need it?

Documentation link? References?

A 30-second Google search turned up this:

http://decipherinfosys.wordpress.com/2007/01/28/difference-between-utf8-and-al32utf8-character-sets-in-oracle/

"As far as these two character sets go in Oracle,  the only difference between AL32UTF8 and UTF8 character sets is that AL32UTF8 stores characters beyond U+FFFF as four bytes (exactly as Unicode defines UTF-8). Oracle’s “UTF8” stores these characters as a sequence of two UTF-16 surrogate characters encoded using UTF-8 (or six bytes per character).  Besides this storage difference, another difference is better support for supplementary characters in AL32UTF8 character set."


Is this what you're talking about? If so, what's the concern? Have you checked to see if PostgreSQL's behavior fits your needs?
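One way to run such a check on exported data before loading it is sketched below in Python. It assumes the target expects standard UTF-8, so any 6-byte surrogate-pair sequences of the kind described in the quote above would need converting first. The helper names are hypothetical, and this only inspects raw bytes; it does not talk to Oracle or PostgreSQL.

```python
from typing import List

def find_surrogate_sequences(data: bytes) -> List[int]:
    """Return byte offsets of 3-byte sequences that encode a UTF-16
    surrogate (0xED followed by 0xA0..0xBF and a continuation byte).
    Standard UTF-8 never contains these."""
    offsets = []
    for i in range(len(data) - 2):
        if (
            data[i] == 0xED
            and 0xA0 <= data[i + 1] <= 0xBF
            and 0x80 <= data[i + 2] <= 0xBF
        ):
            offsets.append(i)
    return offsets

def is_standard_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

four_byte = "\U00010400".encode("utf-8")       # standard 4-byte form
six_byte = bytes.fromhex("eda081edb080")       # same character as two surrogates
print(is_standard_utf8(four_byte), find_surrogate_sequences(four_byte))
print(is_standard_utf8(six_byte), find_surrogate_sequences(six_byte))
```

Data exported from an AL32UTF8 database should pass `is_standard_utf8` if it really is standard UTF-8; any offsets reported by `find_surrogate_sequences` would point at characters stored in the 6-byte form.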


--
Craig Ringer




--
Rajeshwar BM
Bangalore INDIA



