Discussion: Character set conversion

Character set conversion

From
Bastiaan Olij
Date:
Hi All,

I have a client application that uses an 8-bit character set that is not
supported by PostgreSQL. I'm using UTF-8 to store data within my
database and would like to create a character set conversion converting
between my native set and PostgreSQL. I have all the information I need
as far as which 8-bit value should be mapped to what UTF-8 'character'.

I read in the documentation about the 'Create conversion' command and
writing a function to do the conversion job. Is this the best way
forward or are there better ways to attempt this? Is there any sample
code available for implementing such a conversion? I don't want to
reinvent the wheel here...

--
Kindest Regards,

Bastiaan Olij
e-mail/MSN: bastiaan@basenlily.nl
web: http://www.basenlily.nl
Skype: Mux213
http://www.linkedin.com/in/bastiaanolij


Re: Character set conversion

From
Tom Lane
Date:
Bastiaan Olij <lists@basenlily.nl> writes:
> I read in the documentation about the 'Create conversion' command and
> writing a function to do the conversion job. Is this the best way
> forward or are there better ways to attempt this? Is there any sample
> code available for implementing such a conversion? I don't want to
> reinvent the wheel here...

Look into the PG source code under
src/backend/utils/mb/conversion_procs.

While an add-on conversion procedure isn't too hard, I don't think
there's any way to define a whole new encoding without modifying the
source code --- the encodings are listed in some hard-coded tables
in the C code rather than being defined by a system catalog.  It
wouldn't be too hard if you don't mind running a custom Postgres
build; but if you do, then the best answer might be to cannibalize
one of the existing encoding names and just replace its conversion
procedures.
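
For illustration only, here is a minimal standalone sketch of the
table-driven core such a conversion would need: map each 8-bit value to
a Unicode code point, then emit it as UTF-8. It is not a PostgreSQL
conversion procedure and uses no PostgreSQL headers; the real procedures
under conversion_procs have their own calling convention. The function
names, the table, and its two sample entries are hypothetical stand-ins
for the real character set.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mapping for bytes 0x80..0xFF of a made-up 8-bit set,
 * indexed by (byte - 0x80). Only two sample entries are filled in;
 * 0 means "no mapping". Bytes below 0x80 are assumed to be ASCII.
 */
static const uint32_t high_map[128] = {
    [0x80 - 0x80] = 0x20AC,   /* EURO SIGN */
    [0xE9 - 0x80] = 0x00E9    /* LATIN SMALL LETTER E WITH ACUTE */
};

/* Encode one code point up to U+FFFF as UTF-8; returns bytes written. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char) (0xE0 | (cp >> 12));
    out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (cp & 0x3F));
    return 3;
}

/*
 * Convert len bytes from the 8-bit set into UTF-8 in dest (capacity cap).
 * Returns the number of bytes written, or -1 if a byte has no mapping
 * or dest is too small.
 */
long local8_to_utf8(const unsigned char *src, size_t len,
                    unsigned char *dest, size_t cap)
{
    size_t used = 0;

    for (size_t i = 0; i < len; i++) {
        uint32_t cp = src[i];

        if (cp >= 0x80) {
            cp = high_map[cp - 0x80];
            if (cp == 0)
                return -1;          /* unmapped byte */
        }
        if (cap - used < 3)
            return -1;              /* keep room for the longest sequence */
        used += utf8_encode(cp, dest + used);
    }
    return (long) used;
}
```

The reverse direction would decode each UTF-8 sequence to a code point
and look it up in an inverted table, rejecting anything the 8-bit set
cannot represent.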

            regards, tom lane

Re: Character set conversion

From
"Daniel T. Staal"
Date:
On Mon, July 21, 2008 10:16 am, Tom Lane wrote:

> It wouldn't be too hard if you don't mind running a custom Postgres
> build; but if you do, then the best answer might be to cannibalize one of
> the existing encoding names and just replace its conversion procedures.

Actually, I'd recommend not doing that.  Instead, put in your encoding as
a new encoding, and send a diff of your changes to the Postgres team.  I
bet next release it would be a supported encoding, and you would be able
to upgrade without any issues...

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------