Re: Encoding Conversion

From Rick Gigger
Subject Re: Encoding Conversion
Date
Msg-id 44621F81.8060701@alpinenetworking.com
In reply to Re: Encoding Conversion  (jef peeraer <jef.peeraer@pandora.be>)
List pgsql-general
jef peeraer wrote:
> beer schreef:
>> Hello All
>>
>> So I have an old database that is SQL_ASCII encoded.  For a variety
>> of reasons I need to convert the database to UNICODE.  I did some
>> googling on this but have yet to find anything that looked like a
>> viable option, so i thought I'd post to the group and see what sort
>> of advice might arise. :)
> well i recently struggled with the same problem. After a lot of trial
> and error and reading, it seems that an ascii encoded database can't
> use its client encoding capabilities ( set client_encoding to utf8 ).
> i think the easiest solution is to do a dump, recreate the database
> with a proper encoding, and restore the dump.
>
> jef peeraer
>>
>> TIA
>>
>> -b
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>        subscribe-nomail command to majordomo@postgresql.org so that your
>>        message can get through to the mailing list cleanly
>>

In my experience SQL_ASCII will let you put anything in there, because
the server does not validate or convert bytes for that encoding.  So you
need to figure out the actual encoding of the data first.  Is it LATIN1?
Is it UTF-8?  UTF-16?  I found that my old SQL_ASCII dbs, before they
were converted to Unicode, contained 99.9% LATIN1 chars but also had a
few stray characters thrown in by people copying and pasting from
Office.  For instance, MS Word uses nonstandard non-ASCII characters for
its "smart quotes" (or whatever they call them), where the quotes curl
in towards each other.
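
To make the "figure out the actual encoding" step concrete, here is a
minimal Python sketch of one way to check a dump, not necessarily how
anyone in this thread did it.  It counts every byte above 0x7F and tests
whether the data decodes as UTF-8.  The function names and the sample
bytes are made up for illustration.

```python
from collections import Counter


def non_ascii_byte_counts(data: bytes) -> Counter:
    """Count each byte value >= 0x80 found in the raw dump data."""
    return Counter(b for b in data if b >= 0x80)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the whole byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: a LATIN1 e-acute (0xE9) plus the Windows-1252
# curly double quotes (0x93 and 0x94) that Word tends to paste in.
sample = b"caf\xe9 \x93quoted\x94\n"

for byte, count in sorted(non_ascii_byte_counts(sample).items()):
    print(f"0x{byte:02x}: {count}")
print("valid UTF-8:", is_valid_utf8(sample))
```

If the data is not valid UTF-8 and nearly all the high bytes fall in
0xA0-0xFF, LATIN1 is a reasonable guess.  Bytes in 0x80-0x9F usually
point to text pasted from Windows applications.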

I had to identify what the bad chars were.  I think viewing the dump in
vi showed me the hex codes for the non-ASCII chars.  Then I changed the
encoding specified at the top of the dump to LATIN1, and used sed to
strip out the bad chars as I piped the dump into a postgres Unicode db.
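
The cleanup above can be sketched in Python instead of sed.  This is
only an illustration under two assumptions: the dump starts with a line
of the form SET client_encoding = 'SQL_ASCII'; and the bad chars are all
bytes in the range 0x80-0x9F (where Windows-1252 puts its curly quotes
and where LATIN1 has only control codes).  The name clean_dump is made
up, and your own bad bytes may differ.

```python
import re

# Assumption: the dump declares its encoding with a line like this one.
ENCODING_LINE = re.compile(
    rb"^(SET client_encoding = )'SQL_ASCII';", re.MULTILINE
)

# Assumption: every unwanted byte lies in 0x80-0x9F.
STRAY_BYTES = bytes(range(0x80, 0xA0))


def clean_dump(data: bytes) -> bytes:
    """Relabel the dump as LATIN1 and delete the stray bytes."""
    # Rewrite only the first client_encoding declaration.
    data = ENCODING_LINE.sub(rb"\1'LATIN1';", data, count=1)
    # With a table of None, translate() just deletes the listed bytes.
    return data.translate(None, STRAY_BYTES)


# Hypothetical two-line dump used only to show the effect.
dump = (
    b"SET client_encoding = 'SQL_ASCII';\n"
    b"INSERT INTO t VALUES ('caf\xe9 \x93hi\x94');\n"
)
cleaned = clean_dump(dump)
print(cleaned.decode("latin-1"))
```

The cleaned bytes can then be fed to psql against a database that was
created with a Unicode encoding.  Note that this deletes the curly
quotes rather than turning them into plain ones, same as removing them
with sed.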

Rick

