retry: converting ASCII to UTF-8

Поиск
Список
Период
Сортировка
От Tom Hart
Тема retry: converting ASCII to UTF-8
Дата
Msg-id 4797A468.1040800@coopfed.org
обсуждение исходный текст
Ответы Re: retry: converting ASCII to UTF-8  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
Список pgsql-general
I didn't see this come through the first time, so I'm retrying. I
apologize if this comes through twice.
------------------

Hello everybody. I hope your week's going well so far.

I built our data mine in postgreSQL around 3 months ago and I've been
working with it since. Postgres is great and I'm really enjoying it, but
I've hit a bit of a hitch. Originally (and against pgAdmin's good
advice, duh!) I set up the database to use ASCII encoding. However we
have a large base of Spanish speaking members and services, and we need
utf-8 encoding to maintain and support the extended character sets. In
my naivety I thought it would be a relatively simple process to convert
the db but I've found this to not be the case. I tried doing a dump and
restore into a new database with the proper encoding, but pg_restore is
getting hung up on one of the tables, our largest by far (~1gb, not huge
I know). When I tried pg_restore from a command line (I was using
pgAdmin, I know i'm a nub) I received this error.

C:\Program Files\PostgreSQL\8.2\bin>pg_restore.exe -i -h 192.168.1.xxx
-p 5432 -U foobar -d warehouse_utf8 -a -t "transaction" -v
"O:\foo\bar\pg_dump_transaction.backup"
pg_restore: connecting to database for restore
Password:
pg_restore: restoring data for table "transaction"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1706; 0 17861 TABLE
DATA transaction foobar
pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence
for encoding "UTF8": 0xc52f
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".
CONTEXT:  COPY transaction, line 209487
WARNING: errors ignored on restore: 1

I remember reading somewhere recently that I could use iconv to convert
the ASCII encoded dump to UTF-8 encoding, but I'm currently on a windows
box, and a windows server, so is there an easier way to do this? Also I
was thinking perhaps it was possible to do an ETL type setup, where I
can SELECT from the ASCII db and INSERT into the UTF-8 db.

If you haven't gathered yet, I'm pretty in the dark regarding encoding
issues, especially when applied to pg, so any help here would be
appreciated.

--
Tom Hart
IT Specialist
Cooperative Federal
723 Westcott St.
Syracuse, NY 13210
(315) 471-1116 ext. 202
(315) 476-0567 (fax)


В списке pgsql-general по дате отправления:

Предыдущее
От: Bill Moran
Дата:
Сообщение: Re: Best practices for protect applications agains Sql injection.
Следующее
От: Erik Jones
Дата:
Сообщение: Re: pg_xlog and standby