Discussion: Character encoding problems and dump import
Hello,

I have a dump (non-binary, if it matters) of a DB that contains some characters that my DB doesn't want to take. I'm using PG 8.0.3, and it was created with Unicode support:

=> \encoding
UNICODE

Characters that cause problems during the import are things like:
é and other characters from the Extended ASCII table (cf. bottom of http://www.lookuptables.com/ )
Also:
'ÇѱÛÀÌ Á¦´ë·Î µÇ´ÂÁö ½ÇÇè... ¿©±â¿¡´Â ¾Æ¹« ¸µÅ©µµ ¾ø½À´Ï´Ù.±×³É ÀÌ ³»¿ë¹Û¿¡ ¾ø½À´Ï´Ù.'

The errors I get on import are of this type:
ERROR: invalid byte sequence for encoding "UNICODE": 0xdb20

The data may not be the cleanest, and I have limited control over that. But I am wondering if there is any way I can import this data, even if that means converting some of the characters into something else.

Thanks,
Otis
On Monday, 20 March 2006 at 21:21, ogjunk-pgjedan@yahoo.com wrote:
> Hello,
>
> I have a dump (non-binary, if it matters) of a DB that has some characters
> in it that my DB doesn't want to take. I'm using PG 8.0.3 and it was
> created with Unicode support:
>
> => \encoding
> UNICODE

This is the client encoding (see \?). To get the server encoding, you can run

show server_encoding;

(see the SHOW command in the command reference).

Then have a look at the dump to check what the encoding was when the dump was taken. (There is a line like "set client_encoding = ...." somewhere near the beginning of the dump.)

There were some changes to the Unicode handling some time ago. If the dump was taken by another server version, there might be differences. (Search the archives; there are several threads about the issue.)

Best regards
Ivo
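Ivo's suggestion of checking the dump for a client_encoding line could be sketched roughly as follows in Python. This is only an illustration: the function name `find_client_encoding`, the `max_lines` limit, and the sample dump contents are hypothetical, and the exact form of the line may differ between pg_dump versions.

```python
import re

# Matches lines such as:  SET client_encoding = 'LATIN1';
# (also accepts the "SET client_encoding TO ..." spelling)
CLIENT_ENCODING_RE = re.compile(
    r"^\s*SET\s+client_encoding\s*(?:=|TO)\s*'?([A-Za-z0-9_\-]+)'?\s*;",
    re.IGNORECASE,
)


def find_client_encoding(lines, max_lines=200):
    """Return the encoding named by the first SET client_encoding line, or None.

    Only the first max_lines lines are inspected, since the setting is
    expected near the top of a plain-text dump.
    """
    for i, line in enumerate(lines):
        if i >= max_lines:
            break
        match = CLIENT_ENCODING_RE.match(line)
        if match:
            return match.group(1)
    return None


# Hypothetical dump header, for illustration only.
sample_dump = [
    "--",
    "-- PostgreSQL database dump",
    "--",
    "SET client_encoding = 'LATIN1';",
    "CREATE TABLE t (id integer);",
]

print(find_client_encoding(sample_dump))  # LATIN1
```

When reading a real dump file, opening it with `encoding="latin-1"` lets every byte decode without error, which is enough for locating this line. A result of `None` would correspond to the situation Otis describes later in the thread, where the dump did not specify the encoding at all.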
On Mar 20, 2006, at 3:21 PM, <ogjunk-pgjedan@yahoo.com> wrote:
> The data may not be the cleanest, and I have limited control over that.
> But I am wondering if there is any way I can import this data, even
> if that means converting some of the characters into something else.

iconv might be able to help you fix encoding problems:

http://www.gnu.org/software/libiconv/documentation/libiconv/iconv.1.html

John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL
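For readers unfamiliar with iconv, the kind of conversion John is pointing at (something like `iconv -f LATIN1 -t UTF-8` applied to the dump) can be sketched in Python as below. This is a minimal illustration, not a recipe for this particular dump: the helper name `transcode_to_utf8` is made up, and LATIN1 as the source encoding is an assumption that has to be verified against the actual data.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Interpret raw bytes as source_encoding and re-encode them as UTF-8.

    The default source encoding is an assumption; a wrong guess yields
    valid UTF-8 that contains the wrong characters.
    """
    return raw.decode(source_encoding).encode("utf-8")


latin1_bytes = "café".encode("latin-1")   # b'caf\xe9': one byte for é
utf8_bytes = transcode_to_utf8(latin1_bytes)

print(utf8_bytes)                  # b'caf\xc3\xa9': two bytes for é
print(utf8_bytes.decode("utf-8"))  # café
```

Note that LATIN1 assigns a character to every byte value, so this conversion never fails; that also means it cannot tell you whether LATIN1 was the right guess. If the dump contains a client_encoding line, it would need to agree with the converted file.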
Thanks John and Ivo for the help. It turned out that I had to manually SET CLIENT_ENCODING TO 'LATIN1' before processing the dump (which didn't have this specified). This fixed the problem.

I thought a DB set to the UNICODE character encoding (server_encoding) would process the Extended ASCII characters, but it didn't... not sure why.

Otis

----- Original Message ----
From: John DeSoi <desoi@pgedit.com>
To: ogjunk-pgjedan@yahoo.com
Cc: pgsql-admin@postgresql.org
Sent: Tuesday, March 21, 2006 12:31:16 AM
Subject: Re: [ADMIN] Character encoding problems and dump import
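On the "not sure why": with the client encoding set to UNICODE, the server treats the incoming bytes as UTF-8 and rejects anything that is not well-formed, whereas with the client encoding set to LATIN1 each byte is taken as a LATIN1 character and converted to the server encoding. The Python sketch below illustrates this with the byte pair from the error message. It only mimics the validation step and is not PostgreSQL's own code; the helper name `is_valid_utf8` is made up for the example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The byte pair reported in the error message: 0xdb20
bad = bytes([0xDB, 0x20])

# 0xDB announces a two-byte UTF-8 sequence, but 0x20 (a space) is not a
# valid continuation byte, so the pair is rejected.
print(is_valid_utf8(bad))  # False

# Read as LATIN1, the same bytes are simply "Û" followed by a space,
# and that text can be re-encoded as valid UTF-8.
as_text = bad.decode("latin-1")
print(is_valid_utf8(as_text.encode("utf-8")))  # True
```

This is consistent with the fix reported above: declaring the dump's real encoding as the client encoding lets the conversion happen, instead of the raw bytes being checked as if they were already UTF-8.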