Re: [GENERAL] unicode error and problem
From:
Markus Bertheau <twanger@bluetwanger.de>
Date:
On Wed, 24.03.2004, at 11:33, Paolo Supino wrote:
> Hi
>
> I received a unicode CSV file from someone (the file was created on a
> windows system) and I'm trying to import it into postgresql. When it gets to
> a line that isn't ascii it prints the following error and aborts: "ERROR:
> copy: line 33, Invalid UNICODE character sequence found (0xd956)".

Try to convert the file from UTF-16 (which might be the encoding of the file) to UTF-8 with iconv:

iconv --from UTF-16 --to UTF-8 file > file.UTF-8

Maybe the file is not in UTF-16 but in some other encoding - convert accordingly then.

By the way, Unicode is just a number -> glyph mapping, it doesn't say anything about the representation of that number in the byte stream. UTF-8 and UTF-16 are such representation specifications.

The encoding name in PostgreSQL should be changed from UNICODE to UTF-8 because UNICODE really just isn't an encoding.

--
Markus Bertheau
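
For readers without iconv at hand, the same re-encoding can be sketched in Python. This is only an illustration under assumptions: the function name `convert_utf16_to_utf8` is made up for this example, and it assumes the input really is UTF-16 with a byte order mark (typical for files saved as "Unicode" on Windows). The first two lines also show the point about representations: one code point, two different byte sequences.

```python
from pathlib import Path

# One code point (U+00E9), two different byte representations.
assert "\u00e9".encode("utf-8") == b"\xc3\xa9"
assert "\u00e9".encode("utf-16-le") == b"\xe9\x00"


def convert_utf16_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a UTF-16 text file as UTF-8 (hypothetical helper)."""
    # The "utf-16" codec uses the byte order mark to pick the byte order
    # and drops it from the decoded text.
    # newline="" on both files keeps the original line endings untouched.
    with open(src, "r", encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        # Copy in chunks so large CSV files are not loaded into memory at once.
        for chunk in iter(lambda: fin.read(64 * 1024), ""):
            fout.write(chunk)
```

If the source file is in some other encoding, the `encoding="utf-16"` argument would need to change accordingly; decoding with the wrong codec either raises `UnicodeDecodeError` or silently produces wrong characters.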
Re: [GENERAL] unicode error and problem
From:
Tatsuo Ishii <t-ishii@sra.co.jp>
Date:
> By the way, Unicode is just a number -> glyph mapping, it doesn't say
> anything about the representation of that number in the byte stream.
> UTF-8 and UTF-16 are such representation specifications.
>
> The encoding name in PostgreSQL should be changed from UNICODE to UTF-8
> because UNICODE really just isn't an encoding.

Actually you can use "UTF-8" instead of "UNICODE" when using PostgreSQL. However the "primary" name is still UNICODE, and I agree it's better to change to UTF-8 for the primary name. Maybe for 7.5?

--
Tatsuo Ishii