Re: BUG #4890: Allow insert character has no equivalent in "LATIN2"
From | Craig Ringer |
---|---|
Subject | Re: BUG #4890: Allow insert character has no equivalent in "LATIN2" |
Date | |
Msg-id | 1247507930.17862.111.camel@ayaki |
In reply to | BUG #4890: Allow insert character has no equivalent in "LATIN2" ("saint" <saint@akpa.pl>) |
Responses |
Re: BUG #4890: Allow insert character has no equivalent in "LATIN2"
Re: BUG #4890: Allow insert character has no equivalent in "LATIN2" |
List | pgsql-bugs |
(Please reply to the list, not just to me)

I'm not sure about this so far. Regarding the specific issue you mention, conversion between cp1250 and latin-2 (ISO-8859-2): the Unicode mapping table at

http://unicode.org/Public/MAPPINGS/ISO8859/8859-2.TXT

appears to agree that there's no PER MILLE SIGN in ISO-8859-2.

With a UTF-8 database, Pg correctly refuses to treat PER MILLE as a valid ISO-8859-2 char:

    -- Connecting with unicode (utf-8) client
    CREATE TABLE test (x text);
    INSERT INTO test(x) VALUES ('‰');
    SET client_encoding='iso-8859-2';
    SELECT * from test;
    ERROR: character 0xe280b0 of encoding "UTF8" has no equivalent in "LATIN2"

If client_encoding is set to WIN1250 instead, Pg outputs the appropriate byte. So it's doing the right thing in each individual case where a UTF-8 DB is concerned.

Your problem, though, is that if you connect to a LATIN2 database with a WIN1250 client and INSERT a string containing the per-mille glyph, Pg accepts it when it should not. If it does, indeed, accept it, then I agree that's a bug.

I haven't tested with a LATIN2 database, as I'd have to re-initdb and the machine I'm working on has semi-useful databases on it. What you're saying makes sense, though, presuming your client really is sending the win1250 per-mille (byte 0x89).

I'd still like to know how you're setting your client encoding. You can't just run "SET client_encoding='win1250'" - you must tell the client program, or the terminal it runs in, to use the appropriate encoding as well. Otherwise, when you paste the per-mille character you'll see the right glyph, but the bytes the client sends will be interpreted as characters in the encoding you specified. So, if you're using a utf-8 terminal, the terminal will send 0xe2 0x80 0xb0 for per-mille, which when interpreted as win1250 becomes â€° , so that's what the server thinks you sent it.
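The byte-level claims above can be checked outside the database. The following is only a rough sketch: it assumes Python's `cp1250` and `iso8859_2` codecs agree with PostgreSQL's WIN1250 and LATIN2 conversion tables for these particular characters, and the helper names `encodable` and `misread` are made up for illustration.

```python
PER_MILLE = "\u2030"  # PER MILLE SIGN


def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def misread(text, sent_as, read_as):
    """Encode text the way the terminal does, then decode it the way
    the declared client_encoding says it should be read."""
    return text.encode(sent_as).decode(read_as)


# What a genuine win1250 client sends for per-mille: the single byte 0x89.
win1250_bytes = PER_MILLE.encode("cp1250")

# What a utf-8 terminal sends for the same glyph: 0xe2 0x80 0xb0.
utf8_bytes = PER_MILLE.encode("utf-8")

# Those three bytes read as win1250 become three separate characters.
seen_by_server = misread(PER_MILLE, "utf-8", "cp1250")

# Of those, only the characters missing from ISO-8859-2 would block
# a conversion to LATIN2.
unconvertible = [c for c in seen_by_server if not encodable(c, "iso8859_2")]

print(win1250_bytes.hex(), utf8_bytes.hex())
print([f"U+{ord(c):04X}" for c in seen_by_server])
print([f"U+{ord(c):04X}" for c in unconvertible])
```

Under those assumptions, the only unconvertible character in the misread string is the euro sign (U+20AC), whose UTF-8 form is 0xe2 0x82 0xac, while per-mille itself has no ISO-8859-2 byte at all.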
In that case (a utf-8 terminal with client_encoding set to win1250), though, you'd find that the euro symbol, which isn't defined in latin-2, will cause an error:

    ERROR: character 0xe282ac of encoding "UTF8" has no equivalent in "LATIN2"

--
Craig Ringer