Discussion: invalid byte sequence for encoding "UNICODE"
Hi there.

I keep running into a strange problem: invalid byte sequence for encoding "UNICODE". My guess is that PostgreSQL won't let me use some symbols that are not part of UNICODE. But which symbols are those?

I'm attaching a screenshot with THAT dead symbol. As you can see, it's an unknown symbol at the end of the Cyrillic text. First of all, I checked my data with iconv (iconv -f UTF-8 -t UTF-8 data.txt) and got no errors, so I assume there are no dead symbols in it.

So the question is: is it possible to find a *table* of forbidden characters for encoding "UNICODE"? If I can get one, I can strip those dead characters in my program ;-)

Thank you.
AlannY <m@alanny.ru> writes:
> Many times, I'm confronting with that strange problem: invalid byte
> sequence for encoding "UNICODE". So, I guess, Postgresql can't allow me
> to use some symbols which is not a part of UNICODE. But what is that
> symbols?

Doesn't it tell you? AFAICS every PG version that uses that error message phrasing gives you the exact byte sequence it's complaining about.

It would also be worth asking what PG version you are using anyway. If it's not a pretty recent update then updating might help. I think there were some bugs in the encoding verification stuff a while back.

regards, tom lane
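Once the server has reported the byte sequence, the next step is usually to find where it sits in the input. The following is only a minimal sketch of that search, not anything from PostgreSQL itself. The function name `find_reported_bytes` and the sample sequence `0xd02e` are hypothetical, and it assumes the reported sequence is written as hex digits with optional `0x` prefixes and spaces.

```python
def find_reported_bytes(data: bytes, reported: str) -> list[int]:
    """Return every offset in `data` where the reported byte sequence starts.

    `reported` is assumed to look like "0xd02e" or "0xd0 0x2e";
    that format is an assumption, so adjust the parsing to your server's output.
    """
    # Strip the "0x" prefixes and spaces so only hex digits remain.
    hex_digits = reported.lower().replace("0x", "").replace(" ", "")
    needle = bytes.fromhex(hex_digits)

    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        # Continue the search one byte further to catch later occurrences.
        start = data.find(needle, start + 1)
    return offsets
```

Reading the file in binary mode (`open("data.txt", "rb").read()`) and passing the result to this function gives byte offsets that can then be inspected in a hex viewer.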
On Jul 24, 8:06 pm, m...@alanny.ru (AlannY) wrote:
> Many times, I'm confronting with that strange problem: invalid byte
> sequence for encoding "UNICODE". So, I guess, Postgresql can't allow me
> to use some symbols which is not a part of UNICODE. But what is that
> symbols?
>
> So the question is: is it possible to find a *table* with forbidden
> characters for encoding "UNICODE"? If I can get it -> I can kill that
> dead-characters in my program ;-)

To tell the truth, there are no characters forbidden in UNICODE, because there is no character you could have that is not in UNICODE. UTF8 is a different matter: it encodes real UNICODE into sequences of 8-bit bytes, and that is where the errors occur.

What does the command

show lc_ctype;

show?

As Tom has said, more information about your system would be really handy...

With best regards,
-- Valentine Gogichashvili
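Since the problem lies in byte sequences rather than in a list of forbidden characters, one way to look for the culprit is to scan the raw bytes for anything that does not decode as UTF-8. Below is a rough sketch using Python's strict decoder. The name `find_invalid_utf8` is made up for illustration, and Python's validation rules are not guaranteed to match those of any particular PostgreSQL version.

```python
def find_invalid_utf8(data: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, offending_bytes) for each run that fails strict UTF-8 decoding."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises at the first malformed sequence.
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid
        except UnicodeDecodeError as exc:
            # exc.start / exc.end are relative to the slice, so shift by pos.
            start = pos + exc.start
            end = pos + exc.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning right after the bad bytes
    return problems
```

For example, the Cyrillic text "При" followed by a lone lead byte `0xd0` is reported at offset 6:

```python
find_invalid_utf8(b"\xd0\x9f\xd1\x80\xd0\xb8\xd0 .")
```

An empty result only means that Python's decoder accepts the data; the server may still apply stricter checks.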