Re: Significance of Database Encoding
From | PFC |
---|---|
Subject | Re: Significance of Database Encoding |
Date | |
Msg-id | op.sqt1blpqth1vuj@localhost |
In reply to | Re: Significance of Database Encoding (Rajesh Mallah <mallah_rajesh@yahoo.com>) |
List | pgsql-sql |
> $ iconv -f US-ASCII -t UTF-8 < test.sql > out.sql
> iconv: illegal input sequence at position 114500
>
> Any ideas how the job can be accomplised reliably.
>
> Also my database may contain data in multiple encodings
> like WINDOWS-1251 and WINDOWS-1256 in various places
> as data has been inserted by different peoples using
> different sources and client software.

You could use a simple program like this (in Python):

    output = open( "unidump", "w" )
    for line in open( "your dump" ):
        # try each candidate encoding in turn; the first one that decodes wins
        for encoding in "utf-8", "iso-8859-15", "whatever":
            try:
                output.write( unicode( line, encoding ).encode( "utf-8" ))
                break
            except UnicodeError:
                pass
        else:
            # reached only if no encoding in the list could decode the line
            print "No suitable encoding for line..."

I'd say this might work, as long as UTF-8 cannot absorb an apostrophe inside a multibyte character. Can it?

Or you could do that to all your tables using SELECTs, but it's going to be painful...
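For anyone trying this on a current interpreter: the loop above is Python 2 (`unicode()`, `print` statement). Below is a rough Python 3 sketch of the same idea, reading the dump as bytes so that each line can be decoded by hand. The names `reencode_line` and `convert_dump` and the default encoding list are illustrative assumptions, not anything from the original post. Note that a single-byte codec such as iso-8859-15 accepts every byte value in Python, so it should come last, and a line that decodes without error is not guaranteed to have been decoded with the right encoding.

```python
# Candidate encodings, tried in order. This list is an assumption: adjust it
# to whatever the data may contain (e.g. add "cp1251" before "iso-8859-15").
DEFAULT_ENCODINGS = ("utf-8", "iso-8859-15")


def reencode_line(raw, encodings=DEFAULT_ENCODINGS):
    """Return (utf8_bytes, encoding_used), or (None, None) if nothing fits."""
    for encoding in encodings:
        try:
            return raw.decode(encoding).encode("utf-8"), encoding
        except UnicodeDecodeError:
            continue
    return None, None


def convert_dump(src_path, dst_path, encodings=DEFAULT_ENCODINGS):
    """Re-encode a dump line by line; return the number of lines that failed."""
    failures = 0
    # Binary mode on both sides: decoding and encoding are done explicitly.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for lineno, raw in enumerate(src, start=1):
            converted, _ = reencode_line(raw, encodings)
            if converted is None:
                # Like the original loop, report the line and do not write it.
                print(f"line {lineno}: no suitable encoding")
                failures += 1
            else:
                dst.write(converted)
    return failures
```

Usage would be something like `convert_dump("your dump", "unidump")`. On the apostrophe question: every byte of a multibyte UTF-8 sequence has its high bit set (0x80 or above), so an ASCII apostrophe (0x27) or newline (0x0A) cannot appear inside one, which is why splitting on lines and testing UTF-8 first is safe.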