Re: UTF-8 data migration problem in Postgresql 7.2
From | Jean-Michel POURE |
---|---|
Subject | Re: UTF-8 data migration problem in Postgresql 7.2 |
Date | |
Msg-id | 200202210913.g1L9DNFP032755@www1.translationforge |
In reply to | Re: UTF-8 data migration problem in Postgresql 7.2 (Tatsuo Ishii <t-ishii@sra.co.jp>) |
Responses |
Re: UTF-8 data migration problem in Postgresql 7.2
Re: UTF-8 data migration problem in Postgresql 7.2 |
List | pgsql-hackers |
Dear Tatsuo,

Thank you for your previous answer.

> o Were server/client encodings UTF-8 for PostgreSQL?

Yes.

> o What are versions of these softwares? Especially of PHP? Is it a
> PHP4? if so, what version? What is the "Php with UTF-8 extensions"?
> I've never heard of it.

It is PHP 4.0.6 with:

--enable-mbstring: Enable mbstring functions. This option is required to use mbstring functions.

--enable-mbstr-enc-trans: Enable HTTP input character encoding conversion using mbstring conversion engine. If this feature is enabled, HTTP input character encoding may be converted to mbstring.internal_encoding automatically.

Now, some more information:

1) Dutch text was entered using IE5.5. It is not faulty.

2) Japanese text was entered using the latest release of OpenOffice (sorry, I said IE5 but I was wrong), saved as UTF-8 and imported into PostgreSQL. Only the Japanese data has problems.

3) When opening a faulty Japanese record using Apache/IE5, the record is displayed correctly. Each faulty character is replaced by a Japanese 30A7 glyph (it looks like a French cross with two horizontal lines). What is this glyph? Does it mean 'I don't know' in Japanese? The record is saved correctly using this 30A1 glyph (it then looks fixed, as I can dump it and import it into 7.2, but this is not a solution).

4) In the original PostgreSQL 7.1.3 dump, there is only one faulty UTF-8 character, repeated 700 times. If you open my file in Yudit, it is displayed as =E3=82'. Why is it always the same character everywhere? Maybe you could have a look at my source file again. It sounds like a bug (in OpenOffice or PostgreSQL).

5) Surrogate pairs: I heard that PostgreSQL does not support surrogate pairs. Is this a surrogate pair problem?

Just my 0.02 cents, I know very little about UTF-8.

Any help appreciated.

Thanks,
Jean-Michel POURE
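
The byte sequence from point 4 can be inspected with a short script. What follows is only a minimal sketch in Python 3, which is not a tool used in this thread, and it assumes the faulty character really consists of just the two bytes E3 82 that Yudit displays. The helper name `is_complete_utf8` is made up for illustration. It checks whether those two bytes form a complete UTF-8 character on their own, and prints the UTF-8 encoding of the two code points mentioned in point 3 (U+30A1 and U+30A7) for comparison.

```python
import unicodedata

# Assumption: the faulty character is exactly these two bytes (shown as =E3=82).
truncated = bytes([0xE3, 0x82])


def is_complete_utf8(data: bytes) -> bool:
    """Return True if data decodes as valid, complete UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex, e.g. 'E3 82 A1'."""
    return " ".join(f"{b:02X}" for b in data)


print("E3 82 alone is complete UTF-8:", is_complete_utf8(truncated))

# The two code points mentioned in point 3 of the message.
for code_point in (0x30A1, 0x30A7):
    char = chr(code_point)
    encoded = char.encode("utf-8")
    print(f"U+{code_point:04X} {unicodedata.name(char)}: {hex_bytes(encoded)}")
    # Code points below U+10000 are in the Basic Multilingual Plane, so even
    # UTF-16 would not need a surrogate pair to represent them.
    print("  needs a UTF-16 surrogate pair:", code_point > 0xFFFF)
```

If the assumption holds, E3 82 is the start of a three-byte sequence with the final byte missing, and both code points from point 3 begin with those same two bytes. Surrogate pairs are a UTF-16 mechanism for code points above U+FFFF, so for these Katakana code points they would not be expected to play a role.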