Re: International support
From | Tatsuo Ishii |
---|---|
Subject | Re: International support |
Date | |
Msg-id | 20010223100224U.t-ishii@sra.co.jp |
In reply to | International support (Soma Interesting <dfunct@telus.net>) |
Replies | Re: International support |
List | pgsql-general |
> I'm currently working a project that is intended to handle Japanese
> character sets - and now I'm told ideally iMode too. :) The iMode isn't
> such an issue at the moment - but the article below has spooked me a
> little. At an early point in the project we tested if putting some input
> into a web form, which ultimately was handled by php then stored in
> postgres would return fully intact - and it did. This left me comfortable
> that PHP and Postgres don't seem to care what language they're storing in
> fields or variables. I'm 'guessing' that this is because the data, whether
> it's English or Japanese is being stored in binary (or something
> else?).

No. You are just lucky, I guess. If the data submitted by PHP is encoded in EUC, it's OK, since EUC does not conflict with ASCII. However, if it is encoded in SJIS, you are heading for big problems. The second byte of an SJIS character *sometimes* conflicts with ASCII metacharacters such as "\", and this will confuse PostgreSQL's parser. Of course the i18n version of PHP will help (it does the SJIS <--> EUC conversion), but beware that some characters in SJIS (such as user-defined characters, especially those used in i-mode) are not well supported by it.

> Of course I wouldn't be able to sort the data or do anything else that would
> require PHP/Postgres to be able to interpret the data.

That depends on how you define "sort". If you just do a normal sort, as you already do with ASCII, you could get more or less reasonable results, I guess. But if your client requires higher-level sorts such as sorting by YOMIGANA (Japanese pronunciation), you need to do something extra... probably you need to define an extra field in your table.

> However if I compile
> Postgres with locale support for the character set/language in question -
> then postgres will be able to sort Japanese. Is this right?

No. Locale support is useless for Japanese; it just slows down PostgreSQL. Turn it off.

> Have I got this all right so far? I have attempted to do my research on
> this - but finding a real beginners guide to international web development
> has been a trick. And the best sources I have found on this topic generally
> are specific to Oracle. Any links would be appreciated.

Try:

ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf

> For the postgres folks, these developers went with MySQL - I've chosen
> Postgres. Is there anything MySQL does that Postgres doesn't in terms of
> language support that I should be aware of?

I believe PostgreSQL's language support is much better than MySQL's, especially for Japanese. PostgreSQL can handle both EUC and SJIS on the fly (and even Unicode in 7.1!), and can do automatic encoding conversion between them. Moreover, PostgreSQL has many "multibyte aware" functions, including regular expression search, which MySQL cannot do, I think.

> >PHP's Japanese challenge
> >Since r-newbold.com is in Japanese only, Studio Omame made sure to utilize
> >PHP's Japanese character set conversion functions. However, this proved to
> >be a challenge.
>
> Is this available for v4 of PHP yet?

No.
--
Tatsuo Ishii
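
To illustrate the SJIS point in the message above, here is a minimal sketch in Python (not part of the original post; the sample characters and the helper name `trailing_byte_is_backslash` are chosen only for illustration). It uses Python's standard `shift_jis` and `euc_jp` codecs to show that some characters have 0x5C, the ASCII backslash, as their second byte in SJIS, while the same characters in EUC use only bytes with the high bit set.

```python
# Sample characters picked for illustration; both are common in Japanese text.
SAMPLES = ["表", "ソ"]

BACKSLASH = 0x5C  # ASCII code of "\"


def trailing_byte_is_backslash(ch: str, encoding: str) -> bool:
    """Return True if the last byte of `ch` in `encoding` is 0x5C."""
    return ch.encode(encoding)[-1] == BACKSLASH


for ch in SAMPLES:
    sjis = ch.encode("shift_jis")
    euc = ch.encode("euc_jp")
    # Print the raw bytes in hex so the 5c in the SJIS form is visible.
    print(ch, "SJIS:", sjis.hex(), "EUC:", euc.hex())
    print("  SJIS ends in backslash:", trailing_byte_is_backslash(ch, "shift_jis"))
    print("  EUC bytes all >= 0x80: ", all(b >= 0x80 for b in euc))
```

A parser that scans such SJIS data byte by byte, without knowing the encoding, would presumably treat that second byte as an escape character, which seems to be the kind of conflict described above.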