Discussion: What encoding to use for English, French, Spanish
My project is currently SQL_ASCII encoded. I will need to accommodate both French and Spanish in addition to English. I don't anticipate needing Far East languages. Reading here on the forums, I came up with Latin9 as perhaps adequate, but others recommend Unicode even for relatively simple needs like my own.

I'd appreciate any advice on this topic. Is Unicode the most versatile choice? What is the downside of Unicode? If Far East languages do become a requirement, is Unicode the way to go?

--
View this message in context: http://www.nabble.com/What-encoding-to-use-for-English%2C-French%2C-Spanish-tf4622283.html#a13200459
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
novnov wrote:
> My project is currently SQL_ASCII encoded. I will need to accommodate
> both French and Spanish in addition to English. I don't anticipate
> needing Far East languages. Reading here on the forums I come up with
> Latin9 as perhaps adequate. But others recommend unicode for
> relatively simple needs like my own.

LATIN9 or UTF-8 are the appropriate choices for your project. The choice between these is mostly a matter of taste, unless there are additional requirements in the project. Nowadays, many operating systems configure themselves to use Unicode by default, and so there is probably no reason to use a more restricted character set.

Note that some versions of PostgreSQL have various degrees of trouble with UTF-8 support. Be sure to use the latest version.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut wrote:
> novnov wrote:
> > My project is currently SQL_ASCII encoded. I will need to accommodate
> > both French and Spanish in addition to English. I don't anticipate
> > needing Far East languages. Reading here on the forums I come up with
> > Latin9 as perhaps adequate. But others recommend unicode for
> > relatively simple needs like my own.
>
> LATIN9 or UTF-8 are the appropriate choices for your project. The
> choice between these is mostly a matter of taste, unless there are
> additional requirements in the project.

I used to think that there was no practical difference between using LATIN9 and UTF8, but experience (not my own, but that of people on the pgsql-es-ayuda list) has taught me otherwise.

When people start mixing environments, it is quite common that they get the client_encoding wrong in some cases. In those cases, having an encoding that can tell a valid string from an invalid one is really helpful, so using UTF8 as the server encoding is the way to go.

Latin9 is _capable_ of storing your data, yes, but if you fail to set client_encoding then it is also capable of storing something you don't really want to store. I'd stay away from it.

--
Alvaro Herrera
http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Romantics are beings who die of longing for life"
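To illustrate the point about telling valid strings from invalid ones, here is a minimal Python sketch. It is an analogy only: Python's own codecs stand in for the server-side check, and the helper name `accepts` and the sample text are made up for this example. It does not talk to PostgreSQL at all. The idea is that every byte value is a legal character in a single-byte encoding such as Latin9, so mislabeled input is accepted silently, while UTF-8 has structural rules that mislabeled Latin9 bytes usually violate.

```python
# Illustrative analogy only: Python codecs stand in for an encoding check.
# "accepts" is a hypothetical helper, not a PostgreSQL or psycopg API.

def accepts(raw: bytes, encoding: str) -> bool:
    """Return True if raw is a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


text = "café señor"
as_utf8 = text.encode("utf-8")
as_latin9 = text.encode("iso8859-15")  # ISO 8859-15 is Latin9

# Latin9 bytes wrongly labeled as UTF-8: the mistake is detectable.
print("Latin9 bytes read as UTF-8 accepted?", accepts(as_latin9, "utf-8"))

# UTF-8 bytes wrongly labeled as Latin9: accepted, but the text is garbled.
print("UTF-8 bytes read as Latin9 accepted?", accepts(as_utf8, "iso8859-15"))
print("What would be stored:", as_utf8.decode("iso8859-15"))
```

In the second case nothing complains, and the accented characters end up stored as two unrelated characters each, which is the "something you don't really want to store" scenario described above.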