Discussion: Default PostgreSQL server encoding - Change to unicode (utf8)
Hello,
Thank you for reading my post.
When I run the command:
I get the following messages:
I would like the cluster (and the databases) encoding to be unicode (UTF8).
What can I do?
Can I set the default encoding I want for the whole PostgreSQL server
somewhere?
Thank you for helping and best regards.
--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Default-PostgreSQL-server-encoding-Change-to-unicode-utf8-tp5505985p5505985.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On 02/22/2012 11:20 AM, Léa Massiot wrote:
> Hello,
>
> Thank you for reading my post.
>
> When I run the command:
>
> I get the following messages:
The messages?
>
> I would like the cluster (and the databases) encoding to be unicode (UTF8).
>
> What can I do?
> Can I set the default encoding I want for the whole PostgreSQL server
> somewhere?
A good place to start for your options is:
http://www.postgresql.org/docs/9.0/interactive/locale.html
http://www.postgresql.org/docs/9.0/interactive/multibyte.html
>
> Thank you for helping and best regards.
>
--
Adrian Klaver
adrian.klaver@gmail.com
Hello.
Thank you for your answer.
I used the <raw> and </raw> tags, which is probably the reason
why you couldn't see the messages...
Thank you for the two links.
I read this (in the second one): "On Windows, however, UTF-8 encoding can be
used with any locale." yet I still have some questions...
On Unix (Debian GNU Linux Squeeze):
=========================================================================================
psql_cmd> \l
   Name    |  Owner   | Encoding |  Collation  |    Ctype
-----------+----------+----------+-------------+-------------
 template1 | postgres | UTF8     | en_us.UTF-8 | en_us.UTF-8
=========================================================================================
On Windows (XP):
=========================================================================================
psql_cmd> \l
   Name    |  Owner   | Encoding |         Collation          |           Ctype
-----------+----------+----------+----------------------------+----------------------------
 template1 | postgres | UTF8     | English_United States.1252 | English_United States.1252
=========================================================================================
Question 1
Focusing on the "Collation" and "Ctype" columns,
does "English_United States.1252" have something to do with "Windows-1252"
("CP-1252")?
"CP-1252" is an 8-bit character encoding (so it can map codes to 2^8
characters at most).
How compatible is this with a "UTF8" "Encoding"?
For people testing PostgreSQL under Windows, is there any other more
appropriate "Collation" that could be used to set a database collation?
There is no "locale -a" command available under Windows. Is there any
workaround?
Question 2
Suppose I have a PostgreSQL table which has a VARCHAR column "text".
Suppose I want to insert the string "Li 李" which contains the Chinese
ideograph 李.
How can I do this with an "INSERT INTO" command?
I wish I could do something like:
INSERT INTO t (text) VALUES ('Li U+674E')
or
INSERT INTO t (text) VALUES ('Li \u674E')
How can I do this?
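One possible approach, offered as a hedged sketch rather than an answer confirmed in this thread: PostgreSQL has a Unicode escape string syntax, U&'...', in which \XXXX is a four-digit hexadecimal code point and \+XXXXXX a six-digit one. According to the PostgreSQL documentation this syntax needs standard_conforming_strings to be on, and a UTF8 server encoding is the safe assumption for non-ASCII code points. The small Python helper below builds such a literal for the table t and column text from the question; the name pg_unicode_literal is hypothetical and not part of any library, and Python is used here only for illustration.

```python
def pg_unicode_literal(text):
    """Build a PostgreSQL U&'...' string literal, escaping non-ASCII characters."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if ch == "'":
            parts.append("''")            # a quote is doubled inside a string literal
        elif ch == "\\":
            parts.append("\\\\")          # the escape character itself is doubled
        elif cp < 0x80:
            parts.append(ch)              # plain ASCII is kept as-is
        elif cp <= 0xFFFF:
            parts.append("\\%04X" % cp)   # 4-digit form: \XXXX
        else:
            parts.append("\\+%06X" % cp)  # 6-digit form: \+XXXXXX
    return "U&'" + "".join(parts) + "'"


# Table "t" and column "text" are taken from the question above.
statement = "INSERT INTO t (text) VALUES (%s);" % pg_unicode_literal("Li 李")
print(statement)  # INSERT INTO t (text) VALUES (U&'Li \674E');
```

In a real client program, passing the string as a query parameter over a UTF8 client encoding would usually be preferable to building SQL text by hand; the helper only shows what the escaped literal looks like.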
Thanks and best regards.
--
Léa
--
On Monday, February 27, 2012 3:55:43 am Léa Massiot wrote:
> Hello.
> Thank you for your answer.
> Thank you for the two links.
> I read this (in the second one): "On Windows, however, UTF-8 encoding can
> be used with any locale." yet I still have some questions...
>
> Question 1
> Focusing on the "Collation" and "Ctype" columns,
> does "English_United States.1252" have something to do with "Windows-1252"
> ("CP-1252")?
> "CP-1252" is an 8-bit character encoding (so it can map codes to 2^8
> characters at most).
> How compatible is this with a "UTF8" "Encoding"?
> For people testing PostgreSQL under Windows, is there any other more
> appropriate "Collation" that could be used to set a database collation?
This is answered in the first link I sent:
http://www.postgresql.org/docs/9.0/interactive/locale.html
"Windows uses more verbose locale names, such as German_Germany or Swedish_Sweden.1252,
but the principles are the same."
"
LC_COLLATE String sort order
LC_CTYPE Character classification (What is a letter? Its upper-case equivalent?)
"
So what is appropriate depends on which sorting and character classification rules you
want to follow. By the way, both of these are fixed at database creation and cannot be
changed afterwards.
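To illustrate the difference between the "Encoding" column and the suffix of the Windows locale name: the ".1252" in "English_United States.1252" names a Windows code page, while "UTF8" is the encoding the database uses to store characters. The following is only a minimal Python sketch (Python is not part of this thread and is used purely for illustration) comparing how the same characters are represented under each encoding. It says nothing about how PostgreSQL itself applies the locale on Windows.

```python
# Compare how the same characters are stored under CP-1252 and UTF-8.
e_acute = "é"   # U+00E9, present in CP-1252
li = "李"       # U+674E, not present in CP-1252

cp1252_e = e_acute.encode("cp1252")  # one byte in the 8-bit code page
utf8_e = e_acute.encode("utf-8")     # two bytes in UTF-8
utf8_li = li.encode("utf-8")         # three bytes in UTF-8


def fits_cp1252(ch):
    """Return True if the character can be encoded in CP-1252."""
    try:
        ch.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


print(len(cp1252_e), len(utf8_e), len(utf8_li))    # prints: 1 2 3
print(fits_cp1252(e_acute), fits_cp1252(li))       # prints: True False
```

A UTF8 database can therefore hold characters that an 8-bit code page cannot, which is a separate matter from the sort order and character classification rules chosen through the locale.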
> There is no "locale -a" command available under Windows. Is there any
> workaround?
A little Googling found this. I am not a regular Windows user, so there may be
better options out there:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/systeminfo.mspx?mfr=true
>
> Thanks and best regards.
> --
> Léa
>
--
Adrian Klaver
adrian.klaver@gmail.com