Thread: Charset Win1250 on Windows and Ubuntu

Charset Win1250 on Windows and Ubuntu

From
Durumdara
Date:
Hi!

I have software that uses PostgreSQL. This program (and website) was developed and runs on Windows (XP/2003) with the native charset (win1250).

Last week we got a special request to install this software on a Linux server.

Yesterday I installed Ubuntu 9.10 in VirtualBox and tried to move the database to Linux.

The first big problem is that when I tried to create a database with the same parameters as on Windows, pgAdmin showed an error.
The error message is:
"Error: new encoding (Win1250) is incompatible with the encoding of the template database (UTF8)."

OK, I changed the template to "template0".

Then I got an error saying that Win1250 is not compatible with the collation hu_HU.UTF8.

When I inserted Hungarian characters (to check the sort order), the C and POSIX collations returned the wrong order, as I had expected.

The Windows version of PG and pgAdmin does not support collation, so these two options (collation, character type) are disabled.

But on Linux only the UTF8 version can sort rows in the correct order.

The problem is that the client program is win1250-based, and I would have to rewrite everything to get the same results.

Does anybody have a way, some tricky solution, for this problem?

Thanks for your help:
    dd

Re: Charset Win1250 on Windows and Ubuntu

From
Adrian Klaver
Date:
On Friday 18 December 2009 4:30:46 am Durumdara wrote:
> Hi!
>
> I have a software that uses Postgresql. This program (and website)
> developed and working on Window (XP/2003), with native charset (win1250).
>
> Prior week we got a special request to install this software to a Linux
> server.
>
> Yesterday I installed Ubu9.10 on VirtualBox, and tried to moving the
> database under Linux.
>
> First big problem is that when I tried to create a database with same
> parameters as in Windows, the PGAdmin show an error.
> The errormessage is:
> "Error: new encoding (Win1250) is incompatible with the encoding of the
> template database (UTF8)."
>
> Ok, I changed to "template0".
>
> Then I got error that Win1250 is not good for collation hu_HU.UTF8.
>
> When I tried to insert hungarian chars (to check sort order), the C and
> POSIX return wrong result - as I thought before.
>
> The Windows version of PG and Admin is not supports collation, so these two
> options are disable (collation, character type).

There is a Linux version of PGAdmin available for Ubuntu 9.10.

>
> But in Linux I have only UTF version that can sort rows in good order.
>
> The problem that the client program is win1250 based, and I must rewrite
> all things to make same results.
>
> Have anybody some way, some tricky solution for this problem?

Use psql and CREATE DATABASE:
http://www.postgresql.org/docs/8.4/interactive/sql-createdatabase.html
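
A minimal sketch of what such a CREATE DATABASE statement could look like for this case, written as a small Python helper so it can be pasted into psql or sent through a driver. The database name "appdb" and the helper name are hypothetical; the locale name hu_HU.UTF8 is taken from the original post and may be spelled differently on a given system. LC_COLLATE and LC_CTYPE need a server of version 8.4 or later.

```python
import re


def create_database_sql(name, locale="hu_HU.UTF8"):
    """Build a CREATE DATABASE statement for a UTF8 database with the given locale."""
    # Reject anything that is not a plain identifier or locale name, because
    # the values are interpolated directly into the SQL text.
    if not name.isidentifier():
        raise ValueError("unexpected database name: %r" % name)
    if not re.fullmatch(r"[A-Za-z0-9_.@-]+", locale):
        raise ValueError("unexpected locale name: %r" % locale)
    # template0 is used because template1 may already carry another encoding.
    return (
        "CREATE DATABASE %s TEMPLATE template0 "
        "ENCODING 'UTF8' LC_COLLATE '%s' LC_CTYPE '%s'" % (name, locale, locale)
    )


# "appdb" is a hypothetical database name.
print(create_database_sql("appdb"))
```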

>
> Thanks for your help:
>     dd



--
Adrian Klaver
aklaver@comcast.net

Re: Charset Win1250 on Windows and Ubuntu

From
Dave Page
Date:
On Sat, Dec 19, 2009 at 8:54 PM, Adrian Klaver <aklaver@comcast.net> wrote:
>> The Windows version of PG and Admin is not supports collation, so these two
>> options are disable (collation, character type).
>
> There is a Linux version of PGAdmin available for Ubuntu 9.10.

Doesn't matter - pgAdmin supports collation and ctype on all platforms
when creating databases. If the options are disabled, it's because the
OP is running a server older than 8.4.


--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

Re: Charset Win1250 on Windows and Ubuntu

From
Adrian Klaver
Date:
On Saturday 19 December 2009 1:04:30 pm Dave Page wrote:
> On Sat, Dec 19, 2009 at 8:54 PM, Adrian Klaver <aklaver@comcast.net> wrote:
> >> The Windows version of PG and Admin is not supports collation, so these
> >> two options are disable (collation, character type).
> >
> > There is a Linux version of PGAdmin available for Ubuntu 9.10.
>
> Doesn't matter - pgAdmin supports collation and ctype on all platforms
> when creating databases. If the options are disabled, it's because the
> OP is running a server older than 8.4.

That is what I get for assuming. I figured since the OP was using Ubuntu 9.10
they were using the default version of Postgres, 8.4.

--
Adrian Klaver
aklaver@comcast.net

Re: Charset Win1250 on Windows and Ubuntu

From
"Albe Laurenz"
Date:
Durumdara wrote:
> I have a software that uses Postgresql. This program (and website) developed and working on Window (XP/2003),
> with native charset (win1250).
>
> Prior week we got a special request to install this software to a Linux server.
>
> Yesterday I installed Ubu9.10 on VirtualBox, and tried to moving the database under Linux.
>
> First big problem is that when I tried to create a database with same parameters as in Windows, the PGAdmin
> show an error.
> The errormessage is:
> "Error: new encoding (Win1250) is incompatible with the encoding of the template database (UTF8)."
>
> Ok, I changed to "template0".
>
> Then I got error that Win1250 is not good for collation hu_HU.UTF8.
>
> When I tried to insert hungarian chars (to check sort order), the C and POSIX return wrong result - as I thought
> before.
>
> The Windows version of PG and Admin is not supports collation, so these two options are disable (collation,
> character type).
>
> But in Linux I have only UTF version that can sort rows in good order.
>
> The problem that the client program is win1250 based, and I must rewrite all things to make same results.
>
> Have anybody some way, some tricky solution for this problem?

If the collation hu_HU.UTF8 is what you want (it can sort rows in the correct order), you
should use UTF8 as the database encoding.

If you need the data in WIN1250 on the client side, change the client encoding to WIN1250.

So:
- Create the database with UTF8.
- Change the client encoding to WIN1250 (e.g. by setting the environment variable PGCLIENTENCODING).
- Import the dump of the Windows database. It will be converted to UTF-8.
- Make sure that the client program has client encoding WIN1250.
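
The conversion in these steps can be illustrated without a database. This is only a sketch: it assumes that Python's "cp1250" codec corresponds to what PostgreSQL calls WIN1250, and it mimics the server-side conversion with plain encode/decode calls. The sample text is an arbitrary Hungarian phrase.

```python
# Sample Hungarian text containing the letters ő and ű.
text = "árvíztűrő tükörfúrógép"

# What a WIN1250 client would send over the connection.
win1250_bytes = text.encode("cp1250")

# What a UTF8 database would store after converting the client's bytes.
stored = win1250_bytes.decode("cp1250").encode("utf-8")

# What the same WIN1250 client would get back on SELECT.
returned = stored.decode("utf-8").encode("cp1250")

print(len(win1250_bytes), len(stored))
print(returned == win1250_bytes)
```

The client sees the same bytes it sent, while the stored form is longer because accented letters take more than one byte in UTF-8.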

Yours,
Laurenz Albe

Re: Charset Win1250 on Windows and Ubuntu

From
Durumdara
Date:
Hi!

2009/12/19 Albe Laurenz <laurenz.albe@wien.gv.at>
> If you need the data in WIN1250 on the client side, change the client encoding to WIN1250.
>
> So:
> - Create the database with UTF8.
> - Change the client encoding to WIN1250 (e.g. by setting the environment variable PGCLIENTENCODING).
> - Import the dump of the Windows database. It will be converted to UTF-8.
> - Make sure that the client program has client encoding WIN1250.
>
> Yours,
> Laurenz Albe

So if I have Python and PyGreSQL, can I set this value in Python?
The main problem is that I don't want to set this value globally; other applications may want to use a different encoding.

Thanks for your help:
   dd

Re: Charset Win1250 on Windows and Ubuntu

From
Martijn van Oosterhout
Date:
On Mon, Dec 21, 2009 at 10:26:51AM +0100, Durumdara wrote:
> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally - possible
> another applications want to use another encoding...

Each connection can set the encoding to whatever they like. Something I
find useful is to setup the DB as UTF-8 but then do:

ALTER DATABASE foo SET client_encoding = latin9;

which sets the default for the DB, or

ALTER USER bar SET client_encoding = latin9;

Which lets you set the defaults for each user. This means that old
scripts can work unchanged but newer scripts can choose UTF-8 if they
want it.
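
Adapted to the WIN1250 case from this thread, the two statements could be generated as follows. This is a sketch only: the helper name is hypothetical, and "foo" and "bar" are the placeholder database and user names used above.

```python
def default_encoding_statements(database, user, encoding="WIN1250"):
    """Return ALTER statements that set a default client_encoding."""
    # Only accept plain identifiers, since the values are placed
    # directly into the SQL text.
    for value in (database, user, encoding):
        if not value.isidentifier():
            raise ValueError("unexpected identifier: %r" % value)
    return [
        # Default for every new session on this database.
        "ALTER DATABASE %s SET client_encoding = '%s'" % (database, encoding),
        # Default for every new session of this user.
        "ALTER USER %s SET client_encoding = '%s'" % (user, encoding),
    ]


for statement in default_encoding_statements("foo", "bar"):
    print(statement + ";")
```

These are defaults for new sessions; a session can still choose another encoding with SET client_encoding.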

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: Charset Win1250 on Windows and Ubuntu

From
Alban Hertroys
Date:
On 21 Dec 2009, at 10:26, Durumdara wrote:

> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally - possible another applications want to use another
> encoding....

Sure you can, just execute SET client_encoding TO 'WIN1250' once you've set up your connection. You can even do that
between queries if your client encoding requirements change between queries.
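
A minimal sketch of that per-connection approach in Python. It assumes a connection object with a query() method that executes SQL, as returned by pg.connect() in PyGreSQL's classic interface; the RecordingConnection class is a hypothetical stand-in so the example runs without a server, and the list of allowed encodings is illustrative.

```python
class RecordingConnection:
    """Hypothetical stand-in for a real connection; it only records the SQL."""

    def __init__(self):
        self.statements = []

    def query(self, sql):
        self.statements.append(sql)


# Illustrative whitelist, so arbitrary text never reaches the SQL string.
ALLOWED_ENCODINGS = {"WIN1250", "UTF8", "LATIN2"}


def set_client_encoding(conn, encoding):
    """Set the client encoding for this connection only."""
    if encoding not in ALLOWED_ENCODINGS:
        raise ValueError("unsupported client encoding: %r" % encoding)
    conn.query("SET client_encoding TO '%s'" % encoding)


conn = RecordingConnection()
set_client_encoding(conn, "WIN1250")
print(conn.statements)
```

Because the setting belongs to the session, other applications connecting to the same database keep whatever encoding they ask for.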

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


Re: Charset Win1250 on Windows and Ubuntu

From
"Albe Laurenz"
Date:
Durumdara wrote:
>> - Change the client encoding to WIN1250 (e.g. by
>> setting the environment variable PGCLIENTENCODING).
>
> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally
> - possible another applications want to use another encoding...

There may be special Python functions, but you can use the following
SQL statement: SET client_encoding TO 'WIN1250'

Yours,
Laurenz Albe

Re: Charset Win1250 on Windows and Ubuntu

From
Durumdara
Date:
Hi!

2009/12/21 Albe Laurenz <laurenz.albe@wien.gv.at>
> Durumdara wrote:
> >> - Change the client encoding to WIN1250 (e.g. by
> >> setting the environment variable PGCLIENTENCODING).
> >
> > So if I have Python and pygresql, can I set this value in Python?
> > The main problem that I don't want to set this value globally
> > - possible another applications want to use another encoding...
>
> There may be special Python functions, but you can use the following
> SQL statement: SET client_encoding TO 'WIN1250'

And what happening what DB recognize not win1250 character in SQL?
Is it converted to "?" or an exception dropped?
And if the UTF db contains non win1250 character?
Is it replaced in result with "?" or some exception dropped?

Thanks:
   dd

Re: Charset Win1250 on Windows and Ubuntu

From
"Albe Laurenz"
Date:
Durumdara wrote:
[client_encoding is switched to WIN1250]
> And what happening what DB recognize not win1250 character in SQL?
> Is it converted to "?" or an exception dropped?
> And if the UTF db contains non win1250 character?
> Is it replaced in result with "?" or some exception dropped?

What you wrote is very confusing/confused; this is probably a
language problem.

I'll try to reformulate your questions and answer them; if I
got something wrong, please tell me.

Q: What happens if your SQL statement contains a character that is not WIN1250 encoded?
   Is it converted to "?" or do you get an error?

A: You get an error (this is not Oracle). Here is an example for hex 88:
   ERROR:  character 0x88 of encoding "WIN1250" has no equivalent in "UTF8"
   Since every known character is representable in UTF-8, that means
   that this is an invalid byte in WIN1250.

Q: What happens if you select a character in the UTF8 database that cannot be
   converted to WIN1250?

A: You will also get an error. Here is what you get for selecting a "G clef":
   ERROR:  character 0xf09d849e of encoding "UTF8" has no equivalent in "WIN1250"
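
The same two failure cases can be reproduced outside the database. This sketch uses Python's own "cp1250" codec, which is assumed here to match WIN1250 for these characters; it shows analogous behavior and is not PostgreSQL's conversion code. The helper names are hypothetical.

```python
def try_decode_win1250(raw):
    """Return the decoded text, or None if a byte is not valid WIN1250."""
    try:
        return raw.decode("cp1250")
    except UnicodeDecodeError:
        return None


def try_encode_win1250(text):
    """Return the WIN1250 bytes, or None if a character has no equivalent."""
    try:
        return text.encode("cp1250")
    except UnicodeEncodeError:
        return None


g_clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF

print(try_decode_win1250(b"\x88"))    # byte 0x88 is unassigned in WIN1250
print(g_clef.encode("utf-8").hex())   # the UTF-8 bytes named in the error message
print(try_encode_win1250(g_clef))     # no WIN1250 equivalent exists
```

As with the server, neither case is silently replaced by "?"; the caller has to decide what to do with the failure.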

Yours,
Laurenz Albe