Thread: Hebrew support -- please help !

Hebrew support -- please help !

From
Elie Nacache
Date:
Hi all,
 
I develop an application (JAVA/JSP) on RedHat PostgreSQL 7.4.3 with PostgreSQL 7.4.2 JDBC3 with SSL (build 213).
 
This application needs to serve pages in LATIN1 and Hebrew.
For that I created two databases. The first is a DB in UNICODE encoding with client encoding LATIN1, and the JSP charset encoding is iso-8859-1; it works fine from client to server and from server to client.
 
So I decided to do the same thing for Hebrew: a DB in UNICODE with client encoding ISO-8859-8 and the JSP charset encoding set to iso-8859-8. With that setup I get strange characters.
 
I also tried a DB in UNICODE encoding with client encoding UNICODE and the JSP charset encoding set to UNICODE, but I get the same strange characters.
 
Does anyone have a solution or a way to resolve this problem?
 
Elie



Re: Hebrew support -- please help !

From
Tatsuo Ishii
Date:
> Hi all,
>
> I develop an application (JAVA/JSP) on RedHat PostgreSQL 7.4.3 with PostgreSQL 7.4.2 JDBC3 with SSL (build 213).
>
> This application needs to serve pages in LATIN1 and Hebrew.
> For that I create 2 Databases, one DB in UNICODE encoding with client encoding LATIN1 when in the JSP the charset encoding is iso-8859-1 and it's working fine from client to server and server to client.
>
> so I decide to do the same thing with Hebrew -- DB in UNICODE with client encoding ISO-8859-8 when in the JSP the charset encoding is iso-8859-8, then I got strange characters.
>
> I also tried to create DB in UNICODE encoding with client encoding UNICODE when in the JSP the charset encoding is UNICODE, but same problem I got strange characters.
>
> Does someone have a solution or a way to resolve this problem.

If I understand correctly, the JDBC driver issues "set client_encoding to
iso-8859-8" in your case. You should check that first. If it does the
right thing, then you might want to check the conversion maps. They are located at:

src/backend/utils/mb/Unicode/utf8_to_iso8859_8.map    // UNICODE(UTF-8) -> ISO-8859-8
src/backend/utils/mb/Unicode/iso8859_8_to_utf8.map    // ISO-8859-8 -> UNICODE(UTF-8)

If you find anything wrong, please let me know.
--
Tatsuo Ishii

Re: Hebrew support -- please help !

From
Guy Naor
Date:
Hi Elie,

>>  I develop an application (JAVA/JSP) on RedHat PostgreSQL 7.4.3 with
>>   PostgreSQL 7.4.2 JDBC3 with SSL (build 213).

>>  This application needs to serve pages in LATIN1 and Hebrew.
>>  For that I create 2 Databases, one DB in UNICODE encoding with client
>>   encoding LATIN1 when in the JSP the charset encoding is iso-8859-1 and
>>   it's working fine from client to server and server to client.

>>  so I decide to do the same thing with Hebrew -- DB in UNICODE with client
>>   encoding ISO-8859-8 when in the JSP the charset encoding is iso-8859-8,
>>   then I got strange characters.

>>  I also tried to create DB in UNICODE encoding with client encoding UNICODE
>>   when in the JSP the charset encoding is UNICODE, but same problem I got
>>   strange characters.

>>  Does someone have a solution or a way to resolve this problem.

Are you sure your client-side display and fonts are set correctly? Can you look at the ASCII codes you get back from
Postgres to make sure they are not actually the correct Hebrew characters? You can send me some of the strings you got by email and I'll
look at them if you want.
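
One way to inspect the codes, sketched in Java under some assumptions: the class name `CodePointDump` and its methods are hypothetical helpers, and the hard-coded string stands in for a value read with `ResultSet.getString(...)`. The idea is to print the code points instead of trusting the display. Genuine Hebrew letters fall in the Unicode Hebrew block (U+0590 to U+05FF).

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of the application discussed in this thread.
final class CodePointDump {
    // Formats every code point of s as U+XXXX, separated by single spaces.
    static String describe(String s) {
        StringJoiner out = new StringJoiner(" ");
        s.codePoints().forEach(cp -> out.add(String.format("U+%04X", cp)));
        return out.toString();
    }

    // True if every non-whitespace code point lies in the Unicode Hebrew block.
    // Note: an empty or all-whitespace string also yields true.
    static boolean isAllHebrew(String s) {
        return s.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .allMatch(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.HEBREW);
    }

    public static void main(String[] args) {
        // Stand-in for a value fetched from the database (assumption).
        String fromDb = "\u05e9\u05dc\u05d5\u05dd";
        System.out.println(describe(fromDb));
        System.out.println(isAllHebrew(fromDb));
    }
}
```

If the dump shows code points in the U+0080 to U+00FF range where Hebrew was expected, the bytes were probably decoded with the wrong charset somewhere before they reached the display.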

Bye,

Guy.

Re: Hebrew support -- please help !

From
Ulrich Wisser
Date:
Hi Elie,

Actually, it is much simpler if you use Unicode encoding on both sides.
Anyway, where do you see the strange characters? If you look at your DB
through command-line psql, it is crucial to have the correct encoding.

I usually work on the command line of my Linux box. I log in via PuTTY,
which on a Win98 machine is not capable of displaying Unicode characters
correctly.

Ulrich


--
------------------------------------------------------------
Relevant Traffic AB, Riddargatan 10, 11435 Stockholm, Sweden
Tel. +46-8-6789750             http://www.relevanttraffic.se


Re: Hebrew support -- please help !

From
Elie Nacache
Date:
Hi,
 
> If I understand correctly, JDBC driver issues "set client_encoding to
> iso-8859-8" in your case. You should check it first. If it does the
> right thing, then you might want to check the conversion maps. They are located:

> src/backend/utils/mb/Unicode/utf8_to_iso8859_8.map // UNICODE(UTF-8) -> ISO-8859-8
> src/backend/utils/mb/Unicode/iso8859_8_to_utf8.map // ISO-8859-8 -> UNICODE(UTF-8)

> If you find anything wrong, please let me know.
 
I installed PostgreSQL from RPM files:
  * postgresql-7.4.3-2PGDG.i686.rpm
  * postgresql-jdbc-7.4.3-2PGDG.i686.rpm
  * postgresql-libs-7.4.3-2PGDG.i686.rpm
  * postgresql-server-7.4.3-2PGDG.i686.rpm
 
So in my installation there are no map files, but there are a lot of .so files in /usr/lib/pgsql.
I can see that there is no utf8_and_iso8859_8.so file. How can I get or compile this file?
 
The second solution, which I would prefer but which failed, was to work only in UTF-8 on both the server and the client side. The DB was UNICODE, 'show client_encoding' returned unicode, and the charset in the JSP was utf-8. Any ideas?
 
Elie



Re: Hebrew support -- please help !

From
Tatsuo Ishii
Date:
> > If I understand correctly, JDBC driver issues "set client_encoding to
> > iso-8859-8" in your case. You should check it first. If it does the
> > right thing, then you might want to check the conversion maps. They are located:
>
> > src/backend/utils/mb/Unicode/utf8_to_iso8859_8.map // UNICODE(UTF-8) -> ISO-8859-8
> > src/backend/utils/mb/Unicode/iso8859_8_to_utf8.map // ISO-8859-8 -> UNICODE(UTF-8)
>
> > If you find anything wrong, please let me know.
>
> I installed postgresql from a rpm files:
>   * postgresql-7.4.3-2PGDG.i686.rpm
>   * postgresql-jdbc-7.4.3-2PGDG.i686.rpm
>   * postgresql-libs-7.4.3-2PGDG.i686.rpm
>   * postgresql-server-7.4.3-2PGDG.i686.rpm
>
> So in my installation there are no map file but there are a lot of so's in /usr/lib/pgsql.
> I can observe that there is no utf8_and_iso8859_8.so file. How can I got/compile this file ?

It's in utf8_and_iso8859.so.

> The second solution, that I prefer but that failed was to work server/client side only in utf8. The DB was UNICODE, the 'show client_encoding' returned unicode and the charset in the jsp was utf-8. Any idea !?

In PostgreSQL, unicode, utf8 and utf-8 are all equivalent.
--
Tatsuo Ishii

Re: Hebrew support -- please help !

From
Elie Nacache
Date:
Hi Guy,
 
> Are you sure your client side display and fonts are set correctly?
 
Yes, I can display Hebrew font.
 
> Can you look at the ASCII codes you get back from Postgres to make sure they are not the correct Hebrew characters.
 
I use pgAdmin3 on W2K to insert rows in Hebrew and can see the right characters.
 
My /etc/sysconfig/i18n file:
LANG="en_US.UTF-8"
SUPPORTED="fr_FR.UTF-8:fr_FR:fr:en_US.UTF-8:en_US:en"
SYSFONT="latarcyrheb-sun16"

Now I tried this configuration:
  * DB in UNICODE, 'set client_encoding to UNICODE'
  * Apache 2.0 + mod_jk2: no encoding setting that I can see
  * Tomcat 5: javaEncoding set to UTF8 (default value)
  * In each JSP:
       <%@ page contentType="text/html;charset=utf-8" language="java"%>
       <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 
Now I get the right information from the DB in Hebrew and French.
I still can't save a value in Hebrew from the client (browser) to the server, and in French, if I save 'été' in a varchar(3) column, I get a range over exception.
 
What is happening here? Is there something else to configure?
 
Elie



Re: Hebrew support -- please help !

From
John Sidney-Woollett
Date:
> Now I got the right information from the DB in Hebrew and french.
> Still can't save value in Hebrew from client (browser) to server and
> in french if I save 'été' in a column varchar(3) I got a range over
> exception.
>
> What happen here ! something else to config ?

This is a Tomcat question...
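
One plausible explanation for the varchar(3) error, sketched in Java: a servlet container that has not been told the request encoding falls back to ISO-8859-1, so the UTF-8 bytes the browser sent for 'été' are read as five characters instead of three. The class name `MisdecodedLength` is hypothetical, and that this is the actual cause in this setup is an assumption, not something the thread confirms.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class; simulates the suspected mis-decoding.
final class MisdecodedLength {
    // Encodes the typed text as UTF-8 (what the browser sends for a UTF-8 page)
    // and decodes those bytes as ISO-8859-1 (the container's fallback).
    static String decodeUtf8BytesAsLatin1(String typed) {
        byte[] wire = typed.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String typed = "\u00e9t\u00e9"; // the word 'été'
        String seen = decodeUtf8BytesAsLatin1(typed);
        // Each 'é' occupies two bytes in UTF-8, so the mis-decoded string is longer.
        System.out.println(typed.length() + " characters typed, " + seen.length() + " characters received");
    }
}
```

A five-character string no longer fits a varchar(3) column, which would match the symptom even though the database itself is configured correctly.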

When processing forms set the encoding of the form element using the
following attribute (in addition to all that you are doing):

enctype= "text/plain;charset=UTF-8"

eg <form action="blah" ... enctype= "text/plain;charset=UTF-8">

Also add a hidden field with special/accented characters in all your
forms. Query the parameter value (of the hidden field) in your
servlet/filter/jsp to see what you actually got. It could be that the
browser didn't respect your encoding, and you don't get back what you
put there in the first place!
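
The hidden-field check could be sketched like this. The class `SentinelCheck`, the sentinel value, and the result labels are all hypothetical; in a real servlet the `received` argument would come from `request.getParameter(...)` for the hidden field.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for checking a hidden "sentinel" form field.
final class SentinelCheck {
    // Value the hidden field is assumed to carry: e-acute followed by Hebrew shin.
    static final String SENTINEL = "\u00e9\u05e9";

    // Classifies what arrived on the server for the hidden field.
    static String classify(String received) {
        if (received == null) {
            return "missing";
        }
        if (SENTINEL.equals(received)) {
            return "ok";
        }
        // Undo a "UTF-8 bytes read as ISO-8859-1" decode and see if the sentinel reappears.
        String repaired = new String(
                received.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        if (SENTINEL.equals(repaired)) {
            return "utf8-read-as-latin1";
        }
        return "unknown";
    }
}
```

A result of "utf8-read-as-latin1" would suggest that the browser sent UTF-8 correctly and the server-side decoding is what needs fixing.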

Hope that helps.

John Sidney-Woollett


Re: Hebrew support -- please help !

From
Ulrich Wisser
Date:
Hi Elie,

> Now I got the right information from the DB in Hebrew and french.
> Still can't save value in Hebrew from client (browser) to server and in
> french if I save 'été' in a column varchar(3) I got a range over exception.

You are working through the web, which makes the whole thing a lot more
complex. Try to use UTF-8/UNICODE encoding all the way. That means
configuring Apache(!!!) and Tomcat to serve pages in UTF-8 encoding.
All web pages should have

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

Check in your browser if UTF-8 encoding is used. On the other operating
system in IE use View > Encoding, it should show UTF-8.

Actually, browser-to-server communication is the hardest place to ensure
that the right encoding is used.

We have a database with Japanese, Hebrew, German, Swedish, French,
Spanish and English content.

Ulrich



Re: Hebrew support -- please help !

From
Elie Nacache
Date:
Hi Ulrich
 
I finally resolved the problem.
 
Here is the solution:
DB server side:
===========
* DB encoding UNICODE
* DB client encoding UNICODE
 
JSP side:
=======
* <%@ page contentType="text/html;charset=UTF-8" language="java"%>
* <%@ page pageEncoding="ISO-8859-1"%>
* request.setCharacterEncoding("UTF-8");
* <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
 
Tomcat side: (server.xml file)
=========
<Connector port="8009" enableLookups="false" redirectPort="8443" debug="0"
    protocol="AJP/1.3" URIEncoding="UTF-8"/>
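
A rough illustration of why the request-side settings matter, using only the standard library: `request.setCharacterEncoding("UTF-8")` governs how the request body is decoded, and `URIEncoding` governs how the query string in the URL is decoded. The sketch below uses `java.net.URLDecoder` as a stand-in for that decoding step; the class name `FormDecoding` is hypothetical, and this is not how Tomcat is implemented internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demonstration of decoding the same percent-encoded bytes two ways.
final class FormDecoding {
    // Decodes a percent-encoded value with the given charset name.
    static String decode(String urlEncoded, String charset) {
        try {
            return URLDecoder.decode(urlEncoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes of a four-letter Hebrew word, as a browser would send them.
        String wire = "%D7%A9%D7%9C%D7%95%D7%9D";
        System.out.println(decode(wire, "UTF-8").length());      // decoded as intended
        System.out.println(decode(wire, "ISO-8859-1").length()); // decoded with the fallback charset
    }
}
```

With UTF-8 the eight bytes become four Hebrew letters; with ISO-8859-1 they become eight unrelated characters, which is the kind of "strange characters" result described earlier in the thread.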
 
For those who use a property bundle file (key/value): you need to save this file in
ISO-8859-1. If it contains Unicode characters, you have to convert the file with the native2ascii tool.
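
To show what that conversion does, here is a minimal sketch of the escaping in Java. The class `AsciiEscaper` and the key name `greeting` are hypothetical; the real native2ascii tool handles whole files, while this only illustrates the `\uXXXX` form and checks that `java.util.Properties` reads it back.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper that mimics the escaping step for a single string.
final class AsciiEscaper {
    // Replaces every non-ASCII UTF-16 unit with a backslash-u escape of four hex digits.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Writes "greeting=<escaped value>" and loads it back through Properties.
    // Only meant for simple values without backslashes, newlines or leading spaces.
    static String roundTrip(String value) {
        Properties props = new Properties();
        try {
            props.load(new StringReader("greeting=" + escape(value)));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        String hebrew = "\u05e9\u05dc\u05d5\u05dd";
        System.out.println(escape(hebrew));
        System.out.println(roundTrip(hebrew).equals(hebrew));
    }
}
```

The escaped file contains only ASCII, so it can safely be saved as ISO-8859-1 and still yield the original Hebrew text when the bundle is loaded.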
 
Thanks to all,
Elie

