Discussion: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in Chinese (Big5)!!! Version 7.2.1 (come with Redhat 7.3)]
-------- Original Message --------
Subject: [GENERAL] [Please Help!!!!!!!!] Problem in Chinese (Big5)!!! Version 7.2.1 (come with Redhat 7.3)
Date: Tue, 25 Jun 2002 11:03:17 +0800
From: Gordon Luk <gordon@gforce.ods.org>
To: pgsql-general@postgresql.org

Hi all,

I am running Redhat 7.3 and installed PostgreSQL 7.2.1 from the Redhat CD. I tried to create a new database encoded as EUC_TW, which should support Chinese (Big5). Then I used Pgadmin II to input the Chinese characters "中文字", and it was rejected like this:

ERROR : Invalid EUC_TW character sequence found (0xa672)....

When I input "中文", it is fine, so I know the problem is the Chinese character "字". But that character is perfectly normal, just like "A", "B", "C" in English, not a special character in Chinese. I have tried many more Chinese words with different encodings, like UNICODE and EUC_CN, and they are rejected too: "invalid.... character sequence...".

Has anyone experienced this case? How can I solve the problem? Please help, thanks.

Gordon

PS: In version 7.1.3 it worked fine with EUC_TW. Now I still cannot restore to 7.2.1... :-(

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
> I am running Redhat 7.3 and installed PostgreSQL 7.2.1 from the Redhat CD.
> I tried to create a new database encoded as EUC_TW, which should
> support Chinese (Big5). Then I used Pgadmin II to input the Chinese
> characters "中文字", and it was rejected like this:
>
> ERROR : Invalid EUC_TW character sequence found (0xa672)....
>
> When I input "中文", it is fine, so I know the problem is the Chinese
> character "字". But that character is perfectly normal, just like
> "A", "B", "C" in English, not a special character in Chinese. I have
> tried many more Chinese words with different encodings, like UNICODE
> and EUC_CN, and they are rejected too: "invalid.... character sequence...".
>
> Has anyone experienced this case? How can I solve the problem? Please help,
> thanks.

Honestly, I'm tired of this kind of complaint. Please verify your "correct" EUC_TW character sequences first. The bytes you sent for "中文字" cannot be correct EUC_TW at all. I have already shown Gene Leung "rules to verify your EUC_TW character sequences". See below.

BTW, I have no idea what Pgadmin II is. Are you sure that it supports EUC_TW? I suspect it only supports Big5. (EUC_TW and Big5 are completely different beasts.)

---------------------------------------------------------------
Ok, here are some rules to verify EUC_TW characters:

(1) if the first byte is 0x8e, then the 8th bit of the following three
    bytes must be set

(2) else if the first byte is 0x8f, then the 8th bit of the following two
    bytes must be set

(3) else if the 8th bit of the first byte is set, then the 8th bit of
    the following one byte must be set

(4) else (that means the 8th bit of the first byte is not set) then
    that must be an ASCII character.

Apparently 0xa672 does not satisfy the rules above: the first byte 0xa6 has its 8th bit set, so rule (3) applies, but the following byte 0x72 does not have its 8th bit set.
--
Tatsuo Ishii
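The four rules quoted above can be sketched as a small checker. This is only an illustration of those rules, not PostgreSQL's actual verification code; the function name `is_valid_euc_tw` is made up for this example, and the check looks only at the 8th bit, exactly as the rules do.

```python
def is_valid_euc_tw(data: bytes) -> bool:
    """Check a byte string against the four high-bit rules listed above.

    Hypothetical helper for illustration; it does not verify that the
    bytes map to real EUC_TW characters, only that the 8th bits line up.
    """
    i = 0
    n = len(data)
    while i < n:
        first = data[i]
        if first == 0x8E:
            trailing = 3          # rule (1): three following bytes
        elif first == 0x8F:
            trailing = 2          # rule (2): two following bytes
        elif first & 0x80:
            trailing = 1          # rule (3): one following byte
        else:
            i += 1                # rule (4): plain ASCII
            continue
        tail = data[i + 1 : i + 1 + trailing]
        # Every following byte must exist and have its 8th bit set.
        if len(tail) < trailing or any(not (b & 0x80) for b in tail):
            return False
        i += 1 + trailing
    return True


print(is_valid_euc_tw(b"\xa6\x72"))          # False: 0x72 has the 8th bit clear
print(is_valid_euc_tw(b"\xa4\xa4\xa4\xe5"))  # True: every byte has the 8th bit set
```

The second call suggests why the first two characters were accepted: their Big5 bytes happen to have the 8th bit set in every position, so they pass the check even though they were never EUC_TW.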
> Ok, but the problem is, when I try encoding with unicode, it also rejects
> me.... invalid UNICODE character.... :-(

Show me the entire error message please.

> I have already tried a few clients, like Borland's SQL Explorer, zde... and
> the restore program that comes with PostgreSQL...
>
> Sorry, I would like to make a special request. After reading your previous
> message to Gene Leung, I fully understand that under the EUC_TW rules
> PostgreSQL should reject me (for inputting such invalid characters). So I
> request a special patch that could support Big5 or disable the validation.
>
> If Postgres does not support Big5, that is a big problem for Chinese...
> Please help.

Actually PostgreSQL does support Big5. To use Big5, set the client encoding to Big5 and set the server (DB) encoding to EUC_TW. PostgreSQL will take care of the conversion between Big5 and EUC_TW.

There are several ways to set the client encoding to Big5:

SQL:                         set client_encoding to 'Big5';
from psql:                   \encoding Big5
using environment variable:  export PGCLIENTENCODING=Big5 (example for bash)

Hope this helps,
--
Tatsuo Ishii
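To see why the client encoding has to be declared as Big5, it helps to look at the bytes a Big5 client actually sends. The sketch below assumes the text in question is 中文字 and uses Python's standard `big5` codec, which is a stand-in here for whatever the client tool does internally.

```python
# Assumption: the text typed into the client was the three characters below.
text = "中文字"

# Encode the way a Big5 client would before sending it to the server.
big5 = text.encode("big5")

# Split into two-byte characters and show each as hex.
pairs = [big5[i : i + 2].hex() for i in range(0, len(big5), 2)]
print(pairs)  # ['a4a4', 'a4e5', 'a672']
```

The last pair is the 0xa672 from the error message: the server was handed Big5 bytes while expecting EUC_TW, which is the situation `set client_encoding to 'Big5';` resolves by telling the server to convert.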
Hi Tatsuo Ishii,

Ok, but the problem is, when I try encoding with unicode, it also rejects me.... invalid UNICODE character.... :-(

I have already tried a few clients, like Borland's SQL Explorer, zde... and the restore program that comes with PostgreSQL...

Sorry, I would like to make a special request. After reading your previous message to Gene Leung, I fully understand that under the EUC_TW rules PostgreSQL should reject me (for inputting such invalid characters). So I request a special patch that could support Big5 or disable the validation.

If Postgres does not support Big5, that is a big problem for Chinese... Please help.

Thanks for your quick response. (I got no response on the General mailing list :-( , maybe no one is using PostgreSQL with Chinese [Big5].)

Gordon

Tatsuo Ishii wrote:

>Honestly, I'm tired of this kind of complaint. Please verify your
>"correct" EUC_TW character sequences first. The bytes you sent for "中文字"
>cannot be correct EUC_TW at all. I have already shown
>Gene Leung "rules to verify your EUC_TW character sequences".
>See below.
>
>BTW, I have no idea what Pgadmin II is. Are you sure that it supports
>EUC_TW? I suspect it only supports Big5. (EUC_TW and Big5 are
>completely different beasts.)
>
>---------------------------------------------------------------
>Ok, here are some rules to verify EUC_TW characters:
>
>(1) if the first byte is 0x8e, then the 8th bit of the following three
>    bytes must be set
>
>(2) else if the first byte is 0x8f, then the 8th bit of the following two
>    bytes must be set
>
>(3) else if the 8th bit of the first byte is set, then the 8th bit of
>    the following one byte must be set
>
>(4) else (that means the 8th bit of the first byte is not set) then
>    that must be an ASCII character.
>
>Apparently 0xa672 does not satisfy the rules above.
>
>--
>Tatsuo Ishii
Tatsuo Ishii wrote:

>>Ok, but the problem is, when I try encoding with unicode, it also rejects
>>me.... invalid UNICODE character.... :-(
>
>Show me the entire error message please.

Ok, the error is like this:

ERROR : Invalid UNICODE character sequence found (0xe5a672)...

The input characters were again "中文字".

>Actually PostgreSQL does support Big5. To use Big5, set the client
>encoding to Big5 and set the server (DB) encoding to EUC_TW. PostgreSQL
>will take care of the conversion between Big5 and EUC_TW.
>
>There are several ways to set the client encoding to Big5:
>
>SQL: set client_encoding to 'Big5';
>from psql: \encoding Big5
>using environment variable: export PGCLIENTENCODING=Big5 (example for bash)
>
>Hope this helps,
>--
>Tatsuo Ishii

Oh, you are right. I used Pgadmin II, typed the SQL by hand, and added "set client_encoding to 'Big5';" before the insert statement. It WORKS!!!! Thanks.

Gordon
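The UNICODE error has the same cause as the EUC_TW one. A rough sketch, assuming the UNICODE database expects UTF-8 and using hypothetical helper names, shows why the reported bytes 0xe5a672 cannot be a valid sequence: 0xe5 announces a three-byte character, but 0x72 is not a continuation byte.

```python
def utf8_expected_length(lead: int) -> int:
    """Sequence length implied by a UTF-8 lead byte, or 0 if it cannot lead."""
    if (lead & 0x80) == 0x00:
        return 1                  # 0xxxxxxx: ASCII
    if (lead & 0xE0) == 0xC0:
        return 2                  # 110xxxxx
    if (lead & 0xF0) == 0xE0:
        return 3                  # 1110xxxx
    if (lead & 0xF8) == 0xF0:
        return 4                  # 11110xxx
    return 0                      # continuation byte or invalid lead


def is_continuation(byte: int) -> bool:
    """UTF-8 continuation bytes have the form 10xxxxxx."""
    return (byte & 0xC0) == 0x80


seq = bytes.fromhex("e5a672")     # the bytes from the error message
expected = utf8_expected_length(seq[0])
flags = [is_continuation(b) for b in seq[1:expected]]

try:
    seq.decode("utf-8")
    decodes = True
except UnicodeDecodeError:
    decodes = False

print(expected, flags, decodes)   # 3 [True, False] False
```

So the client was sending Big5 bytes in both attempts; changing only the database encoding could not help, while declaring the client encoding did.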