Discussion: Encoding for error messages during connect
Hi,

when the server is set to e.g. lc_messages = 'German_Germany.1252', error messages during connect are not properly decoded by the driver (or encoded by the server?).

At least when the password is incorrect, the German error message

    Passwort-Authentifizierung für Benutzer »thomas« fehlgeschlagen

is incorrectly received by the driver as

    Passwort-Authentifizierung f?r Benutzer ?thomas? fehlgeschlagen

After debugging the driver I found out that it creates the stream for the startup communication using US_ASCII encoding, which yields incorrect characters for anything beyond ASCII 127.

I debugged the data that is received from the server, which showed that the message arrives in a single-byte encoding. That seems correct, as 'German_Germany.1252' is indeed a single-byte encoding.

I changed the stream that the driver uses during connect to use a different encoding by changing org.postgresql.core.v3.ConnectionFactoryImpl and adding the line

    newStream.setEncoding(Encoding.getDatabaseEncoding("ISO-8859-1"));

after line 77 (where newStream = new PGStream(host, port) is done).

In that case the error message is decoded properly by the driver.

Now, I don't think it would be possible for the driver to find out which encoding to use for that stream before actually having a connection. So it would need to evaluate some kind of client-side information, e.g. the lc_messages environment variable on the client, or a connection property that would then be used to initialize the stream correctly.

Personally I'd prefer a connection property (something like "messageEncoding") to control this, as it can be part of the JDBC URL, which is usually configurable in a Java environment.

What do you think?

Regards
Thomas
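For readers who want to reproduce the effect outside the driver, here is a minimal sketch in plain Java. It does not use any pgjdbc classes; the class and method names are made up for illustration. It assumes the server sent the message as single-byte code page 1252 data. The bytes are produced with ISO-8859-1 here because ü, » and « have the same byte values in both encodings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the PostgreSQL JDBC driver.
final class ConnectErrorDecoding {

    // The German message from the report, written with Unicode escapes
    // so the example does not depend on the source file encoding.
    static final String MESSAGE =
        "Passwort-Authentifizierung f\u00fcr Benutzer \u00bbthomas\u00ab fehlgeschlagen";

    // Simulates the raw bytes as a single-byte encoding would put them on the wire.
    static byte[] serverBytes() {
        return MESSAGE.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decodes the raw bytes with the given charset, as a client stream would.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = serverBytes();
        // Bytes above 127 cannot be represented in US-ASCII and are replaced.
        System.out.println(decode(raw, StandardCharsets.US_ASCII));
        // A matching single-byte charset restores the original text.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
    }
}
```

Decoding with US-ASCII loses the three non-ASCII characters, while decoding with ISO-8859-1 round-trips the message, which matches what was observed after patching ConnectionFactoryImpl.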
Any comments on this?

Thomas Kellerer, 05.11.2011 12:12:
> Hi,
>
> when the server is set to e.g. lc_messages = 'German_Germany.1252'
> then error messages during connect are not properly decoded by the
> driver (or encoded by the server?)
> [...]
On 11/17/2011 12:17 AM, Thomas Kellerer wrote:
>> Now I don't think it would be possible for the driver to find out
>> which encoding to use for that stream before actually having a
>> connection. So it would need to evaluate some kind of client side
>> information, e.g. the lc_messages environment variable on the client
>> or through a connection property that would then be used to
>> initialize the stream correctly.
>>

It's not a good assumption that the client environment will match the server environment.

>> Personally I'd prefer a connection property (something like
>> "messageEncoding") to control this as this can be part of the JDBC
>> URL which is usually configurable in a Java environment.

This seems more reasonable. Previously we discussed how to send usernames and passwords to the database because the encoding they are sent in must match the encoding of the database these values were set in (which may be different than the database you're connecting to). At the time we decided that a connection option to configure this wasn't the right idea and now always send these values as UTF-8. I don't recall why we made that decision, but checking the archives might provide some additional information on this case.

Kris Jurka
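To make the connection-property proposal concrete, the following is a rough sketch of how such an option might be resolved on the client. It is only an illustration: "messageEncoding" is the name floated in this thread, not an existing driver property, and the class below is hypothetical. It assumes the driver would keep its current US_ASCII behaviour when the property is absent or names an unknown charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Properties;

// Hypothetical helper, not part of the PostgreSQL JDBC driver.
final class StartupEncodingOption {

    // Property name as proposed in the thread; an assumption, not a real driver option.
    static final String PROPERTY = "messageEncoding";

    // Returns the charset to use for the startup stream,
    // falling back to US-ASCII if nothing usable is configured.
    static Charset resolve(Properties info) {
        String name = info.getProperty(PROPERTY);
        if (name == null || name.trim().isEmpty()) {
            return StandardCharsets.US_ASCII;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Unknown or malformed name: keep the existing default.
            return StandardCharsets.US_ASCII;
        }
    }
}
```

With something like this, a user whose server runs with a code page 1252 locale could pass messageEncoding=ISO-8859-1 in the connection properties, and the startup stream would be initialized with the resolved charset instead of the fixed default.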