Thread: Re: [GENERAL] invalid byte sequence ?

Re: [GENERAL] invalid byte sequence ?

From
Peter Eisentraut
Date:
On Thursday, 24 August 2006 00:52, Tom Lane wrote:
> A possible solution therefore is to have psql or libpq drive the
> client_encoding off the client's locale environment instead of letting
> it default to equal the server_encoding.

I got started on this and just wanted to post an intermediate patch.  I have
taken the logic from initdb, placed it into libpq, and refined the API a
bit.  At this point, there should be no behavioral change.  It remains to
make libpq use this stuff if PGCLIENTENCODING is not set.  Unless someone
beats me to it, I'll figure that out later.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
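
For readers who have not seen the initdb logic mentioned above, here is a minimal sketch of the general technique: ask the OS for the codeset of the environment locale and translate it to a PostgreSQL encoding name. It is not the patch under discussion. The function names `encoding_name_from_codeset` and `encoding_name_from_environment` are hypothetical, the table is a small illustrative subset, and the code assumes a POSIX system that provides `nl_langinfo(CODESET)`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose strdup, strcasecmp, nl_langinfo */
#endif

#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Illustrative subset of codeset-to-encoding pairs; not the real initdb table. */
struct codeset_map
{
    const char *codeset;    /* name reported by nl_langinfo(CODESET) */
    const char *pg_name;    /* PostgreSQL encoding name */
};

static const struct codeset_map codeset_table[] = {
    {"UTF-8", "UTF8"},
    {"ISO-8859-1", "LATIN1"},
    {"ISO-8859-15", "LATIN9"},
    {"EUC-JP", "EUC_JP"},
    {"ANSI_X3.4-1968", "SQL_ASCII"} /* glibc's codeset name in the "C" locale */
};

/* Hypothetical helper: map an OS codeset name to an encoding name, or NULL. */
const char *
encoding_name_from_codeset(const char *codeset)
{
    size_t i;

    assert(codeset != NULL);
    for (i = 0; i < sizeof(codeset_table) / sizeof(codeset_table[0]); i++)
    {
        /* Codeset spellings vary in case between platforms. */
        if (strcasecmp(codeset, codeset_table[i].codeset) == 0)
            return codeset_table[i].pg_name;
    }
    return NULL;
}

/*
 * Hypothetical helper: look up the codeset of the user's environment locale
 * and translate it.  LC_CTYPE is restored afterwards so that the caller's
 * locale settings are left as they were.
 */
const char *
encoding_name_from_environment(void)
{
    const char *current = setlocale(LC_CTYPE, NULL);
    const char *result = NULL;
    char *saved;

    if (current == NULL)
        return NULL;
    saved = strdup(current);    /* setlocale may overwrite its return buffer */
    if (saved == NULL)
        return NULL;

    if (setlocale(LC_CTYPE, "") != NULL)
        result = encoding_name_from_codeset(nl_langinfo(CODESET));

    setlocale(LC_CTYPE, saved);
    free(saved);
    return result;
}
```

A caller would use the result of `encoding_name_from_environment()` only when it is non-NULL and PGCLIENTENCODING is unset, and fall back to the existing behavior otherwise.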

Re: [GENERAL] invalid byte sequence ?

From
Martijn van Oosterhout
Date:
On Fri, Aug 25, 2006 at 05:07:03PM +0200, Peter Eisentraut wrote:
> I got started on this and just wanted to post an intermediate patch.  I have
> taken the logic from initdb, placed it into libpq, and refined the API a
> bit.  At this point, there should be no behavioral change.  It remains to
> make libpq use this stuff if PGCLIENTENCODING is not set.  Unless someone
> beats me to it, I'll figure that out later.

Umm, why export all these functions? For starters, does this even need
to be in libpq? I wouldn't have thought so at first glance,
especially not three functions. The only thing you need is to take a
locale name and return the charset you can pass to PQsetClientEncoding.

In fact, the only thing you need is PQsetClientEncodingFromLocale();
anything else is just sugar. Why would the user care about what the OS
calls it? We have a "pg_enc" enum, so let's use it.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [GENERAL] invalid byte sequence ?

From
Peter Eisentraut
Date:
On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote:
> Umm, why export all these functions? For starters, does this even need
> to be in libpq?

Where else would you put it?

> In fact, the only thing you need is PQsetClientEncodingFromLocale();
> anything else is just sugar. Why would the user care about what the OS
> calls it? We have a "pg_enc" enum, so let's use it.

initdb has different requirements.  Let me know if you have a different way to 
refactor it that satisfies initdb.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: [GENERAL] invalid byte sequence ?

From
Martijn van Oosterhout
Date:
On Fri, Aug 25, 2006 at 05:38:20PM +0200, Peter Eisentraut wrote:
> > In fact, the only thing you need is PQsetClientEncodingFromLocale();
> > anything else is just sugar. Why would the user care about what the OS
> > calls it? We have a "pg_enc" enum, so let's use it.
>
> initdb has different requirements.  Let me know if you have a different way to
> refactor it that satisfies initdb.

Well, check_encodings_match(pg_enc,ctype) is simply a short way of
saying: if(find_matching_encoding(ctype) != pg_enc ) { error }.
And get_encoding_from_locale() is not used outside of those functions.

So the only thing initdb actually needs is an implementation of
find_matching_encoding(ctype), which returns a value of "enum pg_enc".
check_encodings_match() stays in initdb, and get_encoding_from_locale()
becomes internal to libpq.

How does that sound?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
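
The split proposed above can be sketched as follows. This is an illustration under stated assumptions, not initdb's actual code: `demo_enc` is a toy stand-in for the real `pg_enc` enum, and this `find_matching_encoding` simply reads the codeset suffix of the locale name (the part after the dot) rather than querying the OS, so locale names without a suffix, or with an `@modifier`, are reported as unknown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for PostgreSQL's "enum pg_enc"; values are illustrative. */
typedef enum
{
    DEMO_ENC_UNKNOWN = -1,
    DEMO_SQL_ASCII,
    DEMO_UTF8,
    DEMO_LATIN1
} demo_enc;

/* Return the text after the first '.', e.g. "UTF-8" for "en_US.UTF-8". */
static const char *
codeset_of_locale_name(const char *ctype)
{
    const char *dot;

    assert(ctype != NULL);
    dot = strchr(ctype, '.');
    return (dot != NULL) ? dot + 1 : NULL;
}

/* The one function initdb would need from the shared code. */
demo_enc
find_matching_encoding(const char *ctype)
{
    const char *codeset = codeset_of_locale_name(ctype);

    if (codeset == NULL)
        return DEMO_ENC_UNKNOWN;
    if (strcmp(codeset, "UTF-8") == 0 || strcmp(codeset, "utf8") == 0)
        return DEMO_UTF8;
    if (strcmp(codeset, "ISO-8859-1") == 0)
        return DEMO_LATIN1;
    return DEMO_ENC_UNKNOWN;
}

/* Stays in initdb: a thin wrapper that compares and reports. */
bool
check_encodings_match(demo_enc enc, const char *ctype)
{
    if (find_matching_encoding(ctype) != enc)
    {
        fprintf(stderr, "encoding does not match locale \"%s\"\n", ctype);
        return false;
    }
    return true;
}
```

With this shape, the lookup that talks to the OS can stay private to whichever library hosts it, and the caller only ever sees enum values.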

Re: [GENERAL] invalid byte sequence ?

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote:
>> Umm, why export all these functions? For starters, does this even need
>> to be in libpq?

> Where else would you put it?
> ...
> initdb has different requirements.  Let me know if you have a different way to 
> refactor it that satisfies initdb.

Um, but initdb doesn't use libpq, so it's going to need its own copy
anyway.  I agree with Martijn that putting these into libpq's API
seems like useless clutter.
        regards, tom lane


Re: [GENERAL] invalid byte sequence ?

From
Peter Eisentraut
Date:
Tom Lane wrote:
> Um, but initdb doesn't use libpq, so it's going to need its own copy
> anyway.

initdb certainly links against libpq.

> I agree with Martijn that putting these into libpq's API 
> seems like useless clutter.

Where else to put it?  We need it in libpq anyway if we want this 
behavior in all client applications (by default).

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: [GENERAL] invalid byte sequence ?

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane wrote:
>> I agree with Martijn that putting these into libpq's API 
>> seems like useless clutter.

> Where else to put it?  We need it in libpq anyway if we want this 
> behavior in all client applications (by default).

Having the code in libpq doesn't necessarily mean exposing it to the
outside world.  I can't see a reason for these to be in the API at all.

Possibly we could avoid the duplication-of-source-code issue by putting
the code in libpgport, or someplace, whence both initdb and libpq could
get at it?
        regards, tom lane


Re: [GENERAL] invalid byte sequence ?

From
Martijn van Oosterhout
Date:
On Fri, Aug 25, 2006 at 08:13:39PM +0200, Peter Eisentraut wrote:
> > I agree with Martijn that putting these into libpq's API
> > seems like useless clutter.
>
> Where else to put it?  We need it in libpq anyway if we want this
> behavior in all client applications (by default).

Is that so? I thought we were only talking about psql. Even then, I'm
wondering if we should alter the current behaviour at all if stdout is
not a tty (i.e. psql is run as part of a pipe).

And as a counter-example: pg_dump should absolutely not use the client
locale; it should always dump in the same encoding as the server...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
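
The tty condition raised above could be tested along these lines. This is only a sketch of the idea, not something psql is known to do; the function name `locale_default_applies` is hypothetical, and the code assumes a POSIX system with `isatty()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether a locale-derived default encoding
 * should be considered at all, i.e. whether output goes to a terminal.
 * When output is a pipe or a file, the existing behaviour would be kept.
 */
bool
locale_default_applies(int output_fd)
{
    assert(output_fd >= 0);
    return isatty(output_fd) == 1;
}
```

An interactive client would call `locale_default_applies(STDOUT_FILENO)` before consulting the locale, so that scripted runs such as `psql ... | some_filter` see no change.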

Re: [GENERAL] invalid byte sequence ?

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> And as a counter-example: pg_dump should absolutely not use the client
> locale; it should always dump in the same encoding as the server...

Sure, but pg_dump should set that explicitly.  I'm prepared to believe
that looking at the locale is sane for all normal clients.

It might be worth providing a way to set the client_encoding through a
PQconnectdb connection-string keyword, just in case the
override-via-PGCLIENTENCODING dodge doesn't suit someone.  The priority
order would presumably be connection string, then PGCLIENTENCODING, then
locale.
        regards, tom lane
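
The priority order suggested above (connection string, then PGCLIENTENCODING, then locale) amounts to "first non-empty candidate wins". A minimal sketch, assuming the connection-string value and the locale-derived value have already been obtained elsewhere; the function names `first_nonempty` and `choose_client_encoding` are hypothetical and not part of libpq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return the first candidate that is neither NULL nor empty, else NULL. */
const char *
first_nonempty(const char *const candidates[], size_t count)
{
    size_t i;

    assert(candidates != NULL || count == 0);
    for (i = 0; i < count; i++)
    {
        if (candidates[i] != NULL && candidates[i][0] != '\0')
            return candidates[i];
    }
    return NULL;
}

/*
 * Hypothetical resolver: conn_value is the encoding named in the connection
 * string (NULL if absent), locale_value is the encoding derived from the
 * locale (NULL if unknown).  PGCLIENTENCODING sits between the two.
 */
const char *
choose_client_encoding(const char *conn_value, const char *locale_value)
{
    const char *candidates[3];

    candidates[0] = conn_value;
    candidates[1] = getenv("PGCLIENTENCODING");
    candidates[2] = locale_value;
    return first_nonempty(candidates, 3);
}
```

A NULL result would mean none of the three sources supplied a value, in which case the client would keep today's default of matching server_encoding. A tool such as pg_dump would bypass this resolver and set its encoding explicitly.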


Re: [GENERAL] invalid byte sequence ?

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > And as a counter-example: pg_dump should absolutely not use the client
> > locale; it should always dump in the same encoding as the server...
> 
> Sure, but pg_dump should set that explicitly.  I'm prepared to believe
> that looking at the locale is sane for all normal clients.

What are "normal clients"?  I would think that programs in PHP or Perl
have their own idea of the correct encoding (JDBC already has one).

> It might be worth providing a way to set the client_encoding through a
> PQconnectdb connection-string keyword, just in case the
> override-via-PGCLIENTENCODING dodge doesn't suit someone.  The priority
> order would presumably be connection string, then PGCLIENTENCODING, then
> locale.

This sounds like a good idea anyway...

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support