Обсуждение: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal support
This patch adds calls to Tcl_ExternalToUtf and Tcl_UtfToExternal functions on parameters of SPI functions. Without this calls it's imposiible to work with 8-bit text in pl/tcl functions with tcl-8.3. Patch assumes that database encoding and system encoding of Tcl is equal. Patch is agains current CVS sources. It adds new configure switch --enable-pltcl-utf. Without this switch patch does nothing.
Вложения
Vsevolod Lobko <seva@sevasoft.kiev.ua> writes:
> This patch adds calls to Tcl_ExternalToUtf and Tcl_UtfToExternal
> functions on parameters of SPI functions.
I hate to say it, but this is an amazingly ugly patch.  Can't you
do it without so many #ifdefs and duplicating a lot of code?  Think
of the next guy who has to look at/work on this code.
Possibly a macro that invokes Tcl_ExternalToUtfDString or does nothing
might help.
> Patch assumes that database encoding and system encoding of Tcl is
> equal.
Hmm, is that a tenable assumption?  I don't know, I'm just asking.
> It adds new configure switch
> --enable-pltcl-utf. Without this switch patch does nothing.
This is not a good approach, since it relies on the user to know whether
he needs to do this or not.  Seems like looking at the Tcl version
number would be more appropriate: if version >= 8.3 then make these
changes.  You could do that in the code, without any configure patch,
using #if's on TCL_MAJOR_VERSION and TCL_MINOR_VERSION (see libpgtcl
for examples).
            regards, tom lane
			
		Is there a way to make this automatically enabled based on the TCL version? Does that make sense? > This patch adds calls to Tcl_ExternalToUtf and Tcl_UtfToExternal > functions on parameters of SPI functions. > > Without this calls it's imposiible to work with 8-bit text in pl/tcl > functions with tcl-8.3. > > Patch assumes that database encoding and system encoding of Tcl is > equal. > > Patch is agains current CVS sources. It adds new configure switch > --enable-pltcl-utf. Without this switch patch does nothing. > Content-Description: [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://www.postgresql.org/search.mpl -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Thu, 23 Aug 2001, Bruce Momjian wrote: > > Is there a way to make this automatically enabled based on the TCL > version? Does that make sense? This adds significant overhead to pl/tcl calls :(( so I'm not sure that anybody want this functionality enabled by default
On Thu, 23 Aug 2001, Tom Lane wrote: > Vsevolod Lobko <seva@sevasoft.kiev.ua> writes: > > This patch adds calls to Tcl_ExternalToUtf and Tcl_UtfToExternal > > functions on parameters of SPI functions. > > I hate to say it, but this is an amazingly ugly patch. Can't you > do it without so many #ifdefs and duplicating a lot of code? Think > of the next guy who has to look at/work on this code. There are only two problems: 1) I need temporary Tcl_DString 2) I need to free it after use So I can go with three macros (one for declaration, one for conversion, and one for free) but I dont think it will be better > Possibly a macro that invokes Tcl_ExternalToUtfDString or does nothing > might help. > > > Patch assumes that database encoding and system encoding of Tcl is > > equal. > > Hmm, is that a tenable assumption? I don't know, I'm just asking. Yes, because it does 8-bit to unicode conversion and must to know codepage for 8-bit characters. Unfortunately charset names for tcl and postgres does not match, so this demands additional field in charset tables or additional table :(( > > It adds new configure switch > > --enable-pltcl-utf. Without this switch patch does nothing. > > This is not a good approach, since it relies on the user to know whether > he needs to do this or not. Seems like looking at the Tcl version > number would be more appropriate: if version >= 8.3 then make these > changes. You could do that in the code, without any configure patch, > using #if's on TCL_MAJOR_VERSION and TCL_MINOR_VERSION (see libpgtcl > for examples). See previous reply, this patch adds significant overhead.
Vsevolod Lobko <seva@sevasoft.kiev.ua> writes:
> #ifdef ENABLE_PLTCL_UTF
> #warning bubu
> #       define UTF_BEGIN        do { Tcl_DString _pltcl_ds_tmp;
> #       define UTF_END          Tcl_DStringFree(&_pltcl_ds_tmp); } while (0);
> #       define UTF_U2E(x)       (Tcl_UtfToExternalDString(NULL,(x),-1,\
>                     &_pltcl_ds_tmp))
> #       define UTF_E2U(x)       (Tcl_ExternalToUtfDString(NULL,(x),-1,\
>                     &_pltcl_ds_tmp))
> #else /* ENABLE_PLTCL_UTF */
> #       define UTF_BEGIN
> #       define UTF_END
> #       define UTF_U2E(x)       (x)
> #       define UTF_E2U(x)       (x)
> #endif /* ENABLE_PLTCL_UTF */
> and
>                 if (part != NULL)
>                 {
>                         UTF_BEGIN
>                         Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1);
>                         UTF_END
>                         pfree(part);
>                 }
> Is this looks better?
It does, but one small gripe: the lack of semicolons will probably cause
pg_indent to mess up the indentation.  (I know emacs' autoindent mode
will not work nicely with it, either.)  Please set up the macros so that
you write
                        UTF_BEGIN;
                        Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1);
                        UTF_END;
and then I'll be happy.
Your point about overhead is a good one, so I retract the gripe about
using a configure switch.  But please include documentation patches to
describe the configure option in the administrator's guide (installation
section).
            regards, tom lane
			
		
On Thu, 23 Aug 2001, Tom Lane wrote:
> Vsevolod Lobko <seva@sevasoft.kiev.ua> writes:
> > This patch adds calls to Tcl_ExternalToUtf and Tcl_UtfToExternal
> > functions on parameters of SPI functions.
>
> I hate to say it, but this is an amazingly ugly patch.  Can't you
> do it without so many #ifdefs and duplicating a lot of code?  Think
> of the next guy who has to look at/work on this code.
>
> Possibly a macro that invokes Tcl_ExternalToUtfDString or does nothing
> might help.
something like this?
#ifdef ENABLE_PLTCL_UTF
#warning bubu
#       define UTF_BEGIN        do { Tcl_DString _pltcl_ds_tmp;
#       define UTF_END          Tcl_DStringFree(&_pltcl_ds_tmp); } while (0);
#       define UTF_U2E(x)       (Tcl_UtfToExternalDString(NULL,(x),-1,\
                    &_pltcl_ds_tmp))
#       define UTF_E2U(x)       (Tcl_ExternalToUtfDString(NULL,(x),-1,\
                    &_pltcl_ds_tmp))
#else /* ENABLE_PLTCL_UTF */
#       define UTF_BEGIN
#       define UTF_END
#       define UTF_U2E(x)       (x)
#       define UTF_E2U(x)       (x)
#endif /* ENABLE_PLTCL_UTF */
and
                if (part != NULL)
                {
                        UTF_BEGIN
                        Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1);
                        UTF_END
                        pfree(part);
                }
Is this looks better?
			
		> > It adds new configure switch > > --enable-pltcl-utf. Without this switch patch does nothing. > > This is not a good approach, since it relies on the user to know whether > he needs to do this or not. Seems like looking at the Tcl version > number would be more appropriate: if version >= 8.3 then make these > changes. You could do that in the code, without any configure patch, > using #if's on TCL_MAJOR_VERSION and TCL_MINOR_VERSION (see libpgtcl > for examples). I asked him this already and he said that there is extra overhead for the conversion that many wouldn't want. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Vsevolod Lobko writes: > > Is there a way to make this automatically enabled based on the TCL > > version? Does that make sense? > > This adds significant overhead to pl/tcl calls :(( > so I'm not sure that anybody want this functionality enabled by default It is my understanding that Tcl 8.3 will not work with 8 bit characters without this functionality. That would simply be unacceptable. If users want better performance they should use a different Tcl version, but we should not override Tcl's design decisions unilaterally. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Thu, 23 Aug 2001, Peter Eisentraut wrote: > Vsevolod Lobko writes: > > > > Is there a way to make this automatically enabled based on the TCL > > > version? Does that make sense? > > > > This adds significant overhead to pl/tcl calls :(( > > so I'm not sure that anybody want this functionality enabled by default > > It is my understanding that Tcl 8.3 will not work with 8 bit characters > without this functionality. That would simply be unacceptable. If users Really - yes. I can make this automatic based on tcl version if performance penalty on tcl>=8.3 is acceptable. > want better performance they should use a different Tcl version, but we > should not override Tcl's design decisions unilaterally. > > -- > Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >
> > > On Thu, 23 Aug 2001, Peter Eisentraut wrote: > > > Vsevolod Lobko writes: > > > > > > Is there a way to make this automatically enabled based on the TCL > > > > version? Does that make sense? > > > > > > This adds significant overhead to pl/tcl calls :(( > > > so I'm not sure that anybody want this functionality enabled by default > > > > It is my understanding that Tcl 8.3 will not work with 8 bit characters > > without this functionality. That would simply be unacceptable. If users > > Really - yes. > I can make this automatic based on tcl version if performance penalty on > tcl>=8.3 is acceptable. It is my understanding that the UTF penalty applies to all the TCL versions >= 8.1 and that there is debate whether this is a good idea or not. However, seeing as TCL has the penalty, I don't see a problem with having the same penalty in our code. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Thu, 23 Aug 2001, Tom Lane wrote:
> > Is this looks better?
>
> It does, but one small gripe: the lack of semicolons will probably cause
> pg_indent to mess up the indentation.  (I know emacs' autoindent mode
> will not work nicely with it, either.)  Please set up the macros so that
> you write
>
>                         UTF_BEGIN;
>                         Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1);
>                         UTF_END;
>
> and then I'll be happy.
Attached revised patch
> Your point about overhead is a good one, so I retract the gripe about
> using a configure switch.  But please include documentation patches to
> describe the configure option in the administrator's guide (installation
> section).
This patch still uses configure switch for enabling feature.
For enabling based on tcl version we have 2 posibilites:
 1) having feature enabled by default, but in pltcl.c check for tcl
    version and disable it for old versions
 2) enable or disable at configure time based on tcl version, but there
    are problem - current configure don't checks for tcl version at all
    and my configure skills not enought for adding this
			
		Вложения
Next version of patch. Now with documentation update and disabling of UTF conversion for Tcl <=8.0 On Fri, 24 Aug 2001, Vsevolod Lobko wrote: > On Thu, 23 Aug 2001, Tom Lane wrote: > > > > Is this looks better? > > > > It does, but one small gripe: the lack of semicolons will probably cause > > pg_indent to mess up the indentation. (I know emacs' autoindent mode > > will not work nicely with it, either.) Please set up the macros so that > > you write > > > > UTF_BEGIN; > > Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1); > > UTF_END; > > > > and then I'll be happy. > > Attached revised patch > > > Your point about overhead is a good one, so I retract the gripe about > > using a configure switch. But please include documentation patches to > > describe the configure option in the administrator's guide (installation > > section). > > This patch still uses configure switch for enabling feature. > > For enabling based on tcl version we have 2 posibilites: > 1) having feature enabled by default, but in pltcl.c check for tcl > version and disable it for old versions > 2) enable or disable at configure time based on tcl version, but there > are problem - current configure don't checks for tcl version at all > and my configure skills not enought for adding this >
Вложения
Your patch has been added to the PostgreSQL unapplied patches list at:
    http://candle.pha.pa.us/cgi-bin/pgpatches
I will try to apply it within the next 48 hours.
> Next version of patch.
> Now with documentation update and disabling of UTF conversion for Tcl <=8.0
>
> On Fri, 24 Aug 2001, Vsevolod Lobko wrote:
>
> > On Thu, 23 Aug 2001, Tom Lane wrote:
> >
> > > > Is this looks better?
> > >
> > > It does, but one small gripe: the lack of semicolons will probably cause
> > > pg_indent to mess up the indentation.  (I know emacs' autoindent mode
> > > will not work nicely with it, either.)  Please set up the macros so that
> > > you write
> > >
> > >                         UTF_BEGIN;
> > >                         Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1);
> > >                         UTF_END;
> > >
> > > and then I'll be happy.
> >
> > Attached revised patch
> >
> > > Your point about overhead is a good one, so I retract the gripe about
> > > using a configure switch.  But please include documentation patches to
> > > describe the configure option in the administrator's guide (installation
> > > section).
> >
> > This patch still uses configure switch for enabling feature.
> >
> > For enabling based on tcl version we have 2 posibilites:
> >  1) having feature enabled by default, but in pltcl.c check for tcl
> >     version and disable it for old versions
> >  2) enable or disable at configure time based on tcl version, but there
> >     are problem - current configure don't checks for tcl version at all
> >     and my configure skills not enought for adding this
> >
Content-Description:
[ Attachment, skipping... ]
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
			
		Vsevolod Lobko writes: > Next version of patch. > Now with documentation update and disabling of UTF conversion for Tcl <=8.0 You still have the configure option in there. Too many configure options are no good. If Tcl decided to move to Unicode, we should follow. It's their problem in the end. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
> > Next version of patch. > > Now with documentation update and disabling of UTF conversion for Tcl <=8.0 > > You still have the configure option in there. Too many configure options > are no good. If Tcl decided to move to Unicode, we should follow. It's > their problem in the end. You missed the point Tom has given. > > Your point about overhead is a good one, so I retract the gripe about > > using a configure switch. But please include documentation patches to > > describe the configure option in the administrator's guide (installation > > section). I agree with Tom here. -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Peter wrote:
>> If Tcl decided to move to Unicode, we should follow.
> Your point about overhead is a good one, so I retract the gripe about
> using a configure switch.  But please include documentation patches to
> describe the configure option in the administrator's guide (installation
> section).
> I agree with Tom here.
But Peter's point is a strong one as well.  Does anyone else have an
opinion about this?
            regards, tom lane
			
		> Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > Peter wrote: > >> If Tcl decided to move to Unicode, we should follow. > > > Your point about overhead is a good one, so I retract the gripe about > > using a configure switch. But please include documentation patches to > > describe the configure option in the administrator's guide (installation > > section). > > > I agree with Tom here. > > But Peter's point is a strong one as well. Does anyone else have an > opinion about this? But if the config switch was removed, Tcls prior 8.1 would not be supported at all, no? -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> But if the config switch was removed, Tcls prior 8.1 would not be
> supported at all, no?
No, Peter wasn't proposing removing #if's from the code.  His thought
is that the #if's should be conditional only on Tcl version --- if you
have a Tcl version that does UTF, then you get the translation code.
Of course, there are some issues here about compiling against a
different set of Tcl headers than you later run against, but I think
you could get burnt by that anyway.
            regards, tom lane
			
		> Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > But if the config switch was removed, Tcls prior 8.1 would not be > > supported at all, no? > > No, Peter wasn't proposing removing #if's from the code. His thought > is that the #if's should be conditional only on Tcl version --- if you > have a Tcl version that does UTF, then you get the translation code. Oh, I see. -- Tatsuo Ishii
Would you make the configure changes and resubmit? > Next version of patch. > Now with documentation update and disabling of UTF conversion for Tcl <=8.0 > > On Fri, 24 Aug 2001, Vsevolod Lobko wrote: > > > On Thu, 23 Aug 2001, Tom Lane wrote: > > > > > > Is this looks better? > > > > > > It does, but one small gripe: the lack of semicolons will probably cause > > > pg_indent to mess up the indentation. (I know emacs' autoindent mode > > > will not work nicely with it, either.) Please set up the macros so that > > > you write > > > > > > UTF_BEGIN; > > > Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1); > > > UTF_END; > > > > > > and then I'll be happy. > > > > Attached revised patch > > > > > Your point about overhead is a good one, so I retract the gripe about > > > using a configure switch. But please include documentation patches to > > > describe the configure option in the administrator's guide (installation > > > section). > > > > This patch still uses configure switch for enabling feature. > > > > For enabling based on tcl version we have 2 posibilites: > > 1) having feature enabled by default, but in pltcl.c check for tcl > > version and disable it for old versions > > 2) enable or disable at configure time based on tcl version, but there > > are problem - current configure don't checks for tcl version at all > > and my configure skills not enought for adding this > > Content-Description: [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Mon, 3 Sep 2001, Bruce Momjian wrote: > Would you make the configure changes and resubmit? which one? enable by default ? > > > This patch still uses configure switch for enabling feature. > > > > > > For enabling based on tcl version we have 2 posibilites: > > > 1) having feature enabled by default, but in pltcl.c check for tcl > > > version and disable it for old versions > > > 2) enable or disable at configure time based on tcl version, but there > > > are problem - current configure don't checks for tcl version at all > > > and my configure skills not enought for adding this
Tatsuo Ishii writes: > > Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > > But if the config switch was removed, Tcls prior 8.1 would not be > > > supported at all, no? > > > > No, Peter wasn't proposing removing #if's from the code. His thought > > is that the #if's should be conditional only on Tcl version --- if you > > have a Tcl version that does UTF, then you get the translation code. > > Oh, I see. The important thing here is not "does UTF", but "requires UTF". Without this patch enabled in Tcl >=8.1 you would fly along at 7 bits, which is rather unusable for almost anyone. If there are really serious performance problems I could see a switch to turn it off. But we're already hearing every week about people with character set problems with the JDBC driver, so we shouldn't lay more traps like that while knowing better. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Hi,
sorry for stepping late into this discussion.
I've been on vacation for two weeks.
On Thu, 23 Aug 2001, Vsevolod Lobko wrote:
> > > Patch assumes that database encoding and system encoding of Tcl is
> > > equal.
> >
> > Hmm, is that a tenable assumption?  I don't know, I'm just asking.
>
> Yes, because it does 8-bit to unicode conversion and must to know
> codepage for 8-bit characters. Unfortunately charset names for tcl
> and postgres does not match, so this demands additional field in
> charset tables or additional table :((
I think you can't assume that a database has always the same encoding
as Tcl's system encoding. For pl/tcl you could set the system encoding
to the database's encoding, but then you'd need that additional name
conversion table anyway be it a database table or hardcoded. For PgTcl
it is definitely up to the user which system encoding the interpreter
has.
I for example create my databases in UNICODE (to get PostgreSQL
working with Tcl 8.3 and without patching pl/tcl or PgTcl), but my
Tcl-Interpreter's system encoding is iso-8859-1.
So basically there are two possibilities:
a) Patch pl/tcl and PgTcl to do the code conversion, but do it right
   by using the Database's encoding instead of Tcl's system encoding.
b) Require databases to be in UNICODE if they are to be accessed
   from Tcl >= 8.1 so that the strings that come out of the database
   are already UTF-8.
For b) it would be nice to have a per-database attribute that
specifies the default client encoding that is used for clients that
don't explicitely set an encoding. I think of something like:
$ createdb --encoing UNICODE --default-client-encoding LATIN1 foo
This database could be used from Tcl without any code conversion, but
would look like it was in LATIN1 for other clients (e.g. psql) if they
don't explicitely set an encoding.
I'd vote for b), because I think there is a general movement towards
Unicode anyways.
cu
    Reinhard
			
		> On Mon, 3 Sep 2001, Bruce Momjian wrote: > > > Would you make the configure changes and resubmit? > > which one? enable by default ? I don't know. :-) Something that makes everyone happy. :-) > > > > > This patch still uses configure switch for enabling feature. > > > > > > > > For enabling based on tcl version we have 2 posibilites: > > > > 1) having feature enabled by default, but in pltcl.c check for tcl > > > > version and disable it for old versions > > > > 2) enable or disable at configure time based on tcl version, but there > > > > are problem - current configure don't checks for tcl version at all > > > > and my configure skills not enought for adding this > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Thanks for this patch. It is something we really needed. > Next version of patch. > Now with documentation update and disabling of UTF conversion for Tcl <=8.0 > > On Fri, 24 Aug 2001, Vsevolod Lobko wrote: > > > On Thu, 23 Aug 2001, Tom Lane wrote: > > > > > > Is this looks better? > > > > > > It does, but one small gripe: the lack of semicolons will probably cause > > > pg_indent to mess up the indentation. (I know emacs' autoindent mode > > > will not work nicely with it, either.) Please set up the macros so that > > > you write > > > > > > UTF_BEGIN; > > > Tcl_DStringAppend(&unknown_src, UTF_E2U(part), -1); > > > UTF_END; > > > > > > and then I'll be happy. > > > > Attached revised patch > > > > > Your point about overhead is a good one, so I retract the gripe about > > > using a configure switch. But please include documentation patches to > > > describe the configure option in the administrator's guide (installation > > > section). > > > > This patch still uses configure switch for enabling feature. > > > > For enabling based on tcl version we have 2 posibilites: > > 1) having feature enabled by default, but in pltcl.c check for tcl > > version and disable it for old versions > > 2) enable or disable at configure time based on tcl version, but there > > are problem - current configure don't checks for tcl version at all > > and my configure skills not enought for adding this > > Content-Description: [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Reinhard Max writes: > So basically there are two possibilities: > > a) Patch pl/tcl and PgTcl to do the code conversion, but do it right > by using the Database's encoding instead of Tcl's system encoding. > > b) Require databases to be in UNICODE if they are to be accessed > from Tcl >= 8.1 so that the strings that come out of the database > are already UTF-8. Could you summarize how that affects the proposed patch? (The patch is already in, but I'm still not sure if it's not just an arbitrary workaround.) -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Now i'm looking on using pg_do_encoding_conversion from /src/backend/utils/mb/mbutils.c instead of tcl conversion functions. So we can do a), without breaking b) :)) On Thu, 6 Sep 2001, Peter Eisentraut wrote: > Reinhard Max writes: > > > So basically there are two possibilities: > > > > a) Patch pl/tcl and PgTcl to do the code conversion, but do it right > > by using the Database's encoding instead of Tcl's system encoding. > > > > b) Require databases to be in UNICODE if they are to be accessed > > from Tcl >= 8.1 so that the strings that come out of the database > > are already UTF-8. > > Could you summarize how that affects the proposed patch? (The patch is > already in, but I'm still not sure if it's not just an arbitrary > workaround.) > > -- > Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter >
On Thu, 6 Sep 2001, Peter Eisentraut wrote:
> Reinhard Max writes:
>
> > So basically there are two possibilities:
> >
> > a) Patch pl/tcl and PgTcl to do the code conversion, but do it right
> >    by using the Database's encoding instead of Tcl's system encoding.
> >
> > b) Require databases to be in UNICODE if they are to be accessed
> >    from Tcl >= 8.1 so that the strings that come out of the database
> >    are already UTF-8.
>
> Could you summarize how that affects the proposed patch?  (The
> patch is already in, but I'm still not sure if it's not just an
> arbitrary workaround.)
for a) Tcl_ExternalToUtfDString and Tcl_UtfToExternalDString need to
be called with the first Argument (encoding) being a Tcl_Encoding
struct that is initialized to represent the encoding that is used in
the database.
for b) the proposed patch would become unneeded. Instead of that, a
warning could be issued, if one tried to access a non UTF-8 database
from Tcl.
For a third proposal which is IMHO the cleanest one so far, please
have a look at the mail, I've sent to Bruce and psql-bugs earlier
today. In short: PostgreSQL should be compiled with --enable-multibyte
and --enable-unicode-convertion and the Tcl interfaces (PgTcl and
PL/Tcl) should be changed to set their client encoding to UNICODE if
the Tcl version requires UTF-8 strings.
cu
    Reinhard
			
		> For a third proposal which is IMHO the cleanest one so far, please > have a look at the mail, I've sent to Bruce and psql-bugs earlier > today. In short: PostgreSQL should be compiled with --enable-multibyte > and --enable-unicode-convertion and the Tcl interfaces (PgTcl and > PL/Tcl) should be changed to set their client encoding to UNICODE if > the Tcl version requires UTF-8 strings. third proposal relates to pgtcl clients only, patch deals with pl/tcl. So it's unrelated problems.
On Thu, 6 Sep 2001, Vsevolod Lobko wrote:
> > For a third proposal which is IMHO the cleanest one so far, please
> > have a look at the mail, I've sent to Bruce and psql-bugs earlier
> > today. In short: PostgreSQL should be compiled with --enable-multibyte
> > and --enable-unicode-convertion and the Tcl interfaces (PgTcl and
> > PL/Tcl) should be changed to set their client encoding to UNICODE if
> > the Tcl version requires UTF-8 strings.
>
> third proposal relates to pgtcl clients only, patch deals with pl/tcl.
> So it's unrelated problems.
Not really. There is a similar patch proposed for PgTcl and I think
the problem should be resolved having both libraries in mind to get a
consistent behaviour on whatever side - client or server - Tcl is
used.
BTW, the proposal in your other mail to use pg_do_encoding_conversion
for PL/Tcl seems to be in the sense than my proposal to use UNICODE as
client encoding in PgTcl and it would also need PostgreSQL to be
compiled with --enable-multibyte and --enable-unicode-convertion.
So this seems to be the most clean and consistent solution:
- PostgreSQL requires to be compiled with --enable-multibyte
  and --enable-unicode-convertion if it ought to work correctly
  with Tcl/Tk >= 8.1 (client or server side).
- PL/Tcl needs to be changed to use pg_do_encoding_conversion
  if it runs on a Tcl version >= 8.1 .
- PgTcl needs to be changed to set it's cliend encoding to UNICODE
  if it runs on a Tcl version >= 8.1 .
Can we agree on that?
cu
    Reinhard
			
		On Thu, 6 Sep 2001, Reinhard Max wrote: > On Thu, 6 Sep 2001, Vsevolod Lobko wrote: > > > > For a third proposal which is IMHO the cleanest one so far, please > > > have a look at the mail, I've sent to Bruce and psql-bugs earlier > > > today. In short: PostgreSQL should be compiled with --enable-multibyte > > > and --enable-unicode-convertion and the Tcl interfaces (PgTcl and > > > PL/Tcl) should be changed to set their client encoding to UNICODE if > > > the Tcl version requires UTF-8 strings. > > > > third proposal relates to pgtcl clients only, patch deals with pl/tcl. > > So it's unrelated problems. > > Not really. There is a similar patch proposed for PgTcl and I think > the problem should be resolved having both libraries in mind to get a > consistent behaviour on whatever side - client or server - Tcl is > used. > > BTW, the proposal in your other mail to use pg_do_encoding_conversion > for PL/Tcl seems to be in the sense than my proposal to use UNICODE as > client encoding in PgTcl and it would also need PostgreSQL to be > compiled with --enable-multibyte and --enable-unicode-convertion. > > > So this seems to be the most clean and consistent solution: > > - PostgreSQL requires to be compiled with --enable-multibyte > and --enable-unicode-convertion if it ought to work correctly > with Tcl/Tk >= 8.1 (client or server side). > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion > if it runs on a Tcl version >= 8.1 . > > - PgTcl needs to be changed to set it's cliend encoding to UNICODE > if it runs on a Tcl version >= 8.1 . > > Can we agree on that? yes I'll do pl/tcl part in the next version of patch. Using this approach we can eliminate overhead for databases in UNICODE. Can somebody add tcl version checking to configure.in so we can do that all on configure stage?
On Thu, 6 Sep 2001, Vsevolod Lobko wrote:
> On Thu, 6 Sep 2001, Reinhard Max wrote:
>
> > So this seems to be the most clean and consistent solution:
> >
> > - PostgreSQL requires to be compiled with --enable-multibyte
> >   and --enable-unicode-convertion if it ought to work correctly
> >   with Tcl/Tk >= 8.1 (client or server side).
> >
> > - PL/Tcl needs to be changed to use pg_do_encoding_conversion
> >   if it runs on a Tcl version >= 8.1 .
> >
> > - PgTcl needs to be changed to set it's cliend encoding to UNICODE
> >   if it runs on a Tcl version >= 8.1 .
> >
> > Can we agree on that?
>
> yes
Great!
> I'll do pl/tcl part in the next version of patch. Using this
> approach we can eliminate overhead for databases in UNICODE.
>
> Can somebody add tcl version checking to configure.in so we can do
> that all on configure stage?
Hmm, if we check the Tcl version at runtime instead, we could have
PL/Tcl and PgTcl libs that don't depend on the Tcl version they run
on. So one binary could possibly be used for all Tcl versions from 7.6
up to 8.4 .
The only useful change to configure.in I could think of is to issue a
warning if configure is called without --enable-multibyte and
--enable-unicode-convertion , but the available Tcl is >= 8.1 .
cu
    Reinhard
			
		> > > So this seems to be the most clean and consistent solution: > > > > > > - PostgreSQL requires to be compiled with --enable-multibyte > > > and --enable-unicode-convertion if it ought to work correctly > > > with Tcl/Tk >= 8.1 (client or server side). > > > > > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion > > > if it runs on a Tcl version >= 8.1 . > > > > > > - PgTcl needs to be changed to set it's cliend encoding to UNICODE > > > if it runs on a Tcl version >= 8.1 . > > > > > > Can we agree on that? > > > > yes > > Great! > > > I'll do pl/tcl part in the next version of patch. Using this > > approach we can eliminate overhead for databases in UNICODE. > > > > Can somebody add tcl version checking to configure.in so we can do > > that all on configure stage? > > Hmm, if we check the Tcl version at runtime instead, we could have > PL/Tcl and PgTcl libs that don't depend on the Tcl version they run > on. So one binary could possibly be used for all Tcl versions from 7.6 > up to 8.4 . I'm hate tcl stubs :)) And I have tcl 8.0, 8.2 and 8.3 installed simultaneously almost anywhere and don't want autodetect. Yours approach ok for Pgtcl - because you load pgtcl into running tcl interpreter, but with pl/tcl you load pltcl.so and it as dependency loads tcl library, so stub library needs to select "best" tcl version to load :)) > The only useful change to configure.in I could think of is to issue a > warning if configure is called without --enable-multibyte and > --enable-unicode-convertion , but the available Tcl is >= 8.1 . ok, for now I do it at compile time, based on TCL_MAJOR_VERSION & TCL_MINOR_VERSION
Vsevolod Lobko <seva@sevasoft.kiev.ua> writes:
> Can somebody add tcl version checking to configure.in so we can do
> that all on configure stage?
> ok, for now I do it at compile time, based on TCL_MAJOR_VERSION &
> TCL_MINOR_VERSION
I think a compile-time #if on TCL_MAJOR_VERSION is the right thing to
do.  configure couldn't do any more than that for you, anyway.
Also, a runtime test will not work since if you are on an older version,
the calls to the UTF routines will fail to link.
            regards, tom lane
			
		Vsevolod Lobko writes: > > - PostgreSQL requires to be compiled with --enable-multibyte > > and --enable-unicode-convertion if it ought to work correctly > > with Tcl/Tk >= 8.1 (client or server side). > > > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion > > if it runs on a Tcl version >= 8.1 . > I'll do pl/tcl part in the next version of patch. Using this approach we > can eliminate overhead for databases in UNICODE. Any progress on this? I'd prefer to get rid of this --enable-pltcl-utf option before release. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Sun, 23 Sep 2001, Peter Eisentraut wrote: > Vsevolod Lobko writes: > > > > - PostgreSQL requires to be compiled with --enable-multibyte > > > and --enable-unicode-convertion if it ought to work correctly > > > with Tcl/Tk >= 8.1 (client or server side). > > > > > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion > > > if it runs on a Tcl version >= 8.1 . > > > I'll do pl/tcl part in the next version of patch. Using this approach we > > can eliminate overhead for databases in UNICODE. > > Any progress on this? I'd prefer to get rid of this --enable-pltcl-utf > option before release. Done Next version removes --enable-pltcl-utf switch and enables embedded utf conversion of pgsql if tcl version >=8.1 and --enable-unicode-conversion
Вложения
Your patch has been added to the PostgreSQL unapplied patches list at:
    http://candle.pha.pa.us/cgi-bin/pgpatches
I will try to apply it within the next 48 hours.
> On Sun, 23 Sep 2001, Peter Eisentraut wrote:
>
> > Vsevolod Lobko writes:
> >
> > > > - PostgreSQL requires to be compiled with --enable-multibyte
> > > >   and --enable-unicode-convertion if it ought to work correctly
> > > >   with Tcl/Tk >= 8.1 (client or server side).
> > > >
> > > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion
> > > >   if it runs on a Tcl version >= 8.1 .
> >
> > > I'll do pl/tcl part in the next version of patch. Using this approach we
> > > can eliminate overhead for databases in UNICODE.
> >
> > Any progress on this?  I'd prefer to get rid of this --enable-pltcl-utf
> > option before release.
>
> Done
>
> Next version removes --enable-pltcl-utf switch and enables embedded
> utf conversion of pgsql if tcl version >=8.1 and --enable-unicode-conversion
>
Content-Description:
[ Attachment, skipping... ]
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
			
		Patch applied. Thanks. > On Sun, 23 Sep 2001, Peter Eisentraut wrote: > > > Vsevolod Lobko writes: > > > > > > - PostgreSQL requires to be compiled with --enable-multibyte > > > > and --enable-unicode-convertion if it ought to work correctly > > > > with Tcl/Tk >= 8.1 (client or server side). > > > > > > > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion > > > > if it runs on a Tcl version >= 8.1 . > > > > > I'll do pl/tcl part in the next version of patch. Using this approach we > > > can eliminate overhead for databases in UNICODE. > > > > Any progress on this? I'd prefer to get rid of this --enable-pltcl-utf > > option before release. > > Done > > Next version removes --enable-pltcl-utf switch and enables embedded > utf conversion of pgsql if tcl version >=8.1 and --enable-unicode-conversion > Content-Description: [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Fri, 28 Sep 2001, Bruce Momjian wrote: > Patch applied. Thanks. Patch seems to be lost in space.. So there are new version, now with documentation update > > On Sun, 23 Sep 2001, Peter Eisentraut wrote: > > > > > Vsevolod Lobko writes: > > > > > > > > - PostgreSQL requires to be compiled with --enable-multibyte > > > > > and --enable-unicode-convertion if it ought to work correctly > > > > > with Tcl/Tk >= 8.1 (client or server side). > > > > > > > > > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion > > > > > if it runs on a Tcl version >= 8.1 . > > > > > > > I'll do pl/tcl part in the next version of patch. Using this approach we > > > > can eliminate overhead for databases in UNICODE. > > > > > > Any progress on this? I'd prefer to get rid of this --enable-pltcl-utf > > > option before release. > > > > Done > > > > Next version removes --enable-pltcl-utf switch and enables embedded > > utf conversion of pgsql if tcl version >=8.1 and --enable-unicode-conversion > > > > Content-Description: > > [ Attachment, skipping... ] > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
Вложения
You are right. The patch applied with no conflicts, meaning it wasn't in there. I have no idea how it got lost. I ran autoconf too. Thanks. > On Fri, 28 Sep 2001, Bruce Momjian wrote: > > > Patch applied. Thanks. > > Patch seems to be lost in space.. > So there are new version, now with documentation update > > > > On Sun, 23 Sep 2001, Peter Eisentraut wrote: > > > > > > > Vsevolod Lobko writes: > > > > > > > > > > - PostgreSQL requires to be compiled with --enable-multibyte > > > > > > and --enable-unicode-convertion if it ought to work correctly > > > > > > with Tcl/Tk >= 8.1 (client or server side). > > > > > > > > > > > > - PL/Tcl needs to be changed to use pg_do_encoding_conversion > > > > > > if it runs on a Tcl version >= 8.1 . > > > > > > > > > I'll do pl/tcl part in the next version of patch. Using this approach we > > > > > can eliminate overhead for databases in UNICODE. > > > > > > > > Any progress on this? I'd prefer to get rid of this --enable-pltcl-utf > > > > option before release. > > > > > > Done > > > > > > Next version removes --enable-pltcl-utf switch and enables embedded > > > utf conversion of pgsql if tcl version >=8.1 and --enable-unicode-conversion > > > > > > > Content-Description: > > > > [ Attachment, skipping... ] > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > -- > > Bruce Momjian | http://candle.pha.pa.us > > pgman@candle.pha.pa.us | (610) 853-3000 > > + If your life is a hard drive, | 830 Blythe Avenue > > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > > Content-Description: [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026