Discussion: Patch to make Turks happy.
Hi,
Yet another problem with Turkish encoding. clean_encoding_name()
in src/backend/utils/mb/encnames.c uses tolower() to convert locale
names to lower-case. This causes errors if the locale name contains a
capital "I" and the current locale is Turkish. Some examples:
aaa=# \l
        List of databases
   Name    | Owner | Encoding
-----------+-------+----------
 aaa       | pgsql | LATIN5
 bbb       | pgsql | LATIN5
 template0 | pgsql | LATIN5
 template1 | pgsql | LATIN5
(4 rows)
aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
ERROR: LATIN5 is not a valid encoding name
aaa=# \encoding
SQL_ASCII
aaa=# \encoding SQL_ASCII
SQL_ASCII: invalid encoding name or conversion procedure not found
aaa=# \encoding LATIN5
LATIN5: invalid encoding name or conversion procedure not found
The patch is a simple change to use ASCII-only lower-case conversion
instead of the locale-dependent tolower().
Best regards,
Nic.
*** ./src/backend/utils/mb/encnames.c.orig Mon Dec 2 15:58:49 2002
--- ./src/backend/utils/mb/encnames.c Mon Dec 2 18:13:23 2002
***************
*** 407,413 ****
for (p = key, np = newkey; *p != '\0'; p++)
{
if (isalnum((unsigned char) *p))
! *np++ = tolower((unsigned char) *p);
}
*np = '\0';
return newkey;
--- 407,416 ----
for (p = key, np = newkey; *p != '\0'; p++)
{
if (isalnum((unsigned char) *p))
! if (*p >= 'A' && *p <= 'Z')
! *np++ = *p + 'a' - 'A';
! else
! *np++ = *p;
}
*np = '\0';
return newkey;
I am not going to apply this patch because I think it will mess up the
handling of other locales.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian wrote:
> I am not going to apply this patch because I think it will mess up the
> handling of other locales.
As far as I figured from the source code, this function only deals with
cleaning up locale names and nothing else. Since all the locale names are
in plain ASCII, I think it will be safe to use ASCII-only lower-case
conversion.
By the way, I noticed only after sending the patch that the compiler
complains about an ambiguous `else', so it can be rewritten as:
if (*p >= 'A' && *p <= 'Z'){
*np++ = *p + 'a' - 'A';
}else{
*np++ = *p;
}
Regards,
Nicolai
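For readers following the thread, the idea can be sketched as a standalone function. This is only a minimal illustration, not the code in encnames.c: the names ascii_tolower() and clean_name_sketch() are hypothetical, and the caller is assumed to supply an output buffer of at least strlen(key) + 1 bytes. The point is that the letter folding no longer calls tolower(), which in a Turkish locale can map 'I' to dotless "ı" rather than 'i'. As in the patch, isalnum() is still used for filtering and remains locale-dependent.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: lower-case ASCII letters only, ignoring setlocale(). */
char
ascii_tolower(char c)
{
	if (c >= 'A' && c <= 'Z')
		return (char) (c + 'a' - 'A');
	return c;
}

/*
 * Hypothetical stand-in for the patched loop: copy only alphanumeric
 * characters from key into newkey, folding ASCII upper case to lower case.
 * Assumption: newkey has room for strlen(key) + 1 bytes.
 */
char *
clean_name_sketch(const char *key, char *newkey)
{
	const char *p;
	char	   *np;

	assert(key != NULL && newkey != NULL);

	for (p = key, np = newkey; *p != '\0'; p++)
	{
		/* Explicit braces avoid the ambiguous `else' warning. */
		if (isalnum((unsigned char) *p))
		{
			*np++ = ascii_tolower(*p);
		}
	}
	*np = '\0';
	return newkey;
}
```

With a buffer such as char buf[32], clean_name_sketch("LATIN5", buf) gives "latin5" and clean_name_sketch("SQL_ASCII", buf) gives "sqlascii" in the default "C" locale, since the underscore is not alphanumeric and is dropped.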
Bruce Momjian writes:
> I am not going to apply this patch because I think it will mess up the
> handling of other locales.
This patch looks OK to me. Normally, character set names should use
identifier case-folding rules anyway, so this seems to be a step in the
right direction. Much better than saying that users of certain locales
can't properly use PostgreSQL.
--
Peter Eisentraut peter_e@gmx.net
OK, Peter, that helps. Thanks. I will apply it.
OK, patch applied. Peter, should this appear in 7.3.1 too?
Index: src/backend/utils/mb/encnames.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/mb/encnames.c,v
retrieving revision 1.10
diff -c -c -r1.10 encnames.c
*** src/backend/utils/mb/encnames.c 4 Sep 2002 20:31:31 -0000 1.10
--- src/backend/utils/mb/encnames.c 5 Dec 2002 23:19:40 -0000
***************
*** 407,413 ****
for (p = key, np = newkey; *p != '\0'; p++)
{
if (isalnum((unsigned char) *p))
! *np++ = tolower((unsigned char) *p);
}
*np = '\0';
return newkey;
--- 407,418 ----
for (p = key, np = newkey; *p != '\0'; p++)
{
if (isalnum((unsigned char) *p))
! {
! if (*p >= 'A' && *p <= 'Z')
! *np++ = *p + 'a' - 'A';
! else
! *np++ = *p;
! }
}
*np = '\0';
return newkey;
Peter, is that patch OK for 7.3.1? I am not sure.
Bruce Momjian writes:
> Peter, is that patch OK for 7.3.1? I am not sure.
Definitely. It's a bug fix.
--
Peter Eisentraut peter_e@gmx.net
Thanks. Applied for 7.3.1.