Обсуждение: varchar sort ordering ignore blanks
Hi there, I have a table with a single column, pk of varchar type The table contains few names, say: XXXX A XXXX C XXXXB In the first two records there is a between the XXXX and the following letter A and C while, the third one has a B immediately following the XXXX (without blanks). In postgres 7.4.7 (debian sarge), if I issue a select to sort the record I (correctly) obtain: XXXX A XXXX C XXXXB In postgres 8.1.9 (debian etch), if I issue a select to sort the record I (mistakenly) obtain: XXXX A XXXXB XXXX C That is: the sort order in postgres 8.1.9 seems to ignore the blank. In all cases I'm using locale LATIN9 during DB creation, but I tested also with ASCII, UTF8 and LATIN1 encoding. Can someone help me to get the correct order in postgres 8.1.9 ? === Sample code === CREATE TABLE t_table ( c_column varchar(30) NOT NULL, CONSTRAINT t_table_pk PRIMARY KEY (c_column) ) WITHOUT OIDS; INSERT INTO t_table(c_column) VALUES ('XXXX A'); INSERT INTO t_table(c_column) VALUES ('XXXXB'); INSERT INTO t_table(c_column) VALUES ('XXXX C'); select * from t_table order by c_column asc; ============= Thanks, Luca Arzeni
Luca Arzeni <l.arzeni@amadego.com> writes: > That is: the sort order in postgres 8.1.9 seems to ignore the blank. This is expected behavior in most non-C locales. > In all cases I'm using locale LATIN9 during DB creation, but I tested also > with ASCII, UTF8 and LATIN1 encoding. LATIN9 isn't a locale, it's an encoding. Try "initdb --locale=C". regards, tom lane
On Jan 15, 2008 6:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Luca Arzeni <l.arzeni@amadego.com> writes:This is expected behavior in most non-C locales.
> That is: the sort order in postgres 8.1.9 seems to ignore the blank.LATIN9 isn't a locale, it's an encoding. Try "initdb --locale=C".
> In all cases I'm using locale LATIN9 during DB creation, but I tested also
> with ASCII, UTF8 and LATIN1 encoding.
regards, tom lane
--------------------
I guess this has nothing to do with the encoding, but with the collation
rules used, which is governed by "lc_collate" parameter. See what you
get on both DBs for:
SHOW lc_collate ;
HTH,
Csaba.
-------------------
I guess this has nothing to do with the encoding, but with the collation
rules used, which is governed by "lc_collate" parameter. See what you
get on both DBs for:
SHOW lc_collate ;
HTH,
Csaba.
-------------------
Thanks Tom, and Csaba
both of you hit the problem: actually Postgres 7.4.7 has a C locale and Postgres 8.1 has US.UTF8 locale. Setting locale to locale=C or locale=POSIX for release 8.1 solved this issue, but it opens another one: if I use locale=C, I get
XXXX A
XXXX C
XXXXB
as sort order, but this setting gives me an error when it cames to:
XXXX d
XXXX e
XXXX f
XXXX è
because the right sort ordering should be:
XXXX d
XXXX e
XXXX è
XXXX f
So the problem is:
- C or POSIX locale is OK with blanks but fails on locale specific vowels
- LATIN9 locale is OK with vowels but ignores blanks
Is there any way to consider blanks meaningfull AND sort properly locale specific vowels ?
I don't know what SQL standard says about this issue, but I'm sure that in Italy you sort names considering vowels AND blanks!
Thanks, Luca