Обсуждение: varchar sort ordering ignore blanks
Hi there,
I have a table with a single column, pk of varchar type
The table contains few names, say:
XXXX A
XXXX C
XXXXB
In the first two records there is a between the XXXX and the following letter
A and C while, the third one has a B immediately following the XXXX (without
blanks).
In postgres 7.4.7 (debian sarge), if I issue a select to sort the record I
(correctly) obtain:
XXXX A
XXXX C
XXXXB
In postgres 8.1.9 (debian etch), if I issue a select to sort the record I
(mistakenly) obtain:
XXXX A
XXXXB
XXXX C
That is: the sort order in postgres 8.1.9 seems to ignore the blank.
In all cases I'm using locale LATIN9 during DB creation, but I tested also
with ASCII, UTF8 and LATIN1 encoding.
Can someone help me to get the correct order in postgres 8.1.9 ?
=== Sample code ===
CREATE TABLE t_table
(
c_column varchar(30) NOT NULL,
CONSTRAINT t_table_pk PRIMARY KEY (c_column)
)
WITHOUT OIDS;
INSERT INTO t_table(c_column) VALUES ('XXXX A');
INSERT INTO t_table(c_column) VALUES ('XXXXB');
INSERT INTO t_table(c_column) VALUES ('XXXX C');
select * from t_table order by c_column asc;
=============
Thanks, Luca Arzeni
Luca Arzeni <l.arzeni@amadego.com> writes:
> That is: the sort order in postgres 8.1.9 seems to ignore the blank.
This is expected behavior in most non-C locales.
> In all cases I'm using locale LATIN9 during DB creation, but I tested also
> with ASCII, UTF8 and LATIN1 encoding.
LATIN9 isn't a locale, it's an encoding. Try "initdb --locale=C".
regards, tom lane
On Jan 15, 2008 6:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Luca Arzeni <l.arzeni@amadego.com> writes:This is expected behavior in most non-C locales.
> That is: the sort order in postgres 8.1.9 seems to ignore the blank.LATIN9 isn't a locale, it's an encoding. Try "initdb --locale=C".
> In all cases I'm using locale LATIN9 during DB creation, but I tested also
> with ASCII, UTF8 and LATIN1 encoding.
regards, tom lane
--------------------
I guess this has nothing to do with the encoding, but with the collation
rules used, which is governed by "lc_collate" parameter. See what you
get on both DBs for:
SHOW lc_collate ;
HTH,
Csaba.
-------------------
I guess this has nothing to do with the encoding, but with the collation
rules used, which is governed by "lc_collate" parameter. See what you
get on both DBs for:
SHOW lc_collate ;
HTH,
Csaba.
-------------------
Thanks Tom, and Csaba
both of you hit the problem: actually Postgres 7.4.7 has a C locale and Postgres 8.1 has US.UTF8 locale. Setting locale to locale=C or locale=POSIX for release 8.1 solved this issue, but it opens another one: if I use locale=C, I get
XXXX A
XXXX C
XXXXB
as sort order, but this setting gives me an error when it cames to:
XXXX d
XXXX e
XXXX f
XXXX è
because the right sort ordering should be:
XXXX d
XXXX e
XXXX è
XXXX f
So the problem is:
- C or POSIX locale is OK with blanks but fails on locale specific vowels
- LATIN9 locale is OK with vowels but ignores blanks
Is there any way to consider blanks meaningfull AND sort properly locale specific vowels ?
I don't know what SQL standard says about this issue, but I'm sure that in Italy you sort names considering vowels AND blanks!
Thanks, Luca