Обсуждение: Collation in ORDER BY not lexicographical
Hi! We have big problems with collation in ORDER BY, which happens in binary order, not alphabetic (lexicographical), like:. A B Z a z Ä Ö ä ö This is what we have done: 1. initdb -U myuser -E UTF-8 --locale=de_DE.UTF-8 -D /mydata 2. pg_ctl -U myuser -D /mydata -l logfile start 3. createdb -U myuser mydb 4. create table mytable(name text) 5. insert into mytable values('Adam'); .... 6. select * from mytable order by name The result is: "Adam" "Berta" "Mann" "Zoo" "Zoodirektor" "adam" "Äpfel" "Öl" "äpfel" "locale -a" shows: de_DE de_DE.ISO8859-1 de_DE.ISO8859-15 de_DE.UTF-8 "psql -l" shows: List of databases Name | Owner | Encoding | Collation | Ctype | Access privileges ----------+-----------+----------+-------------+------------- +----------------------------------- postgres | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | template0 | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | =c/myuser : myuser =CTc/myuser template1 | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | =c/myuser : myuser =CTc/myuser mydb | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | PG is running on Mac OS X 10.5 and 10.6 Intel. Any help is appreciated. Thanks very much in advance. Paul
On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist@revolversoft.com> wrote: > Hi! > > We have big problems with collation in ORDER BY, which happens in binary > order, not alphabetic (lexicographical), like:. > > A > B > Z > a > z > Ä > Ö > ä > ö > > PG is running on Mac OS X 10.5 and 10.6 Intel. I seem to recall there were some problem with Mac locales at some point being broken. Could be you're running into that issue.
am 29.09.2009 11:21 Uhr schrieb Scott Marlowe unter scott.marlowe@gmail.com: > On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist@revolversoft.com> wrote: >> Hi! >> >> We have big problems with collation in ORDER BY, which happens in binary >> order, not alphabetic (lexicographical), like:. >> >> A >> B >> Z >> a >> z >> Ä >> Ö >> ä >> ö >> > >> PG is running on Mac OS X 10.5 and 10.6 Intel. > > I seem to recall there were some problem with Mac locales at some > point being broken. Could be you're running into that issue. Yep, i ran into this as well. Here is my workaround: Create a function like this: CREATE OR REPLACE FUNCTION f_getorderbyfriendlyversion(texttoconvert text) RETURNS text AS $BODY$ select replace(replace(replace(replace(replace(replace($1,'Ä','A'),'Ö','O'),'Ü','U' ),'ä','a'),'ö','o'),'ü','u'); $BODY$ LANGUAGE 'sql' IMMUTABLE STRICT COST 100; ALTER FUNCTION f_getorderbyfriendlyversion(text) OWNER TO postgres; Then create an index like this: create index idx_personen_nachname_orderByFriendly on personen (f_getorderbyfriendlyversion(nachname)) Now you can do: select * from personen order by f_getorderbyfriendlyversion(p.nachname) Seems pretty fast. Best, Maximilian Tyrtania
On Tue, 2009-09-29 at 03:21 -0600, Scott Marlowe wrote: > On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist@revolversoft.com> wrote: > > Hi! > > > > We have big problems with collation in ORDER BY, which happens in binary > > order, not alphabetic (lexicographical), like:. > > > > A > > B > > Z > > a > > z > > Ä > > Ö > > ä > > ö > > > > > PG is running on Mac OS X 10.5 and 10.6 Intel. > > I seem to recall there were some problem with Mac locales at some > point being broken. Could be you're running into that issue. Yes, the UTF8 locales on BSD systems (Mac OS X, FreeBSD, etc.) are dysfunctional. Either switch to a non-UTF8 locale or a different operating system.
Thank you all very much for your help. Maximilian, we simplified your replacing code: > replace(replace(replace(replace(replace(replace > ($1,'Ä','A'),'Ö','O'),'Ü','U' > ),'ä','a'),'ö','o'),'ü','u'); to this: translate(upper($1),'ÄÖÜ','AOU') Paul Am 29.09.2009 um 14:36 schrieb Maximilian Tyrtania: > am 29.09.2009 11:21 Uhr schrieb Scott Marlowe unter scott.marlowe@gmail.com > : > >> On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist@revolversoft.com >> > wrote: >>> Hi! >>> >>> We have big problems with collation in ORDER BY, which happens in >>> binary >>> order, not alphabetic (lexicographical), like:. >> >>> PG is running on Mac OS X 10.5 and 10.6 Intel. >> >> I seem to recall there were some problem with Mac locales at some >> point being broken. Could be you're running into that issue. > > Yep, i ran into this as well. Here is my workaround: Create a > function like > this: > > CREATE OR REPLACE FUNCTION f_getorderbyfriendlyversion(texttoconvert > text) > > RETURNS text AS > $BODY$ > select > replace(replace(replace(replace(replace(replace > ($1,'Ä','A'),'Ö','O'),'Ü','U' > ),'ä','a'),'ö','o'),'ü','u'); > > $BODY$ > > LANGUAGE 'sql' IMMUTABLE STRICT > COST 100; > > ALTER FUNCTION f_getorderbyfriendlyversion(text) OWNER TO postgres; > > Then create an index like this: > > create index idx_personen_nachname_orderByFriendly on personen > (f_getorderbyfriendlyversion(nachname)) > > > Now you can do: > > select * from personen order by f_getorderbyfriendlyversion > (p.nachname) > > Seems pretty fast. > > Best, > > Maximilian Tyrtania