* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I was a little worried that it was too much to hope for that all libc
> > vendors on earth would ship a strxfrm() implementation that was actually
> > consistent with strcoll(), and here we are.
>
> Indeed. To try to put some scope on the problem, I made an idiot little
> program that just generates some random UTF8 strings and sees whether
> strcoll and strxfrm sort them alike. Attached are that program, an even
> more idiot little shell script that runs it over all available UTF8
> locales, and the results on my RHEL6 box. While de_DE seems to be the
> worst-broken locale, it's far from the only one.
>
> Please try this on as many platforms as you can get hold of ...
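
For anyone reading along without the attachment, the kind of check described above can be sketched roughly as follows. This is not Tom's program, only a minimal Python illustration built on the standard locale module, which wraps the C library's collation routines. The helper names (random_strings, count_inconsistencies, check_locale) and the alphabet are my own assumptions, and it counts pairwise disagreements rather than reporting positions in the two sort orders as the output below does.

```python
import locale
import random


def sign(x):
    """Reduce a comparison result to -1, 0 or 1."""
    return (x > 0) - (x < 0)


def random_strings(n, length=8, alphabet="abcäöüßABCÄÖÜ -", seed=0):
    """Generate n pseudo-random strings (hypothetical alphabet, fixed seed)."""
    rng = random.Random(seed)
    return ["".join(rng.choice(alphabet) for _ in range(length))
            for _ in range(n)]


def count_inconsistencies(strings):
    """Count pairs that strcoll and strxfrm order differently.

    Uses whatever LC_COLLATE is currently in effect.
    """
    keys = [locale.strxfrm(s) for s in strings]
    bad = 0
    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            by_coll = sign(locale.strcoll(strings[i], strings[j]))
            by_xfrm = (keys[i] > keys[j]) - (keys[i] < keys[j])
            if by_coll != by_xfrm:
                bad += 1
    return bad


def check_locale(name, strings):
    """Switch LC_COLLATE to name and run the check; None if unavailable."""
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    return count_inconsistencies(strings)


if __name__ == "__main__":
    sample = random_strings(200)
    for loc in ("C.UTF-8", "de_DE.utf8", "en_US.utf8"):
        result = check_locale(loc, sample)
        if result is None:
            print(loc, "not installed")
        else:
            print(loc, "good" if result == 0 else "BAD (%d pairs)" % result)
```

A nonzero count for a locale would mean the two functions disagree on at least one pair there; what it actually prints depends on the libc and the installed locales.
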
Results for Ubuntu 14.04:
sfrost@dwemer:/home/sfrost> sh tryalllocales.sh
Using LC_COLLATE = "C.UTF-8"
Using LC_CTYPE = "en_US.UTF-8"
C.UTF-8 good
Using LC_COLLATE = "de_DE.utf8"
Using LC_CTYPE = "en_US.UTF-8"
inconsistency between strcoll (36) and strxfrm (35) orders
inconsistency between strcoll (35) and strxfrm (36) orders
inconsistency between strcoll (160) and strxfrm (159) orders
inconsistency between strcoll (159) and strxfrm (160) orders
inconsistency between strcoll (347) and strxfrm (346) orders
inconsistency between strcoll (348) and strxfrm (347) orders
inconsistency between strcoll (346) and strxfrm (348) orders
inconsistency between strcoll (355) and strxfrm (353) orders
inconsistency between strcoll (353) and strxfrm (354) orders
inconsistency between strcoll (354) and strxfrm (355) orders
inconsistency between strcoll (440) and strxfrm (439) orders
inconsistency between strcoll (441) and strxfrm (440) orders
inconsistency between strcoll (439) and strxfrm (441) orders
inconsistency between strcoll (450) and strxfrm (449) orders
inconsistency between strcoll (449) and strxfrm (450) orders
inconsistency between strcoll (454) and strxfrm (452) orders
inconsistency between strcoll (455) and strxfrm (453) orders
inconsistency between strcoll (452) and strxfrm (454) orders
inconsistency between strcoll (453) and strxfrm (455) orders
inconsistency between strcoll (521) and strxfrm (520) orders
inconsistency between strcoll (520) and strxfrm (521) orders
inconsistency between strcoll (529) and strxfrm (528) orders
inconsistency between strcoll (528) and strxfrm (529) orders
inconsistency between strcoll (682) and strxfrm (681) orders
inconsistency between strcoll (681) and strxfrm (682) orders
inconsistency between strcoll (743) and strxfrm (742) orders
inconsistency between strcoll (742) and strxfrm (743) orders
inconsistency between strcoll (830) and strxfrm (829) orders
inconsistency between strcoll (829) and strxfrm (830) orders
inconsistency between strcoll (870) and strxfrm (869) orders
inconsistency between strcoll (869) and strxfrm (870) orders
inconsistency between strcoll (933) and strxfrm (931) orders
inconsistency between strcoll (931) and strxfrm (932) orders
inconsistency between strcoll (932) and strxfrm (933) orders
de_DE.utf8 BAD
Using LC_COLLATE = "en_US.utf8"
Using LC_CTYPE = "en_US.UTF-8"
en_US.utf8 good
Thanks!
Stephen