Re: a strange order by behavior

From: Peter Eisentraut
Subject: Re: a strange order by behavior
Msg-id: 1308773415.10498.6.camel@vanquo.pezone.net
In reply to: Re: a strange order by behavior  (Samuel Gendler <sgendler@ideasculptor.com>)
List: pgsql-sql
On Wed, 2011-06-22 at 01:43 -0700, Samuel Gendler wrote:
> I seem to recall a thread here about it ignoring spaces entirely in that
> collation (and maybe ignoring capitalization, too?).

The way it works is that every collating element (letter or other
character or character group that you sort as a unit) is assigned four
weights (primary, secondary, tertiary, and quaternary), and the sorting
then first compares the primary weights, then the secondary weights,
etc.  The primary weight typically indicates the overall sort order,
like A before B, the secondary weight has to do with diacritic marks,
the tertiary with letter case, and the fourth level is only used in
special cases.  So that's why it looks as though the capitalization is
"ignored" unless both the primary and secondary weights are the same.
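
To make the level-by-level comparison concrete, here is a minimal sketch in Python. The weight table is a toy I made up for illustration (the numbers are not taken from ISO 14651 or the Unicode collation tables), it covers only three levels, and it treats every character as its own collating element. The point is only that the key compares all primary weights first, then all secondary weights, then all tertiary weights.

```python
# Hypothetical weights: character -> (primary, secondary, tertiary).
# primary   = base letter
# secondary = diacritic
# tertiary  = letter case
WEIGHTS = {
    "a": (1, 0, 0),
    "A": (1, 0, 1),
    "á": (1, 1, 0),
    "b": (2, 0, 0),
    "B": (2, 0, 1),
}


def sort_key(s):
    """Build a multi-level key: all primaries, then secondaries, then tertiaries."""
    weights = [WEIGHTS[ch] for ch in s]
    primary = tuple(w[0] for w in weights)
    secondary = tuple(w[1] for w in weights)
    tertiary = tuple(w[2] for w in weights)
    # Tuples compare element by element, so a later level only matters
    # when every earlier level is equal.
    return (primary, secondary, tertiary)


words = ["b", "Ab", "aa", "B", "Aa", "áa"]

# Code point order puts all capitals first.
print(sorted(words))

# Multi-level order: "aa" sorts before "Ab" because the primary weights
# already differ at the second letter; case only breaks the tie between
# "aa" and "Aa", and between "b" and "B".
print(sorted(words, key=sort_key))
```

With this table the second call gives aa, Aa, áa, Ab, b, B, which is the "capitalization looks ignored" effect: case decides the order only between strings whose primary and secondary weights are identical.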

> This worked:
> 
> createdb  -E UTF-8 --lc-collate=C some_db
> 
> A quick google search
> reveals that there is some kind of standard for unicode collation (
> http://www.unicode.org/reports/tr10/ ) and I have no idea if that is what is
> represented by the en_US.UTF-8 collation or not.

At least the collate category of the en_US.UTF-8 locale on glibc is
unaltered from the ISO 14651 default ordering, which is equivalent to
the Unicode default ordering.  There are several other locales for which
that is also the case.  Unfortunately, this is not exposed outside of
the glibc source code.  So you can't just select "give me a neutral
default ordering".



