Re: Character expansion with ICU collations

Поиск
Список
Период
Сортировка
От Finnerty, Jim
Тема Re: Character expansion with ICU collations
Дата
Msg-id 1C0E0C6B-2CC1-4711-9FBE-7EC6B1F6A4F5@amazon.com
обсуждение исходный текст
Ответ на Character expansion with ICU collations  ("Finnerty, Jim" <jfinnert@amazon.com>)
Ответы Re: Character expansion with ICU collations  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Список pgsql-hackers
>>
    You can have these queries return both rows if you use an
    accent-ignoring collation, like this example in the documentation:

    CREATE COLLATION ignore_accents (provider = icu, locale =
    'und-u-ks-level1-kc-true', deterministic = false);
<<

Indeed.  Is the dependency between the character expansion capability and accent-insensitive collations documented
anywhere?

Another unexpected dependency appears to be @colCaseFirst=upper.  If specified in combination with
colStrength=secondary,it appears that the upper/lower case ordering is random within a group of characters that are
secondaryequal, e.g. 'A' < 'a', but 'b' < 'B', 'c' < 'C', ...  , but then 'L' < 'l'.  It is not even consistently
orderedwith respect to case.  If I make it a nondeterministic CS_AI collation, then it sorts upper before lower
consistently. The rule seems to be that you can't sort by case within a group that is case-insensitive. 
 

Can a CI collation be ordered upper case first, or is this a limitation of ICU?

For example, this is part of the sort order that I'd like to achieve with ICU, with the code point in column 1 and
dense_rank()shown in the rightmost column indicating that 'b' = 'B', for example:
 

66    B    B    138    151
98    b    b    138    151   <- so within a group that is CI_AS equal, the sort order needs to be upper case first
67    C    C    139    152
99    c    c    139    152
199    Ç    Ç    140    153
231    ç    ç    140    153
68    D    D    141    154
100    d    d    141    154
208    Ð    Ð    142    199
240    ð    ð    142    199
69    E    E    143    155
101    e    e    143    155

Can this sort order be achieved with ICU?

More generally, is there any interest in leveraging the full power of ICU tailoring rules to get whatever order someone
mayneed, subject to the limitations of ICU itself?  what would be required to extend CREATE COLLATION to accept an
optionalsequence of tailoring rules that we would store in the pg_collation catalog and apply along with the modifiers
inthe locale string?
 

    /Jim




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Question about StartLogicalReplication() error path
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: doc issue missing type name "multirange" in chapter title