Re: Support LIKE with nondeterministic collations

From	Peter Eisentraut
Subject	Re: Support LIKE with nondeterministic collations
Date	3 May 2024, 18:53:52
Msg-id	b32cefe2-b9e2-499e-b919-fe8f21c5bc22@eisentraut.org
In reply to	Re: Support LIKE with nondeterministic collations ("Daniel Verite" <daniel@manitou-mail.org>)
List	pgsql-hackers

On 03.05.24 16:58, Daniel Verite wrote:
>     * Generating bounds for a sort key (prefix matching)
> 
>     Having sort keys for strings allows for easy creation of bounds -
>     sort keys that are guaranteed to be smaller or larger than any sort
>     key from a given range. For example, if bounds are produced for a
>     sortkey of string “smith”, strings between upper and lower bounds
>     with one level would include “Smith”, “SMITH”, “sMiTh”. Two kinds
>     of upper bounds can be generated - the first one will match only
>     strings of equal length, while the second one will match all the
>     strings with the same initial prefix.
> 
>     CLDR 1.9/ICU 4.6 and later map U+FFFF to a collation element with
>     the maximum primary weight, so that for example the string
>     “smith\uFFFF” can be used as the upper bound rather than modifying
>     the sort key for “smith”.
> 
> In other words it says that
> 
>    col LIKE 'smith%' collate "nd"
> 
> is equivalent to:
> 
>    col >= 'smith' collate "nd" AND col < U&'smith\ffff' collate "nd"
> 
> which could be obtained from an index scan, assuming a btree
> index on "col" collate "nd".
> 
> U+FFFF is a valid code point but a "non-character" [1] so it's
> not supposed to be present in normal strings.

Thanks, this could be very useful!
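
To make the quoted rewrite concrete, here is a minimal Python sketch of how the two bounds could be derived from a pattern. The function name like_prefix_bounds is hypothetical and is not part of PostgreSQL or ICU. The sketch only builds the bound strings; it assumes the comparisons themselves are done by the database under the nondeterministic collation, and that the collation maps U+FFFF to the maximum primary weight as the quoted ICU text describes. It handles only the simplest case, a literal prefix followed by a single trailing '%'.

```python
from typing import Optional, Tuple

# U+FFFF: per the quoted ICU text, CLDR 1.9/ICU 4.6 and later map it to a
# collation element with the maximum primary weight.
MAX_WEIGHT = "\uffff"


def like_prefix_bounds(pattern: str) -> Optional[Tuple[str, str]]:
    """Return (lower, upper) for a pattern of the form 'prefix%'.

    Hypothetical helper for illustration only. Returns None when the
    pattern is not a plain literal prefix followed by one trailing '%'.
    """
    if not pattern.endswith("%"):
        return None
    prefix = pattern[:-1]
    # Reject empty prefixes, further wildcards, and escape characters:
    # this sketch does not try to interpret them.
    if not prefix or any(ch in prefix for ch in "%_\\"):
        return None
    # The upper bound is only meaningful if the noncharacter does not
    # already occur in the prefix.
    if MAX_WEIGHT in prefix:
        return None
    return prefix, prefix + MAX_WEIGHT


lower, upper = like_prefix_bounds("smith%")
```

With these values, the range condition from the quoted message would read col >= lower AND col < upper, with both comparisons carried out under collation "nd".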
