ICU for global collation

Поиск
Список
Период
Сортировка
От Peter Eisentraut
Тема ICU for global collation
Дата
Msg-id 5e756dd6-0e91-d778-96fd-b1bcb06c161a@2ndquadrant.com
обсуждение исходный текст
Ответы Re: ICU for global collation  (Andrey Borodin <x4mmm@yandex-team.ru>)
Re: ICU for global collation  ("Daniel Verite" <daniel@manitou-mail.org>)
Список pgsql-hackers
Here is an initial patch to add the option to use ICU as the global
collation provider, a long-requested feature.

To activate, use something like

    initdb --collation-provider=icu --locale=...

A trick here is that since we need to also still set the normal POSIX
locales, the --locale value needs to be valid as both a POSIX locale and
a ICU locale.  If that doesn't work out, there is also a way to specify
it separately, e.g.,

    initdb --collation-provider=icu --locale=en_US.utf8 --icu-locale=en

This complexity is unfortunate, but I don't see a way around it right now.

There are also options for createdb and CREATE DATABASE to do this for a
particular database only.

Besides this, the implementation is quite small: When starting up a
database, we create an ICU collator object, store it in a global
variable, and then use it when appropriate.  All the ICU code for
creating and invoking those collators already exists of course.

For the version tracking, I use the pg_collation row for the "default"
collation.  Again, this mostly reuses existing code and concepts.

Nondeterministic collations are not supported for the global collation,
because then LIKE and regular expressions don't work and that breaks
some system views.  This needs some separate research.

To test, run the existing regression tests against a database
initialized with ICU.  Perhaps some options for pg_regress could
facilitate that.

I fear that the Localization chapter in the documentation will need a
bit of a rewrite after this, because the hitherto separately treated
concepts of locale and collation are fusing together.  I haven't done
that here yet, but that would be the plan for later.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: configure still looking for crypt()?
Следующее
От: Robert Haas
Дата:
Сообщение: Re: POC: Cleaning up orphaned files using undo logs