Re: ICU for global collation

From Daniel Verite
Subject Re: ICU for global collation
Date
Msg-id 5d807706-60a2-4e56-bc59-eef9e7deb138@manitou-mail.org
In response to ICU for global collation  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: ICU for global collation  (Marius Timmer <marius.timmer@uni-muenster.de>)
Re: ICU for global collation  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
 Hi,

When trying databases defined with ICU locales, I see that backends
that serve such databases seem to have their LC_CTYPE inherited from
the environment (as opposed to a per-database fixed value).

That's a problem for the backend code that depends on libc functions
that themselves depend on LC_CTYPE, such as the full text search parser
and dictionaries.

For instance, if you start the instance with a C locale
(LC_ALL=C pg_ctl...) and try to use FTS in an ICU UTF-8 database,
it doesn't work:

template1=# create database "fr-utf8"
  template 'template0' encoding UTF8
  locale 'fr'
  collation_provider 'icu';

template1=# \c fr-utf8
You are now connected to database "fr-utf8" as user "daniel".

fr-utf8=# show lc_ctype;
 lc_ctype
----------
 fr
(1 row)

fr-utf8=# select to_tsvector('été');
ERROR:  invalid multibyte character for locale
HINT:  The server's LC_CTYPE locale is probably incompatible with the
database encoding.

If I peek into the "real" LC_CTYPE when connected to this database,
I can see it's "C":

fr-utf8=# create extension plperl;
CREATE EXTENSION

fr-utf8=# create function lc_ctype() returns text as '$ENV{LC_CTYPE};'
  language plperl;
CREATE FUNCTION

fr-utf8=# select lc_ctype();
 lc_ctype
----------
 C


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


