Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47language tags. Should it?

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47language tags. Should it?
Дата
Msg-id CAH2-WznOpmJ+3xh6bvea_YUyd4ZdGiwG9ycE31Q09oU3XXw5vA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47language tags. Should it?  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Tue, Sep 19, 2017 at 7:01 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> I really think we need to add some kind of debug mode that makes ICU
> optionally spit out a locale display name at key points. We need this
> to gain confidence that the behavior that ICU provides actually
> matches what is expected across ICU different versions for different
> locale formats. We talked about this as a user-facing feature before,
> which can wait till v11; I just want this to debug problems like this
> one.

I patched CREATE COLLATION to show ICU display name, which produces
output like this:

postgres=# create collation basque (provider=icu,
locale='eu-u-kf-upper-kr-latn-digit-em-emoji-kn-true-co-eor');
NOTICE:  00000: ICU collation description is "Basque
(colcasefirst=upper, Sort Order=European Ordering Rules,
colnumeric=yes, colreorder=latn-digit, em=emoji)"
CREATE COLLATION

I used an ISO 639-1 language code (2 letter language code) above,
which, as we can see, is recognized as Basque. ICU is also fine with
the 3 letter 639-2 code "eus-", recognizing that as Basque, too. If I
use an ISO 639-2 code for Basque that ICU/CLDR doesn't like, "baq-*",
I can see that my expectations have only partially been met, since the
notice doesn't say anything about the language Basque:

postgres=# create collation actually_not_basque (provider=icu,
locale='baq-u-kf-upper-kr-latn-digit-em-emoji-kn-true-co-eor');
NOTICE:  00000: ICU collation description is "baq (colcasefirst=upper,
Sort Order=European Ordering Rules, colnumeric=yes,
colreorder=latn-digit, em=emoji)"
CREATE COLLATION

Functionality like this is starting to look essential to me, rather
than just a nice to have. Having this NOTICE would have made me
realize our problems with ICU versions < 54 much earlier, if nothing
else. If the purpose of NOTICE messages is to "Provide[s] information
that might be helpful to users", then I'd say that this definitely
qualifies. And, the extra code is trivial (we already get display name
in the context of initdb). I strongly recommend that we slip this into
v10, as part of fixing the problem with language tags that earlier ICU
versions have.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] hash index on unlogged tables doesn't behave as expected
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] Windows warnings from VS 2017