Re: Create collation reporting the ICU locale display name

From Peter Geoghegan
Subject Re: Create collation reporting the ICU locale display name
Date
Msg-id CAH2-Wzmo3jt6h0BEBYxDfxMJ+pcg7eCJxR3PNpg0XMsBap+iaQ@mail.gmail.com
In reply to Re: Create collation reporting the ICU locale display name  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Sep 14, 2019 at 8:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The advantage of describe_collation(oid) is that we would not be
> building knowledge into the callers about which columns of pg_collation
> matter for this purpose.  I'm not even convinced that the two you posit
> here are sufficient --- the encoding seems relevant, for instance.

+1. It seems like a good idea to consider the ICU display name to be
just that -- a display name. It should be considered a dynamic thing.
For one thing, it is subject to localization, so it isn't fixed even
when nothing changes internally. But there is also the question of
external changes. Internationalization is inherently a squishy
business.

I believe that the main goal of BCP 47 (i.e. ICU's CREATE COLLATION
locale strings) is to fail gracefully when cultural or political
developments occur that change the expectations of users. BCP 47 is
actually an IETF standard -- it's not from the Unicode consortium, or
from ICU. It is supposed to be highly forgiving -- this is a feature,
not a bug. Of course, many facets of a locale control things that we
don't care about, or at least don't involve ICU with. For example,
locale controls the default currency symbol.

There are pg_upgrade scenarios in which the display string for a
collation will legitimately change due to external changes. For
example, somebody who lived in Serbia and Montenegro (a country which
ceased to exist in 2006) could have used a locale string with "cs" (an
ISO 3166-1 code), which has been deprecated [1]. If memory serves,
there is a 5-year grace period codified by some ISO standard or other,
so recent ICU versions know nothing about Serbia and Montenegro
specifically. But they'll still recognize the Serbian language code,
as well as language codes for minority languages spoken in Serbia and
Montenegro. So, for the most part, the impact of sticking with this
old/somewhat inaccurate locale definition string is minimal.
(Actually, maybe downgrade scenarios are more interesting in
practice.)
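
To make the "fail gracefully" idea concrete, here is a minimal sketch of
the truncation-style lookup described in RFC 4647 (the matching companion
to BCP 47). It is not ICU's implementation and says nothing about what ICU
actually returns as a display name; the function name lookup() and the
KNOWN set are hypothetical, made up only for illustration. The point is
just that a tag with a region the library no longer knows can still
resolve to its language.

```python
def lookup(tag, known):
    """Return the longest prefix of `tag` found in `known`, else None.

    Illustrative only: progressively drops trailing subtags, in the
    spirit of the RFC 4647 "lookup" scheme. Not ICU's actual code.
    """
    subtags = tag.lower().replace("_", "-").split("-")
    known_lower = {k.lower(): k for k in known}
    while subtags:
        candidate = "-".join(subtags)
        if candidate in known_lower:
            return known_lower[candidate]
        subtags.pop()
        # A single-letter subtag left dangling at the end is dropped too.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None


# Hypothetical set of tags that a "newer" library still recognizes.
KNOWN = {"sr", "sr-RS", "sr-ME", "hu", "sq"}

print(lookup("sr-CS", KNOWN))  # region no longer known, language still is
print(lookup("sr-RS", KNOWN))  # exact match
print(lookup("xx-YY", KNOWN))  # nothing recognizable at all
```

With these assumed inputs, the old "sr-CS" tag degrades to plain "sr"
rather than failing outright, which is the sort of forgiving behavior
described above.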

[1] https://en.wikipedia.org/wiki/ISO_3166-2:CS#Codes_deleted_in_Newsletter_I-8
--
Peter Geoghegan


