Re: [HACKERS] ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

From Peter Geoghegan
Subject Re: [HACKERS] ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)
Date
Msg-id CAH2-Wz=pA+ViKfPxGyBvyc41H4FhdHp=HUrmK9CDfnYdziXziQ@mail.gmail.com
In response to Re: [HACKERS] ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On Mon, Aug 7, 2017 at 2:50 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 8/6/17 20:07, Peter Geoghegan wrote:
>> I've looked into this. I'll give an example of what keyword variants
>> there are for Greek, and then discuss what I think each is.
>
> I'm not sure why we want to get into editorializing this.  We query ICU
> for the names of distinct collations and use that.

We ask ucol_getKeywordValuesForLocale() to get only "commonly used
[variant] values with the given locale" within
pg_import_system_collations(). So the editorializing has already
begun.

> It's more than most
> people need, sure, but it doesn't cost us anything.

It's also *less* than what other users need. I disagree on the cost of
redundancy among entries after initdb. It's just confusing to users,
and seems avoidable without adding special case logic. What's the
difference between el-u-co-standard-x-icu and el-x-icu?
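To make the redundancy concrete, here is a minimal sketch of how such duplicate entries could be spotted. It is not how pg_import_system_collations() works. The table of default collation types is a hypothetical stand-in (treating "standard" as the default type for "el" is an assumption made for illustration), and canonical_name() and redundant_groups() are made-up helper names.

```python
# Hypothetical table: the collation type assumed to apply when a locale
# tag carries no "co" keyword. The "el" entry is an illustrative assumption.
ASSUMED_DEFAULT_TYPE = {"el": "standard"}

ICU_SUFFIX = "-x-icu"


def canonical_name(collname: str) -> str:
    """Drop a '-u-co-<type>' keyword when <type> is the assumed default."""
    if not collname.endswith(ICU_SUFFIX):
        return collname
    tag = collname[: -len(ICU_SUFFIX)]
    lang, sep, coll_type = tag.partition("-u-co-")
    if sep and ASSUMED_DEFAULT_TYPE.get(lang) == coll_type:
        return lang + ICU_SUFFIX
    return collname


def redundant_groups(names):
    """Group collation names that reduce to the same canonical name."""
    groups = {}
    for name in names:
        groups.setdefault(canonical_name(name), []).append(name)
    # Only groups with more than one member are redundant.
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    # Sample input for illustration only, not an authoritative catalog listing.
    sample = ["el-x-icu", "el-u-co-standard-x-icu", "el-u-co-search-x-icu"]
    print(redundant_groups(sample))
```

Under that assumption, el-u-co-standard-x-icu and el-x-icu fall into one group, while a variant with a non-default type stays separate.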

> The alternatives
> are hand-maintaining a list of collations, or installing no collations
> by default.

A better alternative would be to actively take an interest in what
collations are created, by further refining the rules by which they
are created. We have a stable API, described by various standards,
that we can work with for this. This doesn't have to be a
maintainability burden. We can provide general guidance about how to
add stuff back within documentation.

I do think that we should actually list all the collations that are
available by default on some representative ICU version, once that
list is tightened up, just as other database systems list them. That
necessitates a little weasel wording that notes that later ICU
versions might add more, but that's not a problem IMV. I don't think
that CLDR will ever omit anything previously available, at least
within a reasonable timeframe [1].

[1] http://cldr.unicode.org/index/process/cldr-data-retention-policy
-- 
Peter Geoghegan


