Re: Collation version tracking for macOS

From Thomas Munro
Subject Re: Collation version tracking for macOS
Msg-id CA+hUKGL5cYbrf3DXYNLBV78UXBiOaP-59MAzKFvC7dfT+49pTg@mail.gmail.com
In reply to Re: Collation version tracking for macOS  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Tue, Nov 29, 2022 at 3:55 PM Jeff Davis <pgsql@j-davis.com> wrote:
> =# select * from pg_icu_collation_versions('en_US') order by
> icu_version;
>  icu_version | uca_version | collator_version
> -------------+-------------+------------------
>  50.2        | 6.2         | 58.0.6.50
>  51.3        | 6.2         | 58.0.6.50
>  52.2        | 6.2         | 58.0.6.50
>  53.2        | 6.3         | 137.51
>  54.2        | 7.0         | 137.56
>  55.2        | 7.0         | 153.56
>  56.2        | 8.0         | 153.64
>  57.2        | 8.0         | 153.64
>  58.3        | 9.0         | 153.72
>  59.2        | 9.0         | 153.72
>  60.3        | 10.0        | 153.80
>  61.2        | 10.0        | 153.80
>  62.2        | 11.0        | 153.88
>  63.2        | 11.0        | 153.88
>  64.2        | 12.1        | 153.97
>  65.1        | 12.1        | 153.97
>  66.1        | 13.0        | 153.14
>  67.1        | 13.0        | 153.14
>  68.2        | 13.0        | 153.14
>  69.1        | 13.0        | 153.14
>  70.1        | 14.0        | 153.112
> (21 rows)
>
> This is good information, because it tells us that major library
> versions change more often than collation versions, empirically-
> speaking.

Wow, nice discovery about 104 -> 14 (the 153.14 reported for ICU 66
through 69, where UCA 13.0 would presumably give 153.104).  Yeah, I
imagine we'll want some kind of band-aid to tolerate that exact screwup
and avoid spurious warnings.
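
To make the band-aid idea concrete, here is a minimal Python sketch of a
version comparison that tolerates only that one mangled value.  The
function names are hypothetical and this is not how PostgreSQL does or
will handle it; it also assumes that a second component of 14 can never
be legitimate (it would correspond to UCA 1.6, which predates this
version scheme).

```python
def normalize_collator_version(version):
    """Map the mangled second component 14 back to 104 (assumed fix-up)."""
    parts = version.split(".")
    # Assumption: only the second component is affected, and only the
    # value 104 was ever reported as 14.
    if len(parts) >= 2 and parts[1] == "14":
        parts[1] = "104"
    return ".".join(parts)


def collator_versions_match(recorded, current):
    """Compare two collator version strings, ignoring the known mangling."""
    return normalize_collator_version(recorded) == normalize_collator_version(current)
```

With this, a recorded "153.14" and a current "153.104" compare as equal,
while a real change such as "153.14" versus "153.112" still counts as a
mismatch.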

Bugs aside, that's quite a revealing table in other ways.  We can see:

* The version scheme changed completely in ICU 53.  This corresponds
to a major rewrite of the collation code, I see[1].

* The first component seems to be (UCOL_RUNTIME_VERSION << 4) + 9.
UCOL_RUNTIME_VERSION is in their uvernum.h, currently 9, was 8, bumped
between 54 and 55 (I see this in their commit log).  That gives
(8 << 4) + 9 = 137 and (9 << 4) + 9 = 153, the two numbers that we see
there.  I don't know where the final 9 term is coming from, but it
looks stable since the v2 collation rewrite landed.

* The second component seems to be uca_version_major * 8 +
uca_version_minor (that's the Unicode Collation Algorithm version, and
so far always matches the Unicode version, visible in the output of
the other function).

* The values you showed for English don't have a third component, but
if you try some other locales like 'zh' you'll see the CLDR major
version in third position.  So I guess some locales depend on CLDR
data and others don't.
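
As a sanity check of the first two guesses above, here is a small
Python sketch that recomputes the expected version string and compares
it with some rows from the quoted table.  The helper name is made up,
the runtime version values (8 up to ICU 54, 9 from ICU 55) are taken
from the description above, and locales with a third (CLDR) component
are not covered.

```python
def expected_collator_version(runtime_version, uca_version):
    """Guess the two-component collator version for ICU 53 and later."""
    major_text, minor_text = uca_version.split(".")
    first = (runtime_version << 4) + 9          # e.g. (9 << 4) + 9 = 153
    second = int(major_text) * 8 + int(minor_text)
    return f"{first}.{second}"


# (icu_version, assumed UCOL_RUNTIME_VERSION, uca_version, reported version)
rows = [
    ("53.2", 8, "6.3", "137.51"),
    ("54.2", 8, "7.0", "137.56"),
    ("55.2", 9, "7.0", "153.56"),
    ("64.2", 9, "12.1", "153.97"),
    ("66.1", 9, "13.0", "153.14"),
    ("70.1", 9, "14.0", "153.112"),
]

# Collect rows where the reported string differs from the guess.
mismatches = [
    (icu, reported, expected_collator_version(runtime, uca))
    for icu, runtime, uca, reported in rows
    if reported != expected_collator_version(runtime, uca)
]
print(mismatches)
```

Of these sample rows, only the UCA 13.0 one disagrees with the guess:
the formula yields 153.104 where the table shows 153.14.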

TL;DR it *looks* like the set of ingredients for the version string is:

* UCOL_RUNTIME_VERSION (rarely changes)
* UCA/Unicode major.minor version
* sometimes the CLDR major version, not sure when
* a constant 9, origin unknown

[1] https://icu.unicode.org/design/collation/v2


