Re: Initcap works differently with different locale providers

From Alexander Korotkov
Subject Re: Initcap works differently with different locale providers
Date
Msg-id CAPpHfdvbib54J8NGcqr=FfrhLeyMFj20AuV1SaBQ_SGme9JnuQ@mail.gmail.com
In response to Re: Initcap works differently with different locale providers  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-docs
On Wed, Jul 30, 2025 at 10:58 PM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Mon, 2025-07-28 at 13:20 +0300, Alexander Korotkov wrote:
> > I can confirm initcap works with libc and libicu as you stated.  The
> > documentation patch looks good to me.  I’ve written a commit message.
> >  The REL_12_STABLE branch is not relevant anymore as it’s out of
> > support.  I’m going to push this if no objections.
>
> Apologies for the late review.
>
> First, it doesn't mention the "builtin" provider, which uses the same
> word break rules as libc.
>
> Second, word boundaries can be complex, and I'm wondering if we should
> not be so precise about what ICU does or doesn't do. For instance, ICU
> has options like U_TITLECASE_ADJUST_TO_CASED,
> U_TITLECASE_NO_BREAK_ADJUSTMENT, etc.[1], and I'm not sure exactly
> which one of those we use.

I think none of these options are used. Such options are handled by
ucasemap_toTitle() [1], whereas we call u_strToTitle() [2], which
takes no options.

Links
1. https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/ucasemap_8h.html#aa49d8b403bd91c52f127fe80679bac11
2. https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/ustring_8h.html#a47602e2c2012d77ee91908b9bbfdc063

------
Regards,
Alexander Korotkov
Supabase


