Re: Pre-proposal: unicode normalized text

From	Jeff Davis
Subject	Re: Pre-proposal: unicode normalized text
Date	October 7, 2023, 04:18:01
Msg-id	96c0173c5156d365e132ec29e4873237be565743.camel@j-davis.com
In response to	Re: Pre-proposal: unicode normalized text (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Pre-proposal: unicode normalized text (Peter Eisentraut <peter@eisentraut.org>)
List	pgsql-hackers


On Wed, 2023-10-04 at 13:16 -0400, Robert Haas wrote:
> > At minimum I think we need to have some internal functions to check
> > for unassigned code points. That belongs in core, because we generate
> > the unicode tables from a specific version.
>
> That's a good idea.

Patch attached.

I added a new Perl script to parse UnicodeData.txt and generate a
lookup table (of ranges, which can be binary-searched).
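For readers who have not opened the patch, here is a minimal sketch of
what a binary search over such a range table could look like. It is not
the code from the patch: the names (demo_range, demo_lookup, the DEMO_*
constants) are hypothetical, the category values are illustrative, and
the table is a three-entry toy excerpt rather than generated output. The
sketch also assumes that a code point not covered by any range is
reported as unassigned, so results outside the three ranges shown are
not meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical category codes, for illustration only. */
typedef enum
{
	DEMO_UNASSIGNED = 0,
	DEMO_UPPERCASE_LETTER = 1,
	DEMO_LOWERCASE_LETTER = 2,
	DEMO_DECIMAL_NUMBER = 9
} demo_category;

/* One entry covers an inclusive range of code points. */
typedef struct
{
	uint32_t	first;
	uint32_t	last;
	demo_category category;
} demo_range;

/* Toy excerpt; a generated table would be sorted and non-overlapping. */
static const demo_range demo_ranges[] = {
	{0x0030, 0x0039, DEMO_DECIMAL_NUMBER},	/* 0-9 */
	{0x0041, 0x005A, DEMO_UPPERCASE_LETTER},	/* A-Z */
	{0x0061, 0x007A, DEMO_LOWERCASE_LETTER} /* a-z */
};

#define DEMO_NRANGES (sizeof(demo_ranges) / sizeof(demo_ranges[0]))

/* Binary search relies on this ordering, so check it in debug builds. */
void
demo_check_table(void)
{
	for (size_t i = 0; i < DEMO_NRANGES; i++)
	{
		assert(demo_ranges[i].first <= demo_ranges[i].last);
		if (i > 0)
			assert(demo_ranges[i - 1].last < demo_ranges[i].first);
	}
}

/* Return the category of cp, or DEMO_UNASSIGNED if no range contains it. */
demo_category
demo_lookup(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = DEMO_NRANGES;	/* search the half-open span [lo, hi) */

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp < demo_ranges[mid].first)
			hi = mid;
		else if (cp > demo_ranges[mid].last)
			lo = mid + 1;
		else
			return demo_ranges[mid].category;
	}
	return DEMO_UNASSIGNED;
}
```

Storing ranges rather than one entry per code point keeps the table
small, since long runs of code points share a category, and the lookup
cost stays logarithmic in the number of ranges.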

The C entry point does the same thing as u_charType(), and I also
matched the enum numeric values for convenience. I didn't use
u_charType() because I don't think this kind of unicode functionality
should depend on ICU, and I think it should match other Postgres
Unicode functionality.

Strictly speaking, I only needed to know whether it's unassigned or
not, not the general category. But it seemed easy enough to return the
general category, and it will be easier to create other potentially-
useful functions on top of this.
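As a rough illustration of building on top of a category lookup, the
sketch below checks whether a sequence of code points contains only
assigned ones. Everything here is hypothetical and not taken from the
patch: demo_all_assigned, the category_fn callback type, and the
stand-in demo_stub_category, which for demo purposes reports only
U+0378 as unassigned and returns a placeholder nonzero value for
everything else. A real caller would pass the actual category function
instead of the stub.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: category 0 means "unassigned", as in the sketch's stub. */
#define DEMO_CATEGORY_UNASSIGNED 0

/* Any function mapping a code point to a general category code. */
typedef int (*category_fn) (uint32_t cp);

/*
 * Stand-in for a real category lookup. For the demo it treats U+0378 as
 * unassigned and everything else as some assigned category (placeholder 1).
 */
int
demo_stub_category(uint32_t cp)
{
	return (cp == 0x0378) ? DEMO_CATEGORY_UNASSIGNED : 1;
}

/* True if every code point in cps[0..n-1] has a category other than 0. */
bool
demo_all_assigned(const uint32_t *cps, size_t n, category_fn get_category)
{
	assert(get_category != NULL);

	for (size_t i = 0; i < n; i++)
	{
		if (get_category(cps[i]) == DEMO_CATEGORY_UNASSIGNED)
			return false;
	}
	return true;
}
```

Returning the full category from the lookup and deriving the boolean
check from it means other predicates can reuse the same table without
another generated structure.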

The tests do require ICU though, because I compare with the results of
u_charType().

Regards,
    Jeff Davis

Attachments

v1-0001-Internal-functions-for-determining-Unicode-genera.patch
