Re: Pre-proposal: unicode normalized text

From	Jeff Davis
Subject	Re: Pre-proposal: unicode normalized text
Date	October 3, 2023, 22:55:32
Msg-id	b28354e5b228ef3ec742112e11442486718336af.camel@j-davis.com
In reply to	Re: Pre-proposal: unicode normalized text (Peter Eisentraut <peter@eisentraut.org>)
List	pgsql-hackers

On Mon, 2023-10-02 at 10:47 +0200, Peter Eisentraut wrote:
> I think a better direction here would be to work toward making
> nondeterministic collations usable on the global/database level and
> then encouraging users to use those.
>
> It's also not clear which way the performance tradeoffs would fall.
>
> Nondeterministic collations are obviously going to be slower, but by
> how much?  People have accepted moving from C locale to "real" locales
> because they needed those semantics.  Would it be any worse moving
> from real locales to "even realer" locales?

If you normalize first, then you can get some semantic improvements
without giving up on the stability and performance of memcmp(). That
seems like a win with zero costs in terms of stability or performance
(except perhaps some extra text->utext casts).
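
To illustrate the point above, here is a minimal sketch in Python rather
than PostgreSQL code. It assumes NFC as the normalization form (this
message does not name one), and normalized_key is a hypothetical helper
introduced only for the example. It shows two encodings of the same
character comparing unequal bytewise until both are normalized, after
which a plain byte comparison (the equivalent of memcmp()) treats them
as equal.

```python
import unicodedata


def normalized_key(s: str) -> bytes:
    """Hypothetical helper: NFC-normalize, then encode to UTF-8.

    Comparing the returned bytes is a plain bytewise comparison,
    analogous to calling memcmp() on the stored representation.
    """
    return unicodedata.normalize("NFC", s).encode("utf-8")


composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# Without normalization the byte sequences differ, so a bytewise
# comparison treats the two strings as different values.
raw_equal = composed.encode("utf-8") == decomposed.encode("utf-8")

# After normalizing both sides, the byte sequences are identical.
normalized_equal = normalized_key(composed) == normalized_key(decomposed)

print(raw_equal, normalized_equal)
```

The normalization cost is paid once, when the value is produced; every
later comparison stays a bytewise one with no collation library involved.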

Going to a "real" locale gives more semantic benefits but at a very
high cost: depending on a collation provider library, dealing with
collation changes, and performance costs. While supporting the use of
nondeterministic collations at the database level may be a good idea,
it's not helping to reach the compromise that I'm trying to reach in
this thread.

Regards,
    Jeff Davis
