Re: Pre-proposal: unicode normalized text

From: Nico Williams
Subject: Re: Pre-proposal: unicode normalized text
Msg-id: ZR8LLrk9AJVxEFbX@ubby21
In reply to: Re: Pre-proposal: unicode normalized text  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Pre-proposal: unicode normalized text  (Tom Lane <tgl@sss.pgh.pa.us>)
           Re: Pre-proposal: unicode normalized text  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Thu, Oct 05, 2023 at 07:31:54AM -0400, Robert Haas wrote:
> [...] On the other hand, to do that in PostgreSQL, we'd need to
> propagate the character set/encoding information into all of the
> places that currently get the typmod and collation, and that is not a
> small number of places. It's a lot of infrastructure for the project
> to carry around for a feature that's probably only going to continue
> to become less relevant.

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in.  That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.
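
To make the wire form concrete, here is a minimal sketch in Python.  The
tag values and the names ENCODING_TAGS, to_wire and from_wire are all
hypothetical, made up for illustration; nothing like this exists in the
PostgreSQL protocol today.

```python
from typing import Tuple

# Hypothetical one-byte tags; the numeric assignments are invented here.
ENCODING_TAGS = {"utf-8": 0x01, "iso-8859-1": 0x02, "shift_jis": 0x03, "euc-jp": 0x04}
TAG_TO_ENCODING = {tag: name for name, tag in ENCODING_TAGS.items()}


def to_wire(text: str, encoding: str) -> bytes:
    """Encode text and prepend a one-byte tag naming its codeset+encoding."""
    tag = ENCODING_TAGS[encoding]
    return bytes([tag]) + text.encode(encoding)


def from_wire(payload: bytes) -> Tuple[str, bytes]:
    """Split a wire value into (encoding name, raw bytes) without converting."""
    if not payload:
        raise ValueError("empty payload: missing encoding tag")
    encoding = TAG_TO_ENCODING[payload[0]]
    return encoding, payload[1:]


if __name__ == "__main__":
    wire = to_wire("日本語", "shift_jis")
    encoding, raw = from_wire(wire)
    # The receiver learns the encoding from the tag and decides what to do.
    print(encoding, raw.decode(encoding))
```

The receiver gets the bytes back untouched, bytea-style, and only the tag
says how to interpret them.  On disk the tag would be dropped, since the
column type would already carry that information.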

Complexity would creep in around when and whether to perform automatic
conversions.  The easy answer would be "never, on the server side", but
on the client side it might be useful to convert to/from the locale's
codeset+encoding when displaying to the user or accepting user input.

If there are no automatic server-side codeset/encoding conversions then
the server-side cost of supporting non-UTF-8 text should not be too high
dev-wise -- it's just (famous last words) a generic text type
parameterized by codeset+encoding.  There would not even be a hard
need for conversion functions, though there would be demand for
them.
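
As a rough illustration of a text type parameterized by encoding with no
automatic conversion, here is a sketch in Python.  EncodedText and its
methods are hypothetical names, and this says nothing about how such a
type would actually be built inside PostgreSQL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedText:
    """Hypothetical text value: raw bytes plus the encoding they are in."""
    data: bytes
    encoding: str

    def concat(self, other: "EncodedText") -> "EncodedText":
        # No automatic conversion: mixing encodings is an error, not a transcode.
        if self.encoding != other.encoding:
            raise TypeError(f"encoding mismatch: {self.encoding} vs {other.encoding}")
        return EncodedText(self.data + other.data, self.encoding)

    def convert(self, target: str) -> "EncodedText":
        # Explicit conversion, the optional kind of function users would ask for.
        return EncodedText(self.data.decode(self.encoding).encode(target), target)


if __name__ == "__main__":
    latin = EncodedText("é".encode("latin-1"), "latin-1")
    utf8 = EncodedText("é".encode("utf-8"), "utf-8")
    print(latin.convert("utf-8") == utf8)
```

Operations stay within one encoding, and any transcoding is something the
caller asks for explicitly.  The sketch compares encoding names as plain
strings, so it does not treat aliases such as "UTF-8" and "utf-8" as equal.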

But I agree that if there's no need, there's no need.  UTF-8 is great,
and if only all PG users would just switch then there would not be much
more to do.

Nico