Re: The "char" type versus non-ASCII characters

From Tom Lane
Subject Re: The "char" type versus non-ASCII characters
Msg-id 2320640.1638560531@sss.pgh.pa.us
In response to Re: The "char" type versus non-ASCII characters  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: The "char" type versus non-ASCII characters  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> On 12/3/21 14:12, Tom Lane wrote:
>> I can think of at least three ways we might address this:
>> 
>> * Forbid all non-ASCII values for type "char".  This results in
>> simple and portable semantics, but it might break usages that
>> work okay today.
>> 
>> * Allow such values only in single-byte server encodings.  This
>> is a bit messy, but it wouldn't break any cases that are not
>> problematic already.
>> 
>> * Continue to allow non-ASCII values, but change charin/charout,
>> char_text, etc so that the external representation is encoding-safe
>> (perhaps make it an octal or decimal number).

> I don't like #2.

Yeah, it's definitely messy --- for example, maybe é works in
a latin1 database but is rejected when you try to restore into
a DB with utf8 encoding.

> Is #3 going to change the external representation only
> for non-ASCII values? If so, that seems OK.

Right, I envisioned that ASCII behaves the same but we'd use
a numeric representation for high-bit-set values.  These
cases could be told apart fairly easily by charin(), since
the numeric representation would always be three digits.
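
To make the idea concrete, here is a minimal sketch of how option #3 might
look, assuming an octal encoding (the proposal above leaves octal versus
decimal open). The names sketch_charout and sketch_charin are hypothetical
stand-ins, not the actual charin()/charout() code, and the "first byte wins"
fallback for other input is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers illustrating the idea; not PostgreSQL source code. */

/* Write the external form of one "char" byte into buf (at least 4 bytes). */
static void sketch_charout(unsigned char c, char *buf)
{
    assert(buf != NULL);
    if (c < 0x80) {
        /* ASCII: unchanged, emitted as a single character. */
        buf[0] = (char) c;
        buf[1] = '\0';
    } else {
        /* High bit set: always exactly three octal digits, 200..377. */
        snprintf(buf, 4, "%03o", (unsigned int) c);
    }
}

/* Parse the external form back into one byte. */
static unsigned char sketch_charin(const char *s)
{
    assert(s != NULL);
    /* Three octal digits whose value is >= 0200 denote a high-bit-set byte. */
    if (strlen(s) == 3 &&
        (s[0] == '2' || s[0] == '3') &&
        s[1] >= '0' && s[1] <= '7' &&
        s[2] >= '0' && s[2] <= '7') {
        return (unsigned char) (((s[0] - '0') << 6) |
                                ((s[1] - '0') << 3) |
                                (s[2] - '0'));
    }
    /* Assumed fallback: take the first byte (an empty string yields 0). */
    return (unsigned char) s[0];
}
```

With this sketch, the byte 0xE9 (é in latin1) is written as 351 and reads
back as 0xE9, while plain ASCII input such as a is untouched. One trade-off
worth noting: a three-character input like 234 is now read as a single
high-bit byte, so any caller relying on longer input being reduced to its
first character would see a change.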

> #1 is the simplest to implement and to understand,
> and I suspect it would break very little in practice, but others might
> disagree with that assessment.

We'd still have to decide what to do with pg_upgrade'd
non-ASCII values, so there's messiness there too.
Having charout() throw an error seems not very nice.

            regards, tom lane


