What is the maximum encoding-conversion growth rate, anyway?

From: Tom Lane
Subject: What is the maximum encoding-conversion growth rate, anyway?
Msg-id: 29182.1180371229@sss.pgh.pa.us
Replies: Re: What is the maximum encoding-conversion growth rate, anyway?  (Tatsuo Ishii <ishii@postgresql.org>)
         Re: What is the maximum encoding-conversion growth rate, anyway?  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-hackers

I just rearranged the code in mbutils.c a little bit to make it more
robust if conversion of an over-length string is attempted, and noted
this comment:

/*
 * When converting strings between different encodings, we assume that space
 * for converted result is 4-to-1 growth in the worst case. The rate for
 * currently supported encoding pairs are within 3 (SJIS JIS X0201 half width
 * kanna -> UTF8 is the worst case).  So "4" should be enough for the moment.
 *
 * Note that this is not the same as the maximum character width in any
 * particular encoding.
 */
#define MAX_CONVERSION_GROWTH  4

It strikes me that this is overly pessimistic, since we do not support
5- or 6-byte UTF8 characters, and AFAICS there are no 1-byte characters
in any supported encoding that require 4 bytes in another.  Could we
reduce the multiplier to 3?  Or even 2?  This has a direct impact on the
longest COPY lines we can support, so I'd like it not to be larger than
necessary.
        regards, tom lane

