Re: TM format can mix encodings in to_char()

From Tom Lane
Subject Re: TM format can mix encodings in to_char()
Msg-id 24472.1555775401@sss.pgh.pa.us
In reply to Re: TM format can mix encodings in to_char()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TM format can mix encodings in to_char()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Hmm.  I'd always imagined that the way that libc works is that LC_CTYPE
> determines the encoding (codeset) it's using across the board, so that
> functions like strftime would deliver data in that encoding.
> [ and much more based on that ]

After further study of the code, the situation seems less dire than
I feared yesterday.  In the first place, we disallow settings of
LC_COLLATE and LC_CTYPE that don't match the database encoding, see
tests in dbcommands.c's check_encoding_locale_matches() and in initdb.
So that core functionality will be consistent in any case.

Also, I see that PGLC_localeconv() is effectively doing exactly what
you suggested for strings that are encoded according to LC_MONETARY
and LC_NUMERIC:

        encoding = pg_get_encoding_from_locale(locale_monetary, true);

        db_encoding_convert(encoding, &worklconv.int_curr_symbol);
        db_encoding_convert(encoding, &worklconv.currency_symbol);
        ...

This is a little bit off, now that I look at it, because it's
failing to account for the possibility of getting -1 from
pg_get_encoding_from_locale.  It should probably do what
pg_bind_textdomain_codeset does:

    if (encoding < 0)
        encoding = PG_SQL_ASCII;

since passing PG_SQL_ASCII to the conversion will have the effect of
validating the data without any actual conversion.
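
To make the suggested fallback concrete, here is a minimal standalone sketch. It does not use the PostgreSQL sources: the enum values and both functions are hypothetical stand-ins, and the lookup only recognizes a couple of locale spellings. The point is only the shape of the guard, where an "unknown" result of -1 is mapped to an ASCII-style encoding before anything is handed to a conversion routine.

```c
#include <string.h>

/* Hypothetical encoding IDs for this sketch; not PostgreSQL's enum. */
enum sketch_encoding
{
    SKETCH_SQL_ASCII = 0,
    SKETCH_UTF8 = 1
};

/*
 * Toy stand-in for a locale-to-encoding lookup.  Returns an encoding ID,
 * or -1 when the codeset implied by the locale name is not recognized.
 */
static int
sketch_encoding_from_locale(const char *locale_name)
{
    const char *dot;

    if (strcmp(locale_name, "C") == 0 || strcmp(locale_name, "POSIX") == 0)
        return SKETCH_SQL_ASCII;

    /* Look at the codeset suffix after the last '.', if there is one. */
    dot = strrchr(locale_name, '.');
    if (dot != NULL &&
        (strcmp(dot + 1, "UTF-8") == 0 || strcmp(dot + 1, "utf8") == 0))
        return SKETCH_UTF8;

    return -1;                  /* unknown codeset */
}

/*
 * Apply the fallback: never pass a negative encoding ID downstream.
 * Unknown codesets are treated as SKETCH_SQL_ASCII.
 */
static int
sketch_effective_encoding(const char *locale_name)
{
    int         encoding = sketch_encoding_from_locale(locale_name);

    if (encoding < 0)
        encoding = SKETCH_SQL_ASCII;

    return encoding;
}
```

With these definitions, sketch_effective_encoding("en_US.UTF-8") yields SKETCH_UTF8, while an unrecognized name such as "xx_YY.unknown" yields SKETCH_SQL_ASCII rather than -1.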

I remain wary of this idea because it's depending on something that's
undefined per POSIX, but apparently it's working well enough for
LC_MONETARY and LC_NUMERIC, so we can probably get away with it for
LC_TIME as well.  Anyway the current code clearly does not work on
glibc, and I also verified that there's a problem on FreeBSD, so
this patch should make things better.
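
For illustration, one way to ask libc which codeset a given locale name implies is to switch LC_CTYPE to that locale temporarily and call nl_langinfo(CODESET). The sketch below is plain POSIX C, not PostgreSQL code; the helper names are hypothetical, and it assumes that nl_langinfo(CODESET) reflects the current LC_CTYPE setting. Applying the result to strings produced under LC_TIME relies on the same behavior that POSIX leaves undefined.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: malloc'd copy of a string, or NULL on failure. */
static char *
sketch_copy_string(const char *s)
{
    size_t      len = strlen(s) + 1;
    char       *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Hypothetical helper: return a malloc'd copy of the codeset name that
 * libc reports for locale_name, or NULL if the locale cannot be selected
 * or no codeset name is available.  The caller frees the result.
 */
static char *
sketch_codeset_for_locale(const char *locale_name)
{
    const char *current;
    char       *saved;
    char       *codeset = NULL;

    /* Remember the current LC_CTYPE so it can be restored afterwards. */
    current = setlocale(LC_CTYPE, NULL);
    if (current == NULL)
        return NULL;

    /* Copy it: the string returned by setlocale may be overwritten later. */
    saved = sketch_copy_string(current);
    if (saved == NULL)
        return NULL;

    if (setlocale(LC_CTYPE, locale_name) != NULL)
    {
        const char *name = nl_langinfo(CODESET);

        if (name != NULL && name[0] != '\0')
            codeset = sketch_copy_string(name);
    }

    /* Restore the previous LC_CTYPE whether or not the lookup worked. */
    setlocale(LC_CTYPE, saved);
    free(saved);

    return codeset;
}
```

A caller would pass the value of the LC_TIME setting as locale_name, map the returned codeset name to an encoding, and convert strftime's output from that encoding to the database encoding. The exact codeset string is platform-dependent, so it has to be matched against the spellings each libc uses.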

Also, experimentation suggests that LC_MESSAGES actually does work
the way I thought this stuff works, i.e., its implied codeset isn't
really used.  (I think this only matters for strerror(), since we
force the issue for gettext, but glibc's strerror() is clearly not
paying attention to that.)  Sigh, who needs consistency?

            regards, tom lane


