On 09/19/2014 08:50 AM, Tom Lane wrote:
> smcg2297@frii.com writes:
>> In a postgresql-9.3.1 database with UTF8 encoding I can do:
>
>> select convert_from (E'\\x68656c6c6f', 'LATIN1');
>> convert_from
>> --------------
>> hello
>
>> But when I explicitly give the "to" encoding:
>
>> select convert (E'\\x68656c6c6f', 'LATIN1', 'UTF8');
>> convert
>> --------------
>> \x68656c6c6f
>
>> Why does that second one give different results from the first?
>
> convert_from() produces a result of type text. But convert() returns
> bytea, because its output is not necessarily valid in the current
> database encoding. It's the same bytes, but it prints differently.
> (You could ameliorate the unreadability by changing the bytea_output
> setting.)
I my case I routinely deal with multi-byte utf-8 characters so the
"escape" format doesn't help much.
>> Second question: why is that none of the convert* functions are
>> marked as immutable (thus preventing me from creating a functional
>> index using them). Surely if I convert \x68 to utf-8 the result
>> will *always* be "h", won't it?
>
> Well, no, it'll be whatever the conversion function says it is;
> and encoding conversion functions are replaceable through DDL.
>
> I recall some discussion to the effect that that was silly and
> we should rip out CREATE CONVERSION and friends in favor of
> hard-wired conversion rules. Nothing's been done about it though.
Well, if it matters, count me as +1.
Thanks very much for the explanation, things are clearer now. :-)