Re: convert text field to utf8 in sql_ascii database

Поиск
Список
Период
Сортировка
От Albe Laurenz
Тема Re: convert text field to utf8 in sql_ascii database
Дата
Msg-id D960CB61B694CF459DCFB4B0128514C207201101@exadv11.host.magwien.gv.at
обсуждение исходный текст
Ответ на convert text field to utf8 in sql_ascii database  (Andy Colson <andy@squeakycode.net>)
Список pgsql-general
Andy Colson wrote:
> I am in the middle of a process to get all my data into utf8.  As its
> not all converted yet, my database encoding is SQL_ASCII.
>
> I am getting external apps fixed up to write utf8 to the database, and
> so far so good.  But, I ran across some stuff that needs a one time
> convert, and wanted to just write sql to update it.
>
> I cant seem to figure it out.  I'm on Slackware 64, PG 9.0.4.
>
> I tried:
> update general set legal = convert(legal, 'windows_1250', 'utf8');
>
> ERROR:  function convert(text, unknown, unknown) does not exist
> HINT:  No function matches the given name and argument types. You might
> need to add explicit type casts.
>
>
> It looks like convert() only does bytea, and legal is defined as text.
> Can't seem to cast it either (as legal::bytea).  The result is bytea, so
> again would have to cast/convert?

Right.

> Also, I'd like to add, I dont understand the online help:
> http://www.postgresql.org/docs/9.0/static/functions-string.html
>
> shows convert(string, src, dest), but then says see Table 9-7, which has
> conversion names, like windows_1250_to_utf8, where do I use that?

You can only convert from encoding A to encoding B if there is a
corresponding A_to_B conversion available.

> The help also shows functions convert_to() and convert_from().  But I
> dont understand how to use them.
>
> update general set legal = convert_to(legal, 'utf8');
>
> How does it know what source encoding to use?  And it converts to bytea
> so how do I convert it back to text?

convert_to() and convert_from() can't do any conversion in an SQL_ASCII
database because there is no well-defined database encoding in that case.
They will "cast" between text and bytea but leave the contents unmodified.

You'll have to employ convert() to do the actual conversion.

If, for example, you know that a column is in WIN1252 encoding, you could
  UPDATE tab SET col =
    convert_from(
      convert(
        convert_to('schön', 'WIN1252'), 'WIN1252', 'UTF8'
      ), 'UTF8'
    );

Yours,
Laurenz Albe

В списке pgsql-general по дате отправления:

Предыдущее
От: John R Pierce
Дата:
Сообщение: Re: pg_dump error
Следующее
От: Greg Smith
Дата:
Сообщение: Re: Performance degradation 8.4 -> 9.1