Обсуждение: convert text field to utf8 in sql_ascii database

Поиск
Список
Период
Сортировка

convert text field to utf8 in sql_ascii database

От
Andy Colson
Дата:
Hi All.

I am in the middle of a process to get all my data into utf8.  As its
not all converted yet, my database encoding is SQL_ASCII.

I am getting external apps fixed up to write utf8 to the database, and
so far so good.  But, I ran across some stuff that needs a one time
convert, and wanted to just write sql to update it.

I cant seem to figure it out.  I'm on Slackware 64, PG 9.0.4.

I tried:
update general set legal = convert(legal, 'windows_1250', 'utf8');

ERROR:  function convert(text, unknown, unknown) does not exist
HINT:  No function matches the given name and argument types. You might
need to add explicit type casts.


It looks like convert() only does bytea, and legal is defined as text.
Can't seem to cast it either (as legal::bytea).  The result is bytea, so
again would have to cast/convert?


Also, I'd like to add, I dont understand the online help:
http://www.postgresql.org/docs/9.0/static/functions-string.html

shows convert(string, src, dest), but then says see Table 9-7, which has
conversion names, like windows_1250_to_utf8, where do I use that?

The help also shows functions convert_to() and convert_from().  But I
dont understand how to use them.

update general set legal = convert_to(legal, 'utf8');

How does it know what source encoding to use?  And it converts to bytea
so how do I convert it back to text?

I'm a little confused.  I know I'm probably not doing this "the right
way" (c), I'm trying to get there eventually, but for now, I cant just
drop my database and recreate it as utf8.

Any hints?

-Andy

Re: convert text field to utf8 in sql_ascii database

От
"Albe Laurenz"
Дата:
Andy Colson wrote:
> I am in the middle of a process to get all my data into utf8.  As its
> not all converted yet, my database encoding is SQL_ASCII.
>
> I am getting external apps fixed up to write utf8 to the database, and
> so far so good.  But, I ran across some stuff that needs a one time
> convert, and wanted to just write sql to update it.
>
> I cant seem to figure it out.  I'm on Slackware 64, PG 9.0.4.
>
> I tried:
> update general set legal = convert(legal, 'windows_1250', 'utf8');
>
> ERROR:  function convert(text, unknown, unknown) does not exist
> HINT:  No function matches the given name and argument types. You might
> need to add explicit type casts.
>
>
> It looks like convert() only does bytea, and legal is defined as text.
> Can't seem to cast it either (as legal::bytea).  The result is bytea, so
> again would have to cast/convert?

Right.

> Also, I'd like to add, I dont understand the online help:
> http://www.postgresql.org/docs/9.0/static/functions-string.html
>
> shows convert(string, src, dest), but then says see Table 9-7, which has
> conversion names, like windows_1250_to_utf8, where do I use that?

You can only convert from encoding A to encoding B if there is a
corresponding A_to_B conversion available.

> The help also shows functions convert_to() and convert_from().  But I
> dont understand how to use them.
>
> update general set legal = convert_to(legal, 'utf8');
>
> How does it know what source encoding to use?  And it converts to bytea
> so how do I convert it back to text?

convert_to() and convert_from() can't do any conversion in an SQL_ASCII
database because there is no well-defined database encoding in that case.
They will "cast" between text and bytea but leave the contents unmodified.

You'll have to employ convert() to do the actual conversion.

If, for example, you know that a column is in WIN1252 encoding, you could
  UPDATE tab SET col =
    convert_from(
      convert(
        convert_to('schön', 'WIN1252'), 'WIN1252', 'UTF8'
      ), 'UTF8'
    );

Yours,
Laurenz Albe