Re: Easy way to convert a database from WIN1252 to UTF8?

From: Mike Christensen
Subject: Re: Easy way to convert a database from WIN1252 to UTF8?
Msg-id: AANLkTinsI5eyfnZG-4JvjLEL9Q1SGEwi7wstsI3GUUwy@mail.gmail.com
In reply to: Re: Easy way to convert a database from WIN1252 to UTF8?  (Sam Mason <sam@samason.me.uk>)
List: pgsql-general
On Thu, Jul 1, 2010 at 10:07 AM, Sam Mason <sam@samason.me.uk> wrote:
> On Thu, Jul 01, 2010 at 10:01:02AM -0700, Mike Christensen wrote:
>> Yup, the problem is line 170 doesn't actually match up to the
>> DB.dbs.out file line 170 (which is a blank line).  I believe it means
>> line 170 from the stdin pipe it was processing for the copy command.
>
> Doh, that's annoying.  It would be nice to know that it's done the right
> thing rather than "some" thing.
>
>> Suffice to say, there was some weird character in my database that PG
>> can't automatically translate from WIN1252 to UTF8, and apparently it
>> will drop that /entire/ COPY command (the entire table doesn't get
>> populated!)..
>
> Yup, this is deliberate.  You can also run psql with "-1" to put the
> whole lot (i.e. every table/view/... creation and data insert) in a
> transaction which will cause the whole restore to be rolled back if
> something doesn't look right as well.
>
>> As to what character was the culprit, I'm not entirely sure how to
>> figure this out.  I guess I could look for that hex value?  However,
>> if I set the encoding in the script itself, everything works
>> perfectly.
>
> PG is doing the right thing, 9D is undefined in Win1252.  I guess you've
> either got other problems or this was just an artifact of converting
> from Win1252 to UTF8 external to PG and then not telling it that you'd
> done that.
>
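
To illustrate Sam's point, here is a minimal sketch of how a stray 9D byte
can appear when text that is already UTF8 gets read as WIN1252.  It uses
Python's cp1252 codec as a stand-in for PG's WIN1252 conversion (an
assumption; the two agree that 9D is undefined), and U+201D is only a
hypothetical example character, since the thread never identifies the real
culprit.

```python
# U+201D (right double quotation mark) is a hypothetical example;
# the thread does not say which character was actually in the data.
utf8_bytes = "\u201d".encode("utf-8")
print(utf8_bytes.hex())  # the three UTF-8 bytes, ending in 9d


def decodes_as_win1252(data: bytes) -> bool:
    """Return True if every byte in data is defined in Windows-1252."""
    try:
        data.decode("cp1252")
        return True
    except UnicodeDecodeError:
        return False


# The UTF-8 bytes contain 0x9D, which Windows-1252 leaves undefined.
print(decodes_as_win1252(utf8_bytes))
# Genuine Windows-1252 curly quotes (0x93 and 0x94) decode without error.
print(decodes_as_win1252(b"\x93quoted\x94"))
```

If the dump had already been converted to UTF8 outside PG, telling PG it
was still WIN1252 would fail in just this way.
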
> --
>  Sam  http://samason.me.uk/
>

Yeah, looking at the lines in question, I don't really see anything
wrong with them.  Everything goes into the database as UTF8, so
maybe some weird characters got stuck in there somehow under the old
default encoding.  This is the main reason I'm converting to UTF8
now, so the data will be consistent across all layers.  Good to get these
bugs out of the way while the data set is still relatively small.
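
Since the offending bytes are not visible when eyeballing the lines, one
way to hunt for them would be to scan the raw dump for byte values that
WIN1252 leaves undefined.  The following is only a sketch: the function
name and the sample data are made up for illustration, and the set of
undefined bytes is taken from Python's cp1252 codec rather than from PG
itself.

```python
# Bytes that Windows-1252 leaves unassigned (per Python's cp1252 codec).
UNDEFINED_WIN1252 = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def find_undefined_bytes(data: bytes):
    """Yield (line_number, offset_in_line, byte_value) for every byte in
    data that has no Windows-1252 mapping.  Line numbers are 1-based,
    offsets are 0-based byte positions within the line."""
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        for offset, value in enumerate(line):
            if value in UNDEFINED_WIN1252:
                yield line_number, offset, value


# Hypothetical sample standing in for a few lines of COPY data.
sample = b"1\tplain text\n2\tcurly \xe2\x80\x9d quote\n"

for line_number, offset, value in find_undefined_bytes(sample):
    print(f"line {line_number}, offset {offset}: 0x{value:02X}")
```

On a real dump the input would be read in binary mode, for example
open("DB.dbs.out", "rb").read().  The line numbers found this way are
file line numbers, so they will not necessarily match the line number in
the error, which appears to count lines within the COPY input.
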

If anyone wants me to do any more debugging, I'd be more than happy to,
but I'm satisfied with the results.  Thanks!

Mike
