Re: Upcoming PG re-releases

From	Paul Lindner
Subject	Re: Upcoming PG re-releases
Date	4 December 2005, 15:25:29
Msg-id	20051204162520.GD10317@inuus.com
In reply to	Re: Upcoming PG re-releases (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses	Re: Upcoming PG re-releases Re: Upcoming PG re-releases
List	pgsql-hackers

On Sat, Dec 03, 2005 at 10:54:08AM -0500, Bruce Momjian wrote:
> Neil Conway wrote:
> > On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > > It's been about a month since 8.1.0 was released, and we've found about
> > > the usual number of bugs for a new release, so it seems like it's time
> > > for 8.1.1.
> >
> > I think one fix that should be made in time for 8.1.1 is adding a note
> > to the "version migration" section of the 8.1 release notes describing
> > the "invalid UTF-8 byte sequence" problems that some people have run
> > into when upgrading from prior versions. I'm not familiar enough with
> > the problem or its remedies to add the note myself, though.
>
> Agreed, but I don't understand the problem well enough either.  Does
> anyone?

There was a thread a couple of weeks back about this problem.  Here's
my sample writeup -- I give my permission for anyone to use it as they
see fit:

Upgrading UNICODE databases to 8.1

Postgres 8.1 includes a number of bug fixes and improvements to
Unicode and UTF-8 character handling.  Unfortunately, previous releases
would accept byte sequences that were not valid UTF-8.  This
may cause problems when you upgrade your database using
pg_dump/pg_restore: reloading the dump into 8.1 can fail with an
error message like this:
 Invalid UNICODE byte sequence detected near byte ...

To convert your pre-8.1 database to 8.1 you may have to remove and/or
fix the offending characters.  One simple way to fix the problem is to
run your pg_dump output through the iconv command like this:
 iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql

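The -c flag tells iconv to omit invalid characters from output.  Note
that the offending bytes are dropped, not repaired, so the affected
values lose those characters.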
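
To see what -c does before running it on a real dump, a small sketch
along these lines may help.  It assumes an iconv that accepts -c and the
encoding name UTF-8; the helper names has_invalid_utf8 and
strip_invalid_utf8 are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of PostgreSQL or iconv.

# Returns 0 if the file contains at least one invalid UTF-8 sequence.
# Without -c, iconv reports an error at the first invalid sequence.
has_invalid_utf8() {
    if iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Copies file $1 to file $2, dropping invalid sequences.
# Some iconv builds exit non-zero after dropping bytes with -c,
# so the exit status is deliberately ignored here.
strip_invalid_utf8() {
    iconv -c -f UTF-8 -t UTF-8 < "$1" > "$2" || true
}
```

Running has_invalid_utf8 on the dump first tells you whether any
cleanup is needed at all; comparing the input with the stripped copy
(with diff, for example) shows which lines were changed.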

There is one problem with this.  Most versions of iconv try to read
the entire input file into memory.  If your dump is quite large you
will need to split the dump into multiple files and convert each one
individually.  You must use the -l flag for split to ensure that
multi-byte UTF-8 sequences are not split across files.
  split -l 10000 dump.sql
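
The split, convert and reassemble steps could be wrapped up roughly as
follows.  This is only a sketch: the function name fix_dump_in_chunks,
the chunk. file prefix and the -a 4 suffix length are my own choices,
and it assumes a POSIX split plus an iconv that accepts -c.

```shell
#!/bin/sh
# Hypothetical helper: clean a large dump in line-based pieces.
# Usage: fix_dump_in_chunks dump.sql fixed.sql [lines_per_chunk]
fix_dump_in_chunks() {
    fd_in=$1
    fd_out=$2
    fd_lines=${3:-10000}
    fd_work=$(mktemp -d) || return 1

    # Split on line boundaries.  A newline byte never occurs inside a
    # multi-byte UTF-8 sequence, so no valid character is cut in half.
    # -a 4 allows far more chunks than the default two-letter suffix.
    split -a 4 -l "$fd_lines" "$fd_in" "$fd_work/chunk." || return 1

    : > "$fd_out"
    # The glob expands in sorted order, which matches the order of the
    # suffixes split generates (aaaa, aaab, aaac, ...).
    for fd_chunk in "$fd_work"/chunk.*; do
        [ -f "$fd_chunk" ] || continue   # empty input: no chunks written
        # Some iconv builds exit non-zero after dropping bytes with -c,
        # so the exit status is not treated as fatal here.
        iconv -c -f UTF-8 -t UTF-8 < "$fd_chunk" >> "$fd_out" || true
    done

    rm -rf "$fd_work"
}
```

Called as fix_dump_in_chunks dump.sql fixed.sql, it leaves the original
dump untouched, so the two files can be compared afterwards.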

Another possible solution is to use the --inserts flag to pg_dump.
This writes every row as a separate INSERT statement, so when you load
the resulting dump into 8.1 only the problem rows fail, and they show
up in your error log.
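
A rough sketch of that workflow follows.  The function names, the
database names olddb and newdb, and the file names are placeholders of
mine; the only flags used are pg_dump --inserts and psql -d / -f.

```shell
#!/bin/sh
# Hypothetical helpers; database and file names are placeholders.

# Dump database $1 to file $2 with one INSERT per row instead of COPY.
dump_with_inserts() {
    pg_dump --inserts "$1" > "$2"
}

# Load file $2 into database $1 and collect error output in file $3.
# psql keeps going after a failed statement unless ON_ERROR_STOP is
# set, so the rows that do load are kept and the failures are logged.
load_and_log_errors() {
    psql -d "$1" -f "$2" > /dev/null 2> "$3"
}
```

For example: dump_with_inserts olddb dump.sql, then
load_and_log_errors newdb dump.sql errors.log.  A dump made with
--inserts loads much more slowly than a COPY-based one, so this is
best used to find the bad rows rather than as the normal upgrade path.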

--
Paul Lindner        ||||| | | | |  |  |  |   |   |
lindner@inuus.com
