Обсуждение: Restoring dtabase problem

Поиск
Список
Период
Сортировка

Restoring dtabase problem

От
"Benjamin Krajmalnik"
Дата:
I am trying to restore a database from a pg_dump.
I have a problematic issue in that a few rows have a problem with UTF-8 encoding.
All of these rows are in a specific table.
Is there something I can run against the table which will validate the rows which have the bad data and delete the rows?
 
The database is quite large (8GB) and I was informed that it would be much faster to restore using the COPY method as opposed to the INSERT method.
Any and all assistance will be deeply appreciated.
 

Regards,
 
Benjamin

Re: Restoring dtabase problem

От
Ivo Rossacher
Дата:
Am Freitag, 4. August 2006 19.06 schrieb Benjamin Krajmalnik:
> I am trying to restore a database from a pg_dump.
> I have a problematic issue in that a few rows have a problem with UTF-8
> encoding.
> All of these rows are in a specific table.
> Is there something I can run against the table which will validate the
> rows which have the bad data and delete the rows?

Typically this issue is related to a low number of invalid characters comming
up a lot of time in the file. So I would replace the invalid codes with the
correct unicode character by a text editor (take care to choose the correct
encoding in the editor). In the context of the rest of the text it becomes
mostly clear what the correct unicode character should be.

>
> The database is quite large (8GB) and I was informed that it would be
> much faster to restore using the COPY method as opposed to the INSERT
> method.

This is correct, but has to be done when dumping (see man pg_dump -> options
-d -D). 8GB, well this will take quite some time with inserts ...

Regards
Ivo

> Any and all assistance will be deeply appreciated.
>
>
> Regards,
>
> Benjamin

Re: Restoring dtabase problem

От
"Benjamin Krajmalnik"
Дата:
The problem is that I do not which rows have the problem.
Out of a few thousand rows, there are only about 15 rows with the
problem.
I was wondering if there is some validation which I can run as a script
against the table so it wil delete it.

How much faster is the copy method as opposed to the insert?
The only other idea which I had was to do a pg_dump of the schema, and
then individual pg_dumps for each table.
That way, I can use the copy method for all of the tables other than the
particular table.
I looked for an option to exclude a table from the dump, but did not
find such an option.

I am going to try a long shot right now since I am going from Windows to
FreeBSD.  I stopped Postgres on the Windows machine and tar'ed the data
directory.
Now I am going to move it to the FreeBSD box and see what happens.
Probably won't work, but what the heck!

As an alternative oprion, I may do a table dump of the table with the
problems in insert mode, turn on logging, and see if I can identify the
culprit rows that way and then delete them.




> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org
> [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Ivo Rossacher
> Sent: Friday, August 04, 2006 1:36 PM
> To: pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Restoring dtabase problem
>
> Am Freitag, 4. August 2006 19.06 schrieb Benjamin Krajmalnik:
> > I am trying to restore a database from a pg_dump.
> > I have a problematic issue in that a few rows have a problem with
> > UTF-8 encoding.
> > All of these rows are in a specific table.
> > Is there something I can run against the table which will
> validate the
> > rows which have the bad data and delete the rows?
>
> Typically this issue is related to a low number of invalid
> characters comming up a lot of time in the file. So I would
> replace the invalid codes with the correct unicode character
> by a text editor (take care to choose the correct encoding in
> the editor). In the context of the rest of the text it
> becomes mostly clear what the correct unicode character should be.
>
> >
> > The database is quite large (8GB) and I was informed that
> it would be
> > much faster to restore using the COPY method as opposed to
> the INSERT
> > method.
>
> This is correct, but has to be done when dumping (see man
> pg_dump -> options -d -D). 8GB, well this will take quite
> some time with inserts ...
>
> Regards
> Ivo
>
> > Any and all assistance will be deeply appreciated.
> >
> >
> > Regards,
> >
> > Benjamin
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org
> so that your
>        message can get through to the mailing list cleanly
>

Re: Restoring dtabase problem

От
"Joshua D. Drake"
Дата:
Benjamin Krajmalnik wrote:
> The problem is that I do not which rows have the problem.
> Out of a few thousand rows, there are only about 15 rows with the
> problem.
> I was wondering if there is some validation which I can run as a script
> against the table so it wil delete it.
>
> How much faster is the copy method as opposed to the insert?

Exponential.

Joshua D. Drake


--

    === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
    Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/