Обсуждение: Dropped table, no backup, restore from file system backup or WAL files?

Поиск
Список
Период
Сортировка

Dropped table, no backup, restore from file system backup or WAL files?

От
"John T. Dow"
Дата:
Less than a week ago we installed a database and new software in a production environment.

Today someone was attempting to install the same software in a test environment, and realized too late that a script he
ranwas run against the production database. It dropped three tables. 

We need to recover those tables. This is what we have to work with.

1) We have a file system backup from 3AM.

2) We have not yet instituted a daily postgres backup with pg_dump.

3) We have all the WAL files since going into production (unarchived, only 6 files needed to cover the period).

There are two ways I can think of to try to recover the data.

1) I presume that we can restore the entire directory from the 3am backup (to a different physical location of course)
andthen export the data in the three tables to csv files and reimport it. With that approach, is there anything that
shouldbe done to test the integrity of the data? 

2) I should think that I could also restore the data from the WAL files, but when I create a recovery.conf file and use
pg_resetxlog.exe,I can get it to do its thing without complaint (renames recovery.conf to recovery.done) or I get the
followingerror in pg_log: 

LOG:  database system was shut down at 2010-07-06 13:27:42 EDT
LOG:  starting archive recovery
LOG:  restore_command = 'donothing.bat'
LOG:  invalid magic number 0000 in log file 0, segment 31, offset 0
LOG:  invalid primary checkpoint record.
LOG:  invalid magic number 0000 in log file 0, segment 31, offset 0
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record

In neither case does it restore any data.

Note: for the recovery procedure, I have a BAT file that literally does nothing. The reason is that I've copied the
properWAL files into the pg_xlog directory, so there is no archiving per se. 

I have looked at the options pg_resetxlog.exe has and can't figure out if there's some parameter I can set to make it
restorethe data. Is it not possible, without doing a checkpoint? Can I construct a check point manually? 

John


Re: Dropped table, no backup, restore from file system backup or WAL files?

От
Jesper Krogh
Дата:
On 2010-07-07 03:24, John T. Dow wrote:
> In neither case does it restore any data.
>
> Note: for the recovery procedure, I have a BAT file that literally does nothing. The reason is that I've copied the
properWAL files into the pg_xlog directory, so there is no archiving per se. 
>
Ok, try to fix that.. as per:
24.3.3 here:
http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html

Assuming that the filesystem backup was made using
pg_start_backup()/pg_stop_backup() it will just work.

Jesper

Re: Dropped table, no backup, restore from file system backup or WAL files?

От
"John T. Dow"
Дата:
I changed the recovery.conf file so it has one statement:

restore_command = 'copy D:\devj\WCSD\received\pg_xlog\%f %p'



This is what I now get in pg_log:

LOG:  database system was shut down at 2010-07-07 14:48:34 EDT
LOG:  starting archive recovery
LOG:  restore_command = 'copy D:\devj\WCSD\received\pg_xlog\%f %p'
LOG:  could not open file "pg_xlog/0000000700000000000000BE" (log file 0, segment 190): No such file or directory
LOG:  invalid primary checkpoint record
LOG:  could not open file "pg_xlog/0000000700000000000000BE" (log file 0, segment 190): No such file or directory
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record



And as before, nothing is restored.

These are the "archived" files:

0000000100000000000000B6
0000000100000000000000B7
0000000100000000000000B8
0000000100000000000000B9
0000000100000000000000BA
0000000100000000000000BB
0000000100000000000000BC

It's looking for timeline 7 but the timeline of the archived files is 1. (Big deal, I can rename them.)

It's also looking for file BE, but the files are B6 through BC.


I also tried with this: restore_command = 'copy D:\devj\WCSD\received\%f %p'


John








On Wed, 07 Jul 2010 06:33:01 +0200, Jesper Krogh wrote:

>On 2010-07-07 03:24, John T. Dow wrote:
>In neither case does it restore any data.
>
>> Note: for the recovery procedure, I have a BAT file that literally does nothing. The reason is that I've copied the
properWAL files into the pg_xlog directory, so there is no archiving per se 
>Ok, try to fix that.. as per:
>24.3.3 here:
>http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html
>
>Assuming that the filesystem backup was made using
>pg_start_backup()/pg_stop_backup() it will just work.
>
>--
>Jesper





Re: Dropped table, no backup, restore from file system backup or WAL files?

От
"John T. Dow"
Дата:

On Wed, 07 Jul 2010 22:18:13 +0200, Jesper Krogh wrote:

>On 2010-07-07 20:54, John T. Dow wrote:
>> I changed the recovery.conf file so it has one statement:
>>
>> restore_command = 'copy D:\devj\WCSD\received\pg_xlog\%f %p'
>>
>>
>>
>> This is what I now get in pg_log:
>>
>> LOG:  database system was shut down at 2010-07-07 14:48:34 EDT
>> LOG:  starting archive recovery
>> LOG:  restore_command = 'copy D:\devj\WCSD\received\pg_xlog\%f %p'
>> LOG:  could not open file "pg_xlog/0000000700000000000000BE" (log file 0, segment 190): No such file or directory
>> LOG:  invalid primary checkpoint record
>> LOG:  could not open file "pg_xlog/0000000700000000000000BE" (log file 0, segment 190): No such file or directory
>> LOG:  invalid secondary checkpoint record
>> PANIC:  could not locate a valid checkpoint record
>>
>>
>>
>> And as before, nothing is restored.
>>
>> These are the "archived" files:
>>
>> 0000000100000000000000B6
>> 0000000100000000000000B7
>> 0000000100000000000000B8
>> 0000000100000000000000B9
>> 0000000100000000000000BA
>> 0000000100000000000000BB
>> 0000000100000000000000BC
>>
>> It's looking for timeline 7 but the timeline of the archived files is 1. (Big deal, I can rename them.)
>>
>> It's also looking for file BE, but the files are B6 through BC.
>>
>>
>> I also tried with this: restore_command = 'copy D:\devj\WCSD\received\%f %p'
>>
>>
>then you either have a wrong base backup (did you run
>pg_start_backup/pg_stop_backup)? or
>the wrong set of WAL-files for your database.
>
>--
>Jesper




We did not do pg_start_backup/pg_stop_backup.

I was naively hoping that the WAL files would work on a "similar" database (ie same tables, but created independently).

I understand that WAL files can only be applied to the database which they are based on, or an exact copy of the
database.Therefore we need to copy the existing database files (file copy) and then try this technique on the copy.
Thatwill take some doing, as people are not in the same city, don't work the same hours, etc. 

What if we never did pg_start_backup? Will it work anyway?

Also, it is unclear from the documentation whether you must have archived wal files to replay the wal files. There
seemsto be no way to replay them other than with recovery.conf, and recovery.conf seems to require a restore_command
thatactually does something. Is it necessary to clear any wal files from the pg_xlog directory, and is it necessary
thatall wal files to be replayed must be copied into pg_xlog using the restore_command? This things aren't really
statedvery clearly. 

John







Re: Dropped table, no backup, restore from file system backup or WAL files?

От
Tom Lane
Дата:
"John T. Dow" <john@johntdow.com> writes:
> We did not do pg_start_backup/pg_stop_backup.

Ugh.

> What if we never did pg_start_backup? Will it work anyway?

You could maybe make it work, if you had full_page_writes turned on
and have a continuous series of WAL files extending back to before
the manual filesystem backup was started.  What pg_start_backup mainly
does for you is to automate things and make sure there is a well-defined
spot at which a successful replay can be started.  It's *not* going to
"just work" without pg_start_backup, though.  You'd need to manually
fake up a suitable backup label file, and maybe some other hacking.
Otherwise what's likely to happen is that the recovery goes through
but leaves you with a corrupted database anyway.

If the data is worth this much trouble to you, I'd suggest hiring a
Postgres consultant who's experienced in data recovery.

> Also, it is unclear from the documentation whether you must have
> archived wal files to replay the wal files. There seems to be no way
> to replay them other than with recovery.conf, and recovery.conf seems
> to require a restore_command that actually does something.

You can just have it copy from pg_xlog, if all the files you need are
in pg_xlog.  That's a pretty uncommon situation though, so there's not
any special easy case for it.

            regards, tom lane

Re: Dropped table, no backup, restore from file system backup or WAL files?

От
"John T. Dow"
Дата:
Tom

You've helped clarify things. Thanks.

Our situation is kind of a special case and I was trying to learn from it what can be done with the WAL files.

We only started the production system a few days ago, so we have all the WAL files. I don't care of we'd end up with a
corrupteddatabase because I wouldn't do a recovery to the original database but to a copy. The intention is to copy out
thedata for three small tables. 

As I understand it, one should do a pg_start_backup, then do a file system backup, then do pg_stop_backup.

Several questions:

First, if the file system back is run automatically at 3AM, what's the best way to do the start/stop backup?

Second, what about doing a pg_dump instead of a file system backup?

Third, the manual in 22.3.2 says "It is also possible to make a backup dump while the postmaster is stopped. In this
case,you obviously cannot use pg_start_backup or pg_stop_backup, and you will therefore be left to your own devices to
keeptrack of which backup dump is which and how far back the associated WAL files go. It is generally better to follow
theon-line backup procedure above." That implies that I can do what I'm trying to do, but it doesn't say how. You
indicatethat it's messy. I hoped that meant that one can manually replay wal files, one by one, but I guess not. 

John




On Thu, 08 Jul 2010 16:58:18 -0400, Tom Lane wrote:

>"John T. Dow" <john@johntdow.com> writes:
>> We did not do pg_start_backup/pg_stop_backup.
>
>Ugh.
>
>> What if we never did pg_start_backup? Will it work anyway?
>
>You could maybe make it work, if you had full_page_writes turned on
>and have a continuous series of WAL files extending back to before
>the manual filesystem backup was started.  What pg_start_backup mainly
>does for you is to automate things and make sure there is a well-defined
>spot at which a successful replay can be started.  It's *not* going to
>"just work" without pg_start_backup, though.  You'd need to manually
>fake up a suitable backup label file, and maybe some other hacking.
>Otherwise what's likely to happen is that the recovery goes through
>but leaves you with a corrupted database anyway.
>
>If the data is worth this much trouble to you, I'd suggest hiring a
>Postgres consultant who's experienced in data recovery.
>
>> Also, it is unclear from the documentation whether you must have
>> archived wal files to replay the wal files. There seems to be no way
>> to replay them other than with recovery.conf, and recovery.conf seems
>> to require a restore_command that actually does something.
>
>You can just have it copy from pg_xlog, if all the files you need are
>in pg_xlog.  That's a pretty uncommon situation though, so there's not
>any special easy case for it.
>
>            regards, tom lane
>
>--
>Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-novice