Discussion: Point-In-Time Recovery not working
Hi, new to the list here.

I'm running a CalDAV server with a PostgreSQL backend. A user removed his agenda, and I want to revert to a state before the 20th of August in order to restore it. I have a configuration with archive_mode = on in postgresql.conf and a timeout of 24h. I have a dump of the original database and all WAL files from the last 18 months. I want to restore the database and replay all WAL to bring it back to its last working state.

I've tried two options based on the documentation:
http://www.postgresql.org/docs/8.3/static/continuous-archiving.html

Option one:
- clear /var/lib/postgresql/8.3/main
- create a new main folder as user postgres: [code]initdb -d /var/lib/postgresql/8.3/main[/code]
- start the server (this is where it goes wrong, I suppose)
- restore the original DB from the dump: [code]psql -f backup.dump postgres[/code]
- stop the server
- create a recovery script /var/lib/postgresql/8.3/main/recovery.conf:
[code]restore_command = 'cp /BACKUP/davical/postgresql/%f %p'
recovery_target_time = '2011-08-20 22:39:00 EST'
recovery_target_inclusive = 'false'[/code]
- start the server

This gives the following:
[code]2011-08-26 15:31:47 CEST LOG: database system was shut down at 2011-08-26 15:31:40 CEST
2011-08-26 15:31:47 CEST LOG: could not open file "pg_xlog/000000010000000000000000" (log file 0, segment 0): No such file or directory
2011-08-26 15:31:47 CEST LOG: invalid primary checkpoint record
2011-08-26 15:31:47 CEST LOG: could not open file "pg_xlog/000000010000000000000000" (log file 0, segment 0): No such file or directory
2011-08-26 15:31:47 CEST LOG: invalid secondary checkpoint record
2011-08-26 15:31:47 CEST PANIC: could not locate a valid checkpoint record
2011-08-26 15:31:47 CEST LOG: startup process (PID 6193) was terminated by signal 6: Aborted
2011-08-26 15:31:47 CEST LOG: aborting startup due to startup process failure[/code]

I suppose that by starting the server the xlog entries are no longer consistent. But I am not able to restore the dump while the server is not running?

Second option:
- remove /var/lib/postgresql/8.3/main
- create a new main folder from scratch: [code]initdb -d /var/lib/postgresql/8.3/main[/code]
- recover the original DB from a tar.gz file by copying the contents directly into /var/lib/postgresql/8.3/main/base
- create a recovery script /var/lib/postgresql/8.3/main/recovery.conf:
[code]restore_command = 'cp /BACKUP/davical/postgresql/%f %p'
recovery_target_time = '2011-08-20 22:39:00 EST'
recovery_target_inclusive = 'false'[/code]
- restart the server

[code]2011-08-26 16:59:07 CEST LOG: database system was shut down at 2011-08-26 16:54:54 CEST
2011-08-26 16:59:07 CEST LOG: starting archive recovery
2011-08-26 16:59:07 CEST LOG: restore_command = 'cp /BACKUP/davical/postgresql/%f %p'
2011-08-26 16:59:07 CEST LOG: recovery_target_time = '2011-08-21 05:39:00+02'
2011-08-26 16:59:07 CEST LOG: recovery_target_inclusive = false
cp: cannot stat `/BACKUP/davical/postgresql/00000001.history': No such file or directory
cp: cannot stat `/BACKUP/davical/postgresql/000000010000000000000000': No such file or directory
2011-08-26 16:59:07 CEST LOG: automatic recovery in progress
2011-08-26 16:59:07 CEST LOG: record with zero length at 0/440C10
2011-08-26 16:59:07 CEST LOG: redo is not required
cp: cannot stat `/BACKUP/davical/postgresql/000000010000000000000000': No such file or directory
cp: cannot stat `/BACKUP/davical/postgresql/00000002.history': No such file or directory
2011-08-26 16:59:07 CEST LOG: selected new timeline ID: 2
cp: cannot stat `/BACKUP/davical/postgresql/00000001.history': No such file or directory
2011-08-26 16:59:08 CEST LOG: incomplete startup packet
2011-08-26 16:59:08 CEST LOG: archive recovery complete
2011-08-26 16:59:08 CEST LOG: autovacuum launcher started
2011-08-26 16:59:08 CEST LOG: database system is ready to accept connections
2011-08-26 16:59:08 CEST LOG: archive command failed with exit code 1
2011-08-26 16:59:08 CEST DETAIL: The failed archive command was: /usr/local/bin/pg_backup pg_xlog/00000002.history 00000002.history
2011-08-26 16:59:09 CEST LOG: archive command failed with exit code 1
2011-08-26 16:59:09 CEST DETAIL: The failed archive command was: /usr/local/bin/pg_backup pg_xlog/00000002.history 00000002.history
2011-08-26 16:59:10 CEST LOG: archive command failed with exit code 1
2011-08-26 16:59:10 CEST DETAIL: The failed archive command was: /usr/local/bin/pg_backup pg_xlog/00000002.history 00000002.history
2011-08-26 16:59:10 CEST WARNING: transaction log file "00000002.history" could not be archived: too many failures[/code]

This gives no errors, but it also does not recover any records in the database. What am I missing?

Gijs
gais <gais@alpenjodel.de> wrote:

> I'm running a caldav server with postgresql backend. For some
> reason a user removed his agenda, I want to revert to a state
> before 20th of August in order to restore this. I have a
> configuration with archive_mode = on in postgresql.conf and a
> timeout of 24h. I have a dump from the original database and all
> WAL files from the last 18 months. I want to restore the database
> and run all WAL to restore to it's last working state.
>
> I've tried two options based on the documentation:
> http://www.postgresql.org/docs/8.3/static/continuous-archiving.html
> option one:
> [code]initdb -d /var/lib/postgresql/8.3/main[/code]
> second option:
> [code]initdb -d /var/lib/postgresql/8.3/main[/code]

Not based very closely on the documentation. Please read the "Recovering using a Continuous Archive Backup" subsection closely and follow the directions there. In particular, initdb is not run as part of the recovery. If you follow the directions step by step, you should get farther. If you still have problems, feel free to post again.

-Kevin
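For anyone following along, the file-level part of the procedure in that subsection can be sketched roughly as below. This is only a minimal sketch, not a script from the documentation or from anyone on the list. It assumes a file-system-level base backup (taken between pg_start_backup() and pg_stop_backup()) is available, since a pg_dump SQL dump cannot serve as the starting point for WAL replay. The function name prepare_recovery and every path passed to it are hypothetical, and the server must already be stopped before it is run.

```shell
#!/bin/sh
# Sketch: prepare a data directory for archive recovery (PostgreSQL 8.3 layout).
# prepare_recovery is a hypothetical helper; all paths are supplied by the caller.
# Assumes the server is stopped and base_backup is an unpacked file-system-level
# base backup, not a pg_dump SQL dump.
prepare_recovery() {
    data_dir=$1; base_backup=$2; archive_dir=$3; target_time=$4

    [ -d "$base_backup" ] || { echo "no base backup at $base_backup" >&2; return 1; }

    # Keep the old cluster aside; its pg_xlog may still hold unarchived segments.
    if [ -d "$data_dir" ]; then
        mv "$data_dir" "$data_dir.old.$$" || return 1
    fi

    # Restore the base backup files into a fresh data directory (no initdb).
    mkdir -p "$data_dir" || return 1
    cp -Rp "$base_backup"/. "$data_dir"/ || return 1
    chmod 700 "$data_dir"

    # WAL files that came with the base backup are stale: start with an empty pg_xlog.
    rm -rf "$data_dir/pg_xlog"
    mkdir -p "$data_dir/pg_xlog/archive_status"

    # Recovery settings, mirroring the ones used earlier in this thread.
    cat > "$data_dir/recovery.conf" <<EOF
restore_command = 'cp $archive_dir/%f %p'
recovery_target_time = '$target_time'
recovery_target_inclusive = 'false'
EOF
}
```

It would be called as, for example, prepare_recovery /var/lib/postgresql/8.3/main /some/unpacked/base_backup /BACKUP/davical/postgresql '2011-08-20 22:39:00 EST', where the base backup path is made up. Starting the server afterwards begins archive recovery, and recovery.conf is renamed to recovery.done once it completes.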
I've tried to follow the directions as closely as possible, but the server still won't start. I do get a different PANIC:

[code]2011-08-26 23:55:08 CEST LOG: database system was shut down at 2011-08-26 22:08:35 CEST
2011-08-26 23:55:08 CEST LOG: starting archive recovery
2011-08-26 23:55:08 CEST LOG: restore_command = 'cp /var/lib/postgresql/8.3/main/archive/%f %p'
cp: cannot stat `/var/lib/postgresql/8.3/main/archive/00000001.history': No such file or directory
2011-08-26 23:55:08 CEST LOG: restored log file "0000000100000001000000EE" from archive
2011-08-26 23:55:08 CEST LOG: unexpected pageaddr 1/EB01C000 in log file 1, segment 238, offset 114688
2011-08-26 23:55:08 CEST LOG: invalid primary checkpoint record
2011-08-26 23:55:08 CEST LOG: restored log file "0000000100000001000000EE" from archive
2011-08-26 23:55:08 CEST LOG: invalid xl_info in secondary checkpoint record
2011-08-26 23:55:08 CEST PANIC: could not locate a valid checkpoint record
2011-08-26 23:55:08 CEST LOG: startup process (PID 8975) was terminated by signal 6: Aborted
2011-08-26 23:55:08 CEST LOG: aborting startup due to startup process failure[/code]

Somehow I think the database files I used for the restore are inconsistent with the WALs. Is there a workaround, or am I still doing something completely foolish?

> Not based very closely on the documentation. Please read the
> "Recovering using a Continuous Archive Backup" subsection closely
> and follow the directions there. In particular, initdb is not run
> as part of the recovery. If you follow the directions step by step,
> you should get farther. If you still have problems, feel free to
> post again.
>
> -Kevin
> I've tried to follow the directions as closely as possible, but still
> the server won't start. I do get a different PANIC:
>
> 2011-08-26 23:55:08 CEST LOG: database system was shut down at
> 2011-08-26 22:08:35 CEST
> 2011-08-26 23:55:08 CEST LOG: starting archive recovery
> 2011-08-26 23:55:08 CEST LOG: restore_command = 'cp
> /var/lib/postgresql/8.3/main/archive/%f %p'
> cp: cannot stat
> `/var/lib/postgresql/8.3/main/archive/00000001.history': No such file
> or directory
> 2011-08-26 23:55:08 CEST LOG: restored log file
> "0000000100000001000000EE" from archive
> 2011-08-26 23:55:08 CEST LOG: unexpected pageaddr 1/EB01C000 in log
> file 1, segment 238, offset 114688
> 2011-08-26 23:55:08 CEST LOG: invalid primary checkpoint record
> 2011-08-26 23:55:08 CEST LOG: restored log file
> "0000000100000001000000EE" from archive
> 2011-08-26 23:55:08 CEST LOG: invalid xl_info in secondary checkpoint
> record
> 2011-08-26 23:55:08 CEST PANIC: could not locate a valid checkpoint
> record
> 2011-08-26 23:55:08 CEST LOG: startup process (PID 8975) was
> terminated by signal 6: Aborted
> 2011-08-26 23:55:08 CEST LOG: aborting startup due to startup process
> failure
>
> Somehow I think the database files I used to restore are inconsistent
> with the WAL's. Is there a workaround or am I still doing something
> completely foolish?

OK, I did some more testing, and it seems the pg_control file in my global directory does not match the WAL files. Is it still possible to recover the data? I came across some pages claiming that you should be able to rebuild the control file from the WAL. Is that right?
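As a side note on reading these logs: the segment file name in the "restored log file" line and the "log file 1, segment 238" in the next line refer to the same segment. In the 8.3 naming scheme the file name is the timeline ID, the log file number and the segment number, each printed as eight hexadecimal digits. The helper below is only an illustrative sketch; wal_segment_name is a made-up name, not a PostgreSQL tool.

```shell
#!/bin/sh
# Sketch: build a WAL segment file name (PostgreSQL 8.3 naming) from its parts.
# wal_segment_name is a hypothetical helper, not something shipped with PostgreSQL.
wal_segment_name() {
    # $1 = timeline ID, $2 = log file number, $3 = segment number (all decimal)
    printf '%08X%08X%08X\n' "$1" "$2" "$3"
}

wal_segment_name 1 1 238   # prints 0000000100000001000000EE (segment restored above)
wal_segment_name 1 0 0     # prints 000000010000000000000000 (segment missing in the first attempt)
```

This makes it easier to check whether the segment that pg_control expects is actually present in the archive directory named in restore_command.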