Discussion: Point-In-Time Recovery not working

Point-In-Time Recovery not working

From
gais
Date:
Hi, new to the list here.

I'm running a CalDAV server with a PostgreSQL backend. For some reason a
user removed his agenda, and I want to revert to a state before the 20th of
August in order to restore it. I have a configuration with
archive_mode = on in postgresql.conf and a timeout of 24h. I have a dump
from the original database and all WAL files from the last 18 months. I
want to restore the database and replay all WAL to bring it back to its
last working state.

I've tried two options based on the documentation:
http://www.postgresql.org/docs/8.3/static/continuous-archiving.html

option one:

- clear /var/lib/postgresql/8.3/main
- create a new main folder as user postgres:
[code]initdb -d /var/lib/postgresql/8.3/main[/code]
- start the server (this is where it goes wrong, I suppose)
- restore the original DB from the dump:
[code]psql -f backup.dump postgres[/code]
- stop the server
- create a recovery script /var/lib/postgresql/8.3/main/recovery.conf
[code]restore_command = 'cp /BACKUP/davical/postgresql/%f %p'
recovery_target_time = '2011-08-20 22:39:00 EST'
recovery_target_inclusive = 'false'[/code]

- start the server

This gives the following:
[code]2011-08-26 15:31:47 CEST LOG:  database system was shut down at
2011-08-26 15:31:40 CEST
2011-08-26 15:31:47 CEST LOG:  could not open file
"pg_xlog/000000010000000000000000" (log file 0, segment 0): No such file
or directory
2011-08-26 15:31:47 CEST LOG:  invalid primary checkpoint record
2011-08-26 15:31:47 CEST LOG:  could not open file
"pg_xlog/000000010000000000000000" (log file 0, segment 0): No such file
or directory
2011-08-26 15:31:47 CEST LOG:  invalid secondary checkpoint record
2011-08-26 15:31:47 CEST PANIC:  could not locate a valid checkpoint record
2011-08-26 15:31:47 CEST LOG:  startup process (PID 6193) was terminated
by signal 6: Aborted
2011-08-26 15:31:47 CEST LOG:  aborting startup due to startup process
failure
[/code]

I suppose by starting the server the xlog-entries are no longer
consistent. But I am not able to restore the dump while the server is
not running?


second option:

- remove /var/lib/postgresql/8.3/main
- create a new main folder from scratch
[code]initdb -d /var/lib/postgresql/8.3/main[/code]
- recover the original DB from a tar.gz file by copying the contents
directly into /var/lib/postgresql/8.3/main/base
- create a recovery script /var/lib/postgresql/8.3/main/recovery.conf:
[code]restore_command = 'cp /BACKUP/davical/postgresql/%f %p'
recovery_target_time = '2011-08-20 22:39:00 EST'
recovery_target_inclusive = 'false'[/code]

- restart the server

[code]2011-08-26 16:59:07 CEST LOG:  database system was shut down at
2011-08-26 16:54:54 CEST
2011-08-26 16:59:07 CEST LOG:  starting archive recovery
2011-08-26 16:59:07 CEST LOG:  restore_command = 'cp
/BACKUP/davical/postgresql/%f %p'
2011-08-26 16:59:07 CEST LOG:  recovery_target_time = '2011-08-21
05:39:00+02'
2011-08-26 16:59:07 CEST LOG:  recovery_target_inclusive = false
cp: cannot stat `/BACKUP/davical/postgresql/00000001.history': No such
file or directory
cp: cannot stat `/BACKUP/davical/postgresql/000000010000000000000000':
No such file or directory
2011-08-26 16:59:07 CEST LOG:  automatic recovery in progress
2011-08-26 16:59:07 CEST LOG:  record with zero length at 0/440C10
2011-08-26 16:59:07 CEST LOG:  redo is not required
cp: cannot stat `/BACKUP/davical/postgresql/000000010000000000000000':
No such file or directory
cp: cannot stat `/BACKUP/davical/postgresql/00000002.history': No such
file or directory
2011-08-26 16:59:07 CEST LOG:  selected new timeline ID: 2
cp: cannot stat `/BACKUP/davical/postgresql/00000001.history': No such
file or directory
2011-08-26 16:59:08 CEST LOG:  incomplete startup packet
2011-08-26 16:59:08 CEST LOG:  archive recovery complete
2011-08-26 16:59:08 CEST LOG:  autovacuum launcher started
2011-08-26 16:59:08 CEST LOG:  database system is ready to accept
connections
2011-08-26 16:59:08 CEST LOG:  archive command failed with exit code 1
2011-08-26 16:59:08 CEST DETAIL:  The failed archive command was:
/usr/local/bin/pg_backup pg_xlog/00000002.history 00000002.history
2011-08-26 16:59:09 CEST LOG:  archive command failed with exit code 1
2011-08-26 16:59:09 CEST DETAIL:  The failed archive command was:
/usr/local/bin/pg_backup pg_xlog/00000002.history 00000002.history
2011-08-26 16:59:10 CEST LOG:  archive command failed with exit code 1
2011-08-26 16:59:10 CEST DETAIL:  The failed archive command was:
/usr/local/bin/pg_backup pg_xlog/00000002.history 00000002.history
2011-08-26 16:59:10 CEST WARNING:  transaction log file
"00000002.history" could not be archived: too many failures
[/code]


This gives no errors, but it also does not recover any records in the
database. What am I missing?
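
One detail in the log above may be worth checking. The target time was written with the zone abbreviation EST, and the server reports it as 2011-08-21 05:39:00+02, so EST was read as UTC-5. If the intended moment was 22:39 local (CEST) time on 20 August, the target ends up about seven hours later than meant. A minimal sketch of the same conversion, assuming GNU date is available (the helper name to_utc is made up for illustration):

```shell
# Convert a timestamp carrying a zone abbreviation to UTC.
# Assumes GNU date; the -d option is not portable to BSD date.
to_utc() {
    date -u -d "$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# The value used in recovery.conf above:
to_utc '2011-08-20 22:39:00 EST'
```

If local time was intended, writing the offset explicitly in recovery_target_time (for example '2011-08-20 22:39:00+02') avoids the ambiguity.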

Gijs

Re: Point-In-Time Recovery not working

From
"Kevin Grittner"
Date:
gais <gais@alpenjodel.de> wrote:

> I'm running a caldav server with postgresql backend. For some
> reason a user removed his agenda, I want to revert to a state
> before 20th of August in order to restore this. I have a
> configuration with archive_mode = on in postgresql.conf and a
> timeout of 24h. I have a dump from the original database and all
> WAL files from the last 18 months. I want to restore the database
> and run all WAL to restore to its last working state.
>
> I've tried two options based on the documentation:
> http://www.postgresql.org/docs/8.3/static/continuous-archiving.html

> option one:

> [code]initdb -d /var/lib/postgresql/8.3/main[/code]

> second option:

> [code]initdb -d /var/lib/postgresql/8.3/main[/code]

Not based very closely on the documentation.  Please read the
"Recovering using a Continuous Archive Backup" subsection closely
and follow the directions there.  In particular, initdb is not run
as part of the recovery.  If you follow the directions step by step,
you should get farther.  If you still have problems, feel free to
post again.
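
For orientation, the staging part of that documented procedure can be sketched roughly as below. This is only an outline under assumptions: the function name stage_recovery and all four arguments are hypothetical, the tarball is assumed to be a file-system level base backup of the data directory (taken between pg_start_backup() and pg_stop_backup(), not an SQL dump), and the server is assumed to be stopped and the commands run as the postgres user.

```shell
# Stage a data directory for archive recovery (PostgreSQL 8.3 layout).
# All paths are hypothetical arguments; stop the server before calling this.
stage_recovery() {
    pgdata=$1      # e.g. /var/lib/postgresql/8.3/main
    base_tar=$2    # tar.gz of the data directory, taken as a base backup
    archive=$3     # directory holding the archived WAL segments
    target=$4      # recovery_target_time, ideally with an explicit offset

    # Move the current cluster aside rather than deleting it.
    if [ -d "$pgdata" ]; then
        mv "$pgdata" "$pgdata.old" || return 1
    fi

    # Unpack the base backup; no initdb is involved.
    mkdir -p "$pgdata" && chmod 700 "$pgdata" || return 1
    tar -xzf "$base_tar" -C "$pgdata" || return 1

    # WAL files inside the backup are stale; start with an empty pg_xlog.
    rm -rf "$pgdata/pg_xlog"
    mkdir -p "$pgdata/pg_xlog/archive_status" || return 1

    # Tell the startup process where to fetch WAL and where to stop.
    cat > "$pgdata/recovery.conf" <<EOF
restore_command = 'cp $archive/%f %p'
recovery_target_time = '$target'
recovery_target_inclusive = 'false'
EOF
}
```

After staging, starting the server begins archive recovery. Any unarchived WAL segments saved from the old pg_xlog would be copied back into the new pg_xlog before that start.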

-Kevin

Re: Point-In-Time Recovery not working

From
gais
Date:
I've tried to follow the directions as closely as possible, but still
the server won't start. I do get a different PANIC:

[code]2011-08-26 23:55:08 CEST LOG:  database system was shut down at
2011-08-26 22:08:35 CEST
2011-08-26 23:55:08 CEST LOG:  starting archive recovery
2011-08-26 23:55:08 CEST LOG:  restore_command = 'cp
/var/lib/postgresql/8.3/main/archive/%f %p'
cp: cannot stat `/var/lib/postgresql/8.3/main/archive/00000001.history':
No such file or directory
2011-08-26 23:55:08 CEST LOG:  restored log file
"0000000100000001000000EE" from archive
2011-08-26 23:55:08 CEST LOG:  unexpected pageaddr 1/EB01C000 in log
file 1, segment 238, offset 114688
2011-08-26 23:55:08 CEST LOG:  invalid primary checkpoint record
2011-08-26 23:55:08 CEST LOG:  restored log file
"0000000100000001000000EE" from archive
2011-08-26 23:55:08 CEST LOG:  invalid xl_info in secondary checkpoint
record
2011-08-26 23:55:08 CEST PANIC:  could not locate a valid checkpoint record
2011-08-26 23:55:08 CEST LOG:  startup process (PID 8975) was terminated
by signal 6: Aborted
2011-08-26 23:55:08 CEST LOG:  aborting startup due to startup process
failure
[/code]
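
For reading these messages, it helps to see how "log file 1, segment 238" relates to the file name 0000000100000001000000EE: the name is the timeline ID, the log file number and the segment number, each printed as eight hexadecimal digits (238 is 0xEE). A small sketch of that mapping, with a helper name invented for illustration:

```shell
# Build an 8.3-style WAL segment file name from timeline, log file and segment.
wal_segment_name() {
    printf '%08X%08X%08X\n' "$1" "$2" "$3"
}

wal_segment_name 1 1 238   # log file 1, segment 238 from the log above
wal_segment_name 1 0 0     # log file 0, segment 0 from the first attempt
```

The first call prints 0000000100000001000000EE and the second prints 000000010000000000000000, matching the names in the server logs.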

Somehow I think the database files I used to restore are inconsistent
with the WAL files. Is there a workaround, or am I still doing something
completely foolish?
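
One way to check whether a usable starting point exists at all: WAL can only be replayed on top of a file-system level base backup taken between pg_start_backup() and pg_stop_backup(), and pg_stop_backup() leaves a small backup history file ending in .backup in the WAL archive. A rough sketch that lists such files; the function name is hypothetical and the archive directory is whatever archive_command writes to:

```shell
# List backup history files (*.backup) in a WAL archive directory.
# Prints nothing and returns non-zero when none are present.
list_base_backups() {
    found=1
    for f in "$1"/*.backup; do
        [ -e "$f" ] || continue   # glob matched nothing
        echo "${f##*/}"
        found=0
    done
    return $found
}
```

For example, list_base_backups /BACKUP/davical/postgresql would show whether any base backup was ever recorded in that archive. If nothing is listed, the data files being restored were probably not taken as a base backup, which would fit the checkpoint errors above.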


> Not based very closely on the documentation.  Please read the
> "Recovering using a Continuous Archive Backup" subsection closely
> and follow the directions there.  In particular, initdb is not run
> as part of the recovery.  If you follow the directions step by step,
> you should get farther.  If you still have problems, feel free to
> post again.
>
> -Kevin


Re: Point-In-Time Recovery not working

From
gais
Date:
> I've tried to follow the directions as closely as possible, but still
> the server won't start. I do get a different PANIC:
>
> 2011-08-26 23:55:08 CEST LOG:  database system was shut down at
> 2011-08-26 22:08:35 CEST
> 2011-08-26 23:55:08 CEST LOG:  starting archive recovery
> 2011-08-26 23:55:08 CEST LOG:  restore_command = 'cp
> /var/lib/postgresql/8.3/main/archive/%f %p'
> cp: cannot stat
> `/var/lib/postgresql/8.3/main/archive/00000001.history': No such file
> or directory
> 2011-08-26 23:55:08 CEST LOG:  restored log file
> "0000000100000001000000EE" from archive
> 2011-08-26 23:55:08 CEST LOG:  unexpected pageaddr 1/EB01C000 in log
> file 1, segment 238, offset 114688
> 2011-08-26 23:55:08 CEST LOG:  invalid primary checkpoint record
> 2011-08-26 23:55:08 CEST LOG:  restored log file
> "0000000100000001000000EE" from archive
> 2011-08-26 23:55:08 CEST LOG:  invalid xl_info in secondary checkpoint
> record
> 2011-08-26 23:55:08 CEST PANIC:  could not locate a valid checkpoint
> record
> 2011-08-26 23:55:08 CEST LOG:  startup process (PID 8975) was
> terminated by signal 6: Aborted
> 2011-08-26 23:55:08 CEST LOG:  aborting startup due to startup process
> failure
>
> Somehow I think the database files I used to restore are inconsistent
> with the WAL's. Is there a workaround or am I still doing something
> completely foolish?


OK, I did some more testing, and it seems the pg_control file in my
global directory does not match the WAL files. Is it still possible to
recover the data? I came across some pages claiming it should be
possible to rebuild the control file from the WAL.
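
To see what the control file actually expects, the pg_controldata utility shipped with PostgreSQL prints its contents. The sketch below only filters that output down to the fields relevant here; the function name is made up, and on some installations pg_controldata is not on the PATH and has to be called with its full path.

```shell
# Keep only the fields of pg_controldata output that matter for recovery.
control_summary() {
    grep -E "^(Database system identifier|Database cluster state|Latest checkpoint location|Latest checkpoint's REDO location|Latest checkpoint's TimeLineID):"
}

# Usage, assuming the data directory from this thread:
#   pg_controldata /var/lib/postgresql/8.3/main | control_summary
```

The checkpoint location is printed as logfile/offset in hexadecimal, so it can be compared with the segment the server asks for at startup.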