Обсуждение: PITR and WAL archiving
I have been ramming my head against the wall for the past two
evenings, trying to get PITR working and it's gotten to the point
where I need to ask someone for help.
I'm basically trying to follow the instructions in the manual in
chapter 23.3.
Here's how I'm starting out:
===================
Contents of 'data':
PG_VERSION
base
global
pg_clog
pg_hba.conf
pg_ident.conf
pg_multixact
pg_subtrans
pg_tblspc
pg_twophase
pg_xlog
postgresql.conf
postmaster.opts
data/pg_xlog/archive_status is empty.
I have the following line in postgresql.conf:
archive_command = 'cp -i %p /mnt/server/archive/wal/%f < /dev/null'
/mnt/server/archive/wal is empty.
Here's my procedure:
===============
1. pg_ctl start -l logfile
2. SELECT pg_start_backup('label');
3. tar -zcf /mnt/server/archive/file/backup.tar.gz . (the contents of
'data')
4. SELECT pg_stop_backup();
5. pg_ctl stop
6. I now delete the contents of 'data' (making a copy before hand)
7. In 'data': tar xzvf /mnt/server/archive/file/backup.tar.gz
8. echo \"restore_command = 'cp /mnt/server/archive/wal/%f %p'\" >
recovery.conf
9. pg_ctl start -l logfile
And here are the logs:
================
LOG: database system was shut down at 2007-03-13 23:07:33 CET
LOG: checkpoint record is at 0/1E000020
LOG: redo record is at 0/1E000020; undo record is at 0/0; shutdown
TRUE
LOG: next transaction ID: 0/161203; next OID: 17156
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system is ready
LOG: received smart shutdown request
LOG: shutting down
LOG: database system is shut down
LOG: database system was shut down at 2007-03-13 23:09:21 CET
LOG: checkpoint record is at 0/1E000068
LOG: redo record is at 0/1E000068; undo record is at 0/0; shutdown
TRUE
LOG: next transaction ID: 0/161203; next OID: 17156
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system is ready
LOG: database system was interrupted at 2007-03-13 23:11:21 CET
LOG: starting archive recovery
LOG: restore_command = "cp /mnt/server/archive/wal/%f %p"
cp: /mnt/server/archive/wal/00000001.history: No such file or
directory
LOG: restored log file "00000001000000000000001E.000000B0.backup"
from archive
LOG: restored log file "00000001000000000000001E" from archive
LOG: checkpoint record is at 0/1E0000B0
LOG: redo record is at 0/1E0000B0; undo record is at 0/0; shutdown
FALSE
LOG: next transaction ID: 0/161205; next OID: 17156
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: automatic recovery in progress
LOG: redo starts at 0/1E0000F8
cp: /mnt/server/archive/wal/00000001000000000000001F: No such file or
directory
LOG: could not open file "pg_xlog/00000001000000000000001F" (log file
0, segment 31): No such file or directory
LOG: redo done at 0/1E0000F8
LOG: restored log file "00000001000000000000001E" from archive
LOG: archive recovery complete
LOG: database system is ready
LOG: received smart shutdown request
LOG: shutting down
LOG: database system is shut down
LOG: database system was shut down at 2007-03-13 23:17:32 CET
LOG: checkpoint record is at 0/1F000068
LOG: redo record is at 0/1F000068; undo record is at 0/0; shutdown
TRUE
LOG: next transaction ID: 0/161207; next OID: 17156
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system is ready
overwrite /mnt/server/archive/wal/00000001000000000000001E? (y/n [n])
not overwritten
LOG: archive command "cp -i pg_xlog/00000001000000000000001E /mnt/
server/archive/wal/00000001000000000000001E < /dev/null" failed:
return code 256
Where oh where might I be going wrong? I'd be grateful for any
suggestions.
Regards
Adam
Adam Groves wrote:
> I have been ramming my head against the wall for the past two
> evenings, trying to get PITR working and it's gotten to the point
> where I need to ask someone for help.
>
> I'm basically trying to follow the instructions in the manual in
> chapter 23.3.
Which postgres version are you using?
> 1. pg_ctl start -l logfile
> 2. SELECT pg_start_backup('label');
> 3. tar -zcf /mnt/server/archive/file/backup.tar.gz . (the contents of
> 'data')
> 4. SELECT pg_stop_backup();
> 5. pg_ctl stop
> 6. I now delete the contents of 'data' (making a copy before hand)
> 7. In 'data': tar xzvf /mnt/server/archive/file/backup.tar.gz
> 8. echo \"restore_command = 'cp /mnt/server/archive/wal/%f %p'\" >
> recovery.conf
> 9. pg_ctl start -l logfile
Are you doing this all in one row, without any other queries
(especially updates) hitting the database?
The reason that I ask is that postgres 8.1 will *only* archive a
wal segment (a wal file) if it's completly filled. Since every
WAL segment in 8.1 is *exactly* 16MB large, archive_command will
only get called after 16MB worth of wal traffic has happened.
In 8.1, pg_stop_backup() won't trigger archive_command either -
it can't, because postgres 8.1 can't close a wal segment early -
it *has* to write 16MB first. Therefore, the part of the wal
that is bring your filesystem backup into a consistent state
lives in the currently open wal after pg_stop_backup is called.
Now, you remove your datadir, and restore your filesystem backup.
Postgres realizes that it needs the wal segement that was current
when pg_start_backup() was called, and requests this from
restore_command. But since this segement was the current segment
up to the point where you removed your datadir, it got never archived :-(.
Postgres 8.2 improves the situation. It can now switch wal segments
early, and will do so if you call pg_stop_backup().
greetings, Florian Pflug