Обсуждение: Restarting standby database have to reaply WAL (redo logs)
Hi all: I am testing warm standby databases. I start standby database ############################################## 2007-09-25 12:21:45 ARTLOG: database system was interrupted at 2007-09-25 12:16:22 ART 2007-09-25 12:21:45 ARTLOG: starting archive recovery 2007-09-25 12:21:45 ARTLOG: restore_command = "/usr/local/pgsql-8.2.4/bin/pg_standby -d -t /var/lib/postgres/procuracion/standby.trigger /var/lib/postgres/archive/procuracion %f %p" Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001.history WAL file path : 00000001.history Restoring to... : pg_xlog/RECOVERYHISTORY Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001.history" "pg_xlog/RECOVERYHISTORY" Num archived files kept : all files running restore :cp: no se puede efectuar `stat' sobre <</var/lib/postgres/archive/procuracion/00000001.history>>: No existe el fichero o el directorio cp: no se puede efectuar `stat' sobre <</var/lib/postgres/archive/procuracion/00000001.history>>: No existe el fichero o el directorio cp: no se puede efectuar `stat' sobre <</var/lib/postgres/archive/procuracion/00000001.history>>: No existe el fichero o el directorio not restored : history file not found Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001000000180000009D.00000068.backup WAL file path : 00000001000000180000009D.00000068.backup Restoring to... : pg_xlog/RECOVERYHISTORY Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001000000180000009D.00000068.backup" "pg_xlog/RECOVERYHISTORY" Num archived files kept : all files running restore : success 2007-09-25 12:22:00 ARTLOG: restored log file "00000001000000180000009D.00000068.backup" from archive Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001000000180000009D WAL file path : 00000001000000180000009D Restoring to... : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001000000180000009D" "pg_xlog/RECOVERYXLOG" Num archived files kept : all files running restore : success 2007-09-25 12:22:00 ARTLOG: restored log file "00000001000000180000009D" from archive 2007-09-25 12:22:00 ARTLOG: checkpoint record is at 18/9D000068 2007-09-25 12:22:00 ARTLOG: redo record is at 18/9D000068; undo record is at 0/0; shutdown FALSE 2007-09-25 12:22:00 ARTLOG: next transaction ID: 0/37198; next OID: 3638273 2007-09-25 12:22:00 ARTLOG: next MultiXactId: 1; next MultiXactOffset: 0 2007-09-25 12:22:00 ARTLOG: automatic recovery in progress 2007-09-25 12:22:01 ARTLOG: redo starts at 18/9D0000B0 Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001000000180000009E WAL file path : 00000001000000180000009E Restoring to... : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001000000180000009E" "pg_xlog/RECOVERYXLOG" Num archived files kept : all files WAL file not present yet. Checking for trigger file... WAL file not present yet. Checking for trigger file... WAL file not present yet. Checking for trigger file... WAL file not present yet. Checking for trigger file... WAL file not present yet. Checking for trigger file... . . . running restore : success 2007-09-25 12:28:51 ARTLOG: restored log file "0000000100000018000000A5" from archive Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/0000000100000018000000A6 WAL file path : 0000000100000018000000A6 Restoring to... : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/0000000100000018000000A6" "pg_xlog/RECOVERYXLOG" Num archived files kept : all files . ############################################## Then shutdown standby ############################################## WAL file not present yet. Checking for trigger file...2007-09-25 12:31:47 ARTLOG: received fast shutdown request 2007-09-25 12:31:47 ARTFATAL: could not restore file "0000000100000018000000A6" from archive: return code 15 2007-09-25 12:31:47 ARTLOG: startup process (PID 29654) exited with exit code 1 2007-09-25 12:31:47 ARTLOG: aborting startup due to startup process failure 2007-09-25 12:31:47 ARTLOG: logger shutting down ############################################## And start standby again ############################################## 2007-09-25 12:32:01 ARTLOG: database system was interrupted while in recovery at log time 2007-09-25 12:16:21 ART 2007-09-25 12:32:01 ARTHINT: If this has occurred more than once some data may be corrupted and you may need to choose an earlier recovery target. 2007-09-25 12:32:01 ARTLOG: starting archive recovery 2007-09-25 12:32:01 ARTLOG: restore_command = "/usr/local/pgsql-8.2.4/bin/pg_standby -d -t /var/lib/postgres/procuracion/standby.trigger /var/lib/postgres/archive/procuracion %f %p" Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001.history WAL file path : 00000001.history Restoring to... : pg_xlog/RECOVERYHISTORY Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001.history" "pg_xlog/RECOVERYHISTORY" Num archived files kept : all files running restore :cp: no se puede efectuar `stat' sobre <</var/lib/postgres/archive/procuracion/00000001.history>>: No existe el fichero o el directorio cp: no se puede efectuar `stat' sobre <</var/lib/postgres/archive/procuracion/00000001.history>>: No existe el fichero o el directorio cp: no se puede efectuar `stat' sobre <</var/lib/postgres/archive/procuracion/00000001.history>>: No existe el fichero o el directorio not restored : history file not found Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001000000180000009D WAL file path : 00000001000000180000009D Restoring to... : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001000000180000009D" "pg_xlog/RECOVERYXLOG" Num archived files kept : all files running restore : success 2007-09-25 12:32:17 ARTLOG: restored log file "00000001000000180000009D" from archive 2007-09-25 12:32:17 ARTLOG: checkpoint record is at 18/9D000068 2007-09-25 12:32:17 ARTLOG: redo record is at 18/9D000068; undo record is at 0/0; shutdown FALSE 2007-09-25 12:32:17 ARTLOG: next transaction ID: 0/37198; next OID: 3638273 2007-09-25 12:32:17 ARTLOG: next MultiXactId: 1; next MultiXactOffset: 0 2007-09-25 12:32:17 ARTLOG: automatic recovery in progress 2007-09-25 12:32:17 ARTLOG: redo starts at 18/9D0000B0 Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001000000180000009E WAL file path : 00000001000000180000009E Restoring to... : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001000000180000009E" "pg_xlog/RECOVERYXLOG" Num archived files kept : all files running restore : success 2007-09-25 12:32:17 ARTLOG: restored log file "00000001000000180000009E" from archive Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/00000001000000180000009F WAL file path : 00000001000000180000009F Restoring to... : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/00000001000000180000009F" "pg_xlog/RECOVERYXLOG" Num archived files kept : all files running restore : success 2007-09-25 12:32:17 ARTLOG: restored log file "00000001000000180000009F" from archive Trigger file : /var/lib/postgres/procuracion/standby.trigger Waiting for WAL file : /var/lib/postgres/archive/procuracion/0000000100000018000000A0 WAL file path : 0000000100000018000000A0 Restoring to... : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgres/archive/procuracion/0000000100000018000000A0" "pg_xlog/RECOVERYXLOG" Num archived files kept : all files running restore : success 2007-09-25 12:32:18 ARTLOG: restored log file "0000000100000018000000A0" from archive ############################################## Why did not start appliyng the last WAL?
On Tue, 2007-09-25 at 12:41 -0300, Cesar Otero wrote: > Then shutdown standby > > ############################################## > WAL file not present yet. Checking for trigger file...2007-09-25 > 12:31:47 ARTLOG: received fast shutdown request > 2007-09-25 12:31:47 ARTFATAL: could not restore file > "0000000100000018000000A6" from archive: return code 15 > 2007-09-25 12:31:47 ARTLOG: startup process (PID 29654) exited with exit code 1 > 2007-09-25 12:31:47 ARTLOG: aborting startup due to startup process failure > 2007-09-25 12:31:47 ARTLOG: logger shutting down The shutdown takes down all processes, so the current restore fails in mid-flight. That's how it works, currently. > And start standby again > Trigger file : /var/lib/postgres/procuracion/standby.trigger > Waiting for WAL file : > /var/lib/postgres/archive/procuracion/00000001000000180000009D > WAL file path : 00000001000000180000009D > Restoring to... : pg_xlog/RECOVERYXLOG > Sleep interval : 5 seconds > Max wait interval : 0 forever > Command for restore : cp > "/var/lib/postgres/archive/procuracion/00000001000000180000009D" > "pg_xlog/RECOVERYXLOG" Then we restart recovery from the last restartpoint. Is that what you meant? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
I mean before shutdown the last log applied was 0000000100000018000000A5 and after restart pg try to apply from wal 00000001000000180000009D. The 00000001000000180000009D file is the first wal created in the hot backup process. I think that after restarting standby db the first log should be last log applied + 1 On 9/25/07, Simon Riggs <simon@2ndquadrant.com> wrote: > On Tue, 2007-09-25 at 12:41 -0300, Cesar Otero wrote: > > > Then shutdown standby > > > > ############################################## > > WAL file not present yet. Checking for trigger file...2007-09-25 > > 12:31:47 ARTLOG: received fast shutdown request > > 2007-09-25 12:31:47 ARTFATAL: could not restore file > > "0000000100000018000000A6" from archive: return code 15 > > 2007-09-25 12:31:47 ARTLOG: startup process (PID 29654) exited with exit code 1 > > 2007-09-25 12:31:47 ARTLOG: aborting startup due to startup process failure > > 2007-09-25 12:31:47 ARTLOG: logger shutting down > > The shutdown takes down all processes, so the current restore fails in > mid-flight. That's how it works, currently. > > > And start standby again > > > Trigger file : /var/lib/postgres/procuracion/standby.trigger > > Waiting for WAL file : > > /var/lib/postgres/archive/procuracion/00000001000000180000009D > > WAL file path : 00000001000000180000009D > > Restoring to... : pg_xlog/RECOVERYXLOG > > Sleep interval : 5 seconds > > Max wait interval : 0 forever > > Command for restore : cp > > "/var/lib/postgres/archive/procuracion/00000001000000180000009D" > > "pg_xlog/RECOVERYXLOG" > > Then we restart recovery from the last restartpoint. > > Is that what you meant? > > -- > Simon Riggs > 2ndQuadrant http://www.2ndQuadrant.com > > -- Ing. César Otero Souto TE: 54 11 15-6860-2366
On Tue, 2007-09-25 at 14:14 -0300, Cesar Otero wrote: > I think that after restarting standby db the first log should be last > log applied + 1 That isn't how it works, for performance reasons. We don't make a restartpoint after each file, we do this on the first usable checkpoint WAL record that we find after checkpoint_timeout seconds. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com