Discussion: PITR replica dies on startup
Hi guys, I've inherited a PITR continuous-recovery master/standby server pair. The continuous recovery and loading of the xlogs seems to work fine, but when I tried to test bringing up the replica, it falls over with signal 6. Here's an excerpt from the log with the log level set to debug5:

DEBUG:  max_safe_fds = 984, usable_fds = 1000, already_open = 6
LOG:  database system was interrupted at 2007-08-31 15:09:40 PDT
LOG:  starting archive recovery
LOG:  restore_command = "/data/pg/data/recovery.py /data/pg/wal/%f "%p""
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/00000001.history "pg_xlog/RECOVERYHISTORY""
DEBUG:  could not restore file "00000001.history" from archive: return code 0
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/00000001000000000000004C.00000020.backup "pg_xlog/RECOVERYHISTORY""
LOG:  restored log file "00000001000000000000004C.00000020.backup" from archive
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/00000001000000000000004C "pg_xlog/RECOVERYXLOG""
LOG:  restored log file "00000001000000000000004C" from archive
LOG:  checkpoint record is at 0/4C000020
LOG:  redo record is at 0/4C000020; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 0/1222; next OID: 24576
LOG:  next MultiXactId: 1; next MultiXactOffset: 0
LOG:  automatic recovery in progress
LOG:  redo starts at 0/4C000070
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/00000001000000000000004D "pg_xlog/RECOVERYXLOG""
LOG:  restored log file "00000001000000000000004D" from archive
DEBUG:  recovery restart point at 0/4D000020
CONTEXT:  xlog redo checkpoint: redo 0/4D000020; undo 0/0; tli 1; xid 0/1232; oid 24576; multi 1; offset 0; online
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/00000001000000000000004E "pg_xlog/RECOVERYXLOG""
LOG:  restored log file "00000001000000000000004E" from archive
DEBUG:  recovery restart point at 0/4E000020
CONTEXT:  xlog redo checkpoint: redo 0/4E000020; undo 0/0; tli 1; xid 0/1242; oid 24576; multi 1; offset 0; online
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/00000001000000000000004F "pg_xlog/RECOVERYXLOG""
LOG:  restored log file "00000001000000000000004F" from archive
DEBUG:  recovery restart point at 0/4F000020
CONTEXT:  xlog redo checkpoint: redo 0/4F000020; undo 0/0; tli 1; xid 0/1252; oid 24576; multi 1; offset 0; online
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/000000010000000000000050 "pg_xlog/RECOVERYXLOG""
DEBUG:  could not restore file "000000010000000000000050" from archive: return code 256
LOG:  could not open file "pg_xlog/000000010000000000000050" (log file 0, segment 80): No such file or directory
LOG:  redo done at 0/4F000070
DEBUG:  executing restore command "/data/pg/data/recovery.py /data/pg/wal/00000001000000000000004F "pg_xlog/RECOVERYXLOG""
DEBUG:  could not restore file "00000001000000000000004F" from archive: return code 256
PANIC:  could not open file "pg_xlog/00000001000000000000004F" (log file 0, segment 79): No such file or directory
DEBUG:  reaping dead processes
LOG:  startup process (PID 31211) was terminated by signal 6
LOG:  aborting startup due to startup process failure
DEBUG:  proc_exit(1)
DEBUG:  shmem_exit(1)
DEBUG:  exit(1)

You can see the rest of the log file here: http://pgsql.privatepaste.com/841mOR5gzm

It seems like everything is happy, except that it asks for the 4F log file more than once.
Directory permissions look good:

total 45
-rw-------  1 postgres postgres     4 Aug 31 11:13 PG_VERSION
-rw-------  1 postgres postgres   167 Aug 31 15:09 backup_label.old
drwx------  5 postgres postgres   120 Aug 31 14:59 base
drwx------  2 postgres postgres   768 Aug 31 15:08 global
drwx------  2 postgres postgres    72 Aug 31 11:13 pg_clog
-rw-------  1 postgres postgres  3279 Aug 31 11:16 pg_hba.conf
-rw-------  1 postgres postgres  1460 Aug 31 11:14 pg_ident.conf
drwx------  4 postgres postgres    96 Aug 31 11:13 pg_multixact
drwx------  2 postgres postgres    72 Aug 31 11:13 pg_subtrans
drwx------  2 postgres postgres    48 Aug 31 11:13 pg_tblspc
drwx------  2 postgres postgres    48 Aug 31 11:13 pg_twophase
drwx------  3 postgres postgres   112 Aug 31 15:24 pg_xlog
-rw-r--r--  1 postgres postgres 15370 Aug 31 14:57 postgresql.conf
-rw-------  1 postgres postgres    35 Aug 31 15:09 postmaster.opts
-rw-r--r--  1 postgres postgres    67 Aug 31 13:40 recovery.conf
-rwxr-xr-x  1 postgres postgres  1384 Aug 31 13:40 recovery.py

I verified that the files are really being copied into pg_xlog/RECOVERYXLOG.

This is 8.2.4 on Gentoo, but I observed the same behavior with 8.2.3 on Gentoo. We've md5sum'd the postgres binary, diffed the output of pg_config, and everything is the same. I tried starting up postgres in a newly initdb'd directory and it starts up fine. I also tried using the conf files from the 'broken' directory in this new directory and it starts up fine, but if I try to replicate into this directory, postgres exits with signal 6 just the same. Conversely, using a default postgresql.conf file in the 'broken' directory leaves it just as broken. The master can start/stop just fine with no problems.

What else should I look at? I do have a core file; the back trace can be found here: http://pgsql.privatepaste.com/8f0WDAAzz9

Any suggestions much appreciated!

--
Jeff Frost, Owner 	<jeff@frostconsultingllc.com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954
Jeff Frost <jeff@frostconsultingllc.com> writes:
> It seems like everything is happy, except that it asks for the 4F
> log file more than once.

IIRC that's standard procedure.  Is there a reason your restore_command
can't support it?

			regards, tom lane
On Fri, 31 Aug 2007, Tom Lane wrote:

> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> It seems like everything is happy, except that it asks for the 4F
>> log file more than once.
>
> IIRC that's standard procedure.  Is there a reason your restore_command
> can't support it?

Really? If that's the case, then no: this particular restore command archives the WAL files off into an archive directory after they've been successfully copied, so the second time postgres asks for one, it's no longer available. Thanks for the idea, Tom; I'll update the script to look in the archive directory if the file is unavailable in the backup directory.

Why does it request it twice?

--
Jeff Frost, Owner 	<jeff@frostconsultingllc.com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954
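[Editor's note: the fallback Jeff describes (try the incoming WAL directory first, then the archive directory, and always copy rather than move) could be sketched roughly like this. This is a minimal illustration, not the actual recovery.py; the restore_file helper and the directory layout are made up for the example.]

```python
# Sketch of a restore_command helper that tolerates repeated requests
# for the same segment. All names here are hypothetical.
import os
import shutil
import tempfile

def restore_file(fname, dest, wal_dir, archive_dir):
    """Copy (never move) a WAL segment into place, falling back to the
    archive directory when the segment has already been archived away.
    Returns 0 on success, nonzero so the server knows the file is gone."""
    for d in (wal_dir, archive_dir):
        src = os.path.join(d, fname)
        if os.path.exists(src):
            # copy, not move: a second request for the same file still works
            shutil.copy(src, dest)
            return 0
    return 1

# Tiny self-test with temporary directories standing in for
# /data/pg/wal and its archive.
wal = tempfile.mkdtemp()
arc = tempfile.mkdtemp()
out = tempfile.mkdtemp()
seg = "00000001000000000000004F"
with open(os.path.join(arc, seg), "w") as f:
    f.write("dummy")

rc = restore_file(seg, os.path.join(out, "RECOVERYXLOG"), wal, arc)
print(rc)  # → 0: found in the archive fallback
```

Because the helper copies instead of moving, asking for 4F a second time (as the startup process does at the end of recovery) succeeds instead of PANICing.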
Jeff Frost <jeff@frostconsultingllc.com> writes:
> Why does it request it twice?

I think the reason is that the rollforward cycle is

	fetch next segment into RECOVERYXLOG
	process segment
	unlink RECOVERYXLOG

and only when the "fetch" step fails does it realize it's done.  So then
it has to figure out which segment it wants to start extending and fetch
that into the real name of the segment.

We could probably rejigger this to avoid a duplicate fetch in common
cases, but I'm worried that we couldn't guarantee it in every case ---
edge cases like the last valid xlog record ending right at a segment
boundary might be trouble.  If the recovery script has to handle the
situation sometimes, it's probably best to make it do so on a regular
basis --- would you like to have only found out about this when you were
trying to revive your database at 4AM?

			regards, tom lane
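[Editor's note: the fetch/process/unlink cycle Tom describes can be mimicked with a toy model to show why a restore script that *moves* files out of the archive fails on the final re-fetch. This is an illustrative simulation, not PostgreSQL's actual recovery code; a dict stands in for the archive directory.]

```python
# Toy model of the rollforward cycle: fetch segments in order until a
# fetch fails, then re-fetch the last applied segment under its real
# name (the step that PANICed in Jeff's log).
def replay(archive, move):
    """archive: dict mapping segment name -> contents.
    move=True models a restore script that moves files out of the
    archive; move=False models one that copies. Returns True if the
    final re-fetch of the last applied segment succeeds."""
    applied = None
    for seg in sorted(archive):
        if seg not in archive:
            break
        if move:
            del archive[seg]   # "mv" removes the segment from the archive
        applied = seg          # process segment, then unlink RECOVERYXLOG
    # Recovery hit the end of WAL; now fetch the last segment again.
    return applied in archive

segs = {"00000001000000000000004E": b"...", "00000001000000000000004F": b"..."}
print(replay(dict(segs), move=True))   # → False: the PANIC case
print(replay(dict(segs), move=False))  # → True: copy-based script is fine
```

The model makes the failure mode concrete: with a move-based script the last segment (4F here) is gone from the archive exactly when recovery needs it a second time.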
On Fri, 31 Aug 2007, Tom Lane wrote:

> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> Why does it request it twice?
>
> I think the reason is that the rollforward cycle is
>
> 	fetch next segment into RECOVERYXLOG
> 	process segment
> 	unlink RECOVERYXLOG
>
> and only when the "fetch" step fails does it realize it's done.  So then
> it has to figure out which segment it wants to start extending and fetch
> that into the real name of the segment.
>
> We could probably rejigger this to avoid a duplicate fetch in common
> cases, but I'm worried that we couldn't guarantee it in every case ---
> edge cases like the last valid xlog record ending right at a segment
> boundary might be trouble.  If the recovery script has to handle the
> situation sometimes, it's probably best to make it do so on a regular
> basis ---

That all seems reasonable enough. Is it in the docs somewhere? I didn't find anything like this mentioned. If not, could we get it added as a note?

> would you like to have only found out about this when you were
> trying to revive your database at 4AM?

Oh, definitely not! But then again, that's why I was testing it. :-) I just wish the guy who came before me and wrote the script had tested it. But it's no big deal; I'll just update the script to deal with it.

As always, thanks Tom! I'll report back if that's the fix.

--
Jeff Frost, Owner 	<jeff@frostconsultingllc.com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954
Jeff Frost <jeff@frostconsultingllc.com> writes:
> That all seems reasonable enough.  Is it in the docs somewhere?  I
> didn't find anything like this mentioned.  If not, could we get it
> added as a note?

Yeah, it hadn't occurred to anyone to specify this, because we just thought of restore_command as fetching from a static archive. We clearly need to document the expected semantics better.

I'm wondering whether we should discourage people from putting side effects into the restore_command, period. You already found out that recovery can ask for the same file more than once, but what if it never asks for a particular file at all? I'm not sure that can happen; I'm just playing devil's advocate.

Simon, did you see this thread? What do you think?

			regards, tom lane
On Fri, 31 Aug 2007, Jeff Frost wrote:

> On Fri, 31 Aug 2007, Tom Lane wrote:
>
>> Jeff Frost <jeff@frostconsultingllc.com> writes:
>>> Why does it request it twice?
>>
>> I think the reason is that the rollforward cycle is
>>
>> 	fetch next segment into RECOVERYXLOG
>> 	process segment
>> 	unlink RECOVERYXLOG
>>
>> and only when the "fetch" step fails does it realize it's done.  So then
>> it has to figure out which segment it wants to start extending and fetch
>> that into the real name of the segment.
>>
>> We could probably rejigger this to avoid a duplicate fetch in common
>> cases, but I'm worried that we couldn't guarantee it in every case ---
>> edge cases like the last valid xlog record ending right at a segment
>> boundary might be trouble.  If the recovery script has to handle the
>> situation sometimes, it's probably best to make it do so on a regular
>> basis ---
>
> That all seems reasonable enough. Is it in the docs somewhere? I didn't
> find anything like this mentioned. If not, could we get it added as a note?
>
>> would you like to have only found out about this when you were
>> trying to revive your database at 4AM?
>
> Oh, definitely not! But then again, that's why I was testing it. :-) I
> just wish the guy who came before me and wrote the script had tested it.
> But it's no big deal; I'll just update the script to deal with it.
>
> As always, thanks Tom! I'll report back if that's the fix.

Sure enough, commenting out the mv fixes the problem; now everything starts up just fine. So I'll have to re-engineer the script not to do that.

Thanks again, Tom!

--
Jeff Frost, Owner 	<jeff@frostconsultingllc.com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954
On Fri, Aug 31, 2007 at 09:56:50PM -0400, Tom Lane wrote:
> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> That all seems reasonable enough.  Is it in the docs somewhere?  I
>> didn't find anything like this mentioned.  If not, could we get it
>> added as a note?
>
> Yeah, it hadn't occurred to anyone to specify this, because we just
> thought of restore_command as fetching from a static archive.
> We clearly need to document the expected semantics better.
>
> I'm wondering whether we should discourage people from putting
> side effects into the restore_command, period.  You already found out
> that recovery can ask for the same file more than once, but what if it
> never asks for a particular file at all?  I'm not sure that can happen;
> I'm just playing devil's advocate.

I'd rather go the route of documenting the details of how (archive|restore)_command is used; one of the huge benefits of our system over others is the flexibility you have in being able to run whatever command you want.

I know Simon was working on some improvements to the PITR docs, but I don't know if that's been committed yet.

--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB  http://enterprisedb.com          512.569.9461 (cell)
On Fri, 2007-08-31 at 21:56 -0400, Tom Lane wrote:
> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> That all seems reasonable enough.  Is it in the docs somewhere?  I
>> didn't find anything like this mentioned.  If not, could we get it
>> added as a note?
>
> Yeah, it hadn't occurred to anyone to specify this, because we just
> thought of restore_command as fetching from a static archive.
> We clearly need to document the expected semantics better.
>
> I'm wondering whether we should discourage people from putting
> side effects into the restore_command, period.  You already found out
> that recovery can ask for the same file more than once, but what if it
> never asks for a particular file at all?  I'm not sure that can happen;
> I'm just playing devil's advocate.

Yes, side effects are bad, in general.

> Simon, did you see this thread?  What do you think?

The two main things we want are:

1. avoid doing a COPY from the archive, to improve performance
2. automatically clear down the archive when running in continuous
   recovery mode

At first thought, it seems easy to fix this by removing the double file request. That idea was in my original PITR patch, but it was removed, and sensibly so. The subtle point is that if you remove files from the archive too early, you may prevent the recovery from being restarted. So "mv" should never be used to avoid the copy operation, because it implements (1) but not (2).

I've submitted a patch that passes the restore_command a new %r value, which is the last file needed to restart the recovery; all files prior to that file can be removed from the archive. That is then used by pg_standby to maintain the archive. We discussed the patch as being for 8.3, but it is currently in the 8.4 queue. That patch solves this issue, so I'd ask that we review it now (pgsql-patches, 8 April). It includes initial docs on pg_standby, which does implement file management correctly.

There *is* a bug in PITR which I fixed recently (pgsql-patches, 8 June). That one really needs to be in 8.3 - more review work, sorry.

Whatever we do, there are more docs coming on this for 8.3: pg_standby, pg_compresslog, and a few other points. Give me a week or so, please.

--
 Simon Riggs
 2ndQuadrant  http://www.2ndQuadrant.com
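[Editor's note: the %r-based archive cleanup Simon describes could look roughly like this inside a restore script. This is a hedged sketch under the assumption that %r expands to the earliest segment still needed for a restart; the cleanup_archive helper is hypothetical. It relies on the fact that WAL segment filenames on a single timeline sort lexically in WAL order.]

```python
# Sketch: prune the archive of segments older than the %r restart file.
# Filenames and directory layout are illustrative only.
import os
import tempfile

def cleanup_archive(archive_dir, restart_file):
    """Remove WAL segments that sort before restart_file. Segment names
    on one timeline sort lexically in WAL order, so a plain string
    comparison is sufficient."""
    for fname in os.listdir(archive_dir):
        if fname < restart_file:
            os.unlink(os.path.join(archive_dir, fname))

# Demo with a temporary directory standing in for the archive.
arc = tempfile.mkdtemp()
for seg in ("000000010000000000000045",
            "000000010000000000000046",
            "000000010000000000000047"):
    open(os.path.join(arc, seg), "w").close()

cleanup_archive(arc, "000000010000000000000046")  # %r would supply this name
print(sorted(os.listdir(arc)))
# → ['000000010000000000000046', '000000010000000000000047']
```

Unlike the mv-based approach, this keeps every segment from the restart point forward, so recovery can still be restarted while older segments are cleared down automatically.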