Discussion: pg_standby error - can't find 00000001.history
Hi,

I'm setting up a replacement standby server. I've had this working before until the old standby server lost a drive array and a motherboard. So, my process was working. On the other hand, the old slave was debian etch, with a backported 8.2 release, and the new one is ubuntu 8.04.

I've been able to bring the database up from pitr replication using a simple cp restore command. So I'm not concerned that I can't get the DB up, or that the format of the data is wrong. What's not working is doing this in a warm standby manner with pg_standby.

postgres@grape:/home/erics$ pg_ctlcluster 8.2 main start
The PostgreSQL server failed to start. Please check the log output:
2009-03-12 13:40:18 PDT LOG: could not load root certificate file "root.crt": no SSL error reported
2009-03-12 13:40:18 PDT DETAIL: Will not verify client certificates.
2009-03-12 13:40:18 PDT LOG: database system was interrupted at 2009-03-11 16:31:37 PDT
2009-03-12 13:40:18 PDT LOG: starting archive recovery
2009-03-12 13:40:18 PDT LOG: restore_command = "/usr/lib/postgresql/8.2/bin/pg_standby -l -d -k 100 -r 2 -s 2 -w 0 -t /tmp/pgsql.trigger.5432 /data/pg/repl-db3 %f %p 2>> standby.log"
2009-03-12 13:40:18 PDT FATAL: could not restore file "00000001.history" from archive: return code 32512
2009-03-12 13:40:18 PDT LOG: startup process (PID 6223) exited with exit code 1
2009-03-12 13:40:18 PDT LOG: aborting startup due to startup process failure

postgres@grape:/home/erics$ tail /data/pg/main/standby.log
sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found
sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found
sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found
sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found

postgres@grape:/home/erics$ ls -l /usr/lib/postgresql/8.2/bin/pg_standby
-rwxr-xr-x 1 root root 20028 2009-03-12 11:44 /usr/lib/postgresql/8.2/bin/pg_standby

postgres@grape:/home/erics$ ls -l /data/pg/repl-db3
total 2723068
-rw------- 1 postgres postgres 16777216 2009-03-09 21:22 000000010000004E000000EC
-rw------- 1 postgres postgres 16777216 2009-03-09 22:22 000000010000004E000000ED
-rw------- 1 postgres postgres 16777216 2009-03-09 23:22 000000010000004E000000EE
...

1) What is 00000001.history? It doesn't look like a WAL file.

2) What file is not found? It sorta looks like the pg_standby binary, but I'm not sure that I believe that.

3) I know that it's going to ask for files that aren't found. Why is it failing this time?

4) The process I'm following did work on a pair of debian-etch machines. We managed to fail over and reset the spare at least 25 times. I'm concerned that I don't understand why it's failing.

Any ideas?

thanks.
eric

Eric Soroos <eric-psql@soroos.net> writes:
> 2009-03-12 13:40:18 PDT LOG: starting archive recovery
> 2009-03-12 13:40:18 PDT LOG: restore_command = "/usr/lib/postgresql/8.2/bin/pg_standby -l -d -k 100 -r 2 -s 2 -w 0 -t /tmp/pgsql.trigger.5432 /data/pg/repl-db3 %f %p 2>> standby.log"
> 2009-03-12 13:40:18 PDT FATAL: could not restore file "00000001.history" from archive: return code 32512
> 2009-03-12 13:40:18 PDT LOG: startup process (PID 6223) exited with exit code 1

Hmm ... 32512 is 0x7F00, which I think means exit(127), which is generally what the shell returns when it can't find the program it's supposed to execute.

> sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found
> sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found
> sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found
> sh: /usr/lib/postgresql/8.2/bin/pg_standby: not found

This seems to square with the above conclusion.

> 2) What file is not found? It sorta looks like the pg_standby
> binary, but I'm not sure that I believe that.

Permissions problems on some containing directory, perhaps?

regards, tom lane
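
The arithmetic in the reply above (32512 = 0x7F00, meaning exit code 127) can be checked with a short sketch. It assumes the conventional Unix wait-status layout, with the exit code in bits 8-15 and any terminating signal in the low 7 bits; the function name decode_wait_status is made up for illustration and is not part of PostgreSQL.

```python
def decode_wait_status(status):
    """Split a raw wait status into (exit_code, signal_number).

    Assumption: conventional Unix encoding, where the exit code sits in
    bits 8-15 and the terminating signal (if any) in the low 7 bits.
    """
    exit_code = (status >> 8) & 0xFF
    signal_number = status & 0x7F
    return exit_code, signal_number


if __name__ == "__main__":
    # The "return code 32512" from the FATAL line in the log.
    code, sig = decode_wait_status(32512)
    print(hex(32512), code, sig)  # 0x7f00 127 0
```

An exit code of 127 with no signal is what a shell conventionally returns when it cannot run the command it was given, which matches the "not found" lines in standby.log.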

>> 2) What file is not found? It sorta looks like the pg_standby
>> binary, but I'm not sure that I believe that.
>
> Permissions problems on some containing directory, perhaps?

It's the same directory as the postgresql binaries, and they all have sane permissions (root:root, 755). ls finds it, and more finds it but complains because it's a binary. (All run as the postgres user.)

The only thing that I can think of is that somehow the shell is considering it a corrupted shebang, and can't find the corrupted path to execute.

I've pulled in the packaged 8.3 binaries for this distro, and it appears that the pg_standby in that package is working properly.

thanks,
eric
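
One way to test the shebang theory is to look at the first bytes of the file. A shell can print "not found" for a file that exists when the interpreter the file names is missing. That covers a script with a bad "#!" path, and it can also cover an ELF binary whose loader is absent, for instance a binary built for a different architecture. The thread does not confirm either as the cause here, so the following is only a diagnostic sketch; describe_executable is a hypothetical helper name.

```python
import sys


def describe_executable(path):
    """Classify a file by its leading bytes: shebang script, ELF, or other."""
    with open(path, "rb") as f:
        head = f.read(128)
    if head.startswith(b"#!"):
        # Everything after "#!" on the first line is the interpreter command.
        first_line = head.split(b"\n", 1)[0]
        interp = first_line[2:].strip().decode("utf-8", "replace")
        return "script, interpreter: " + interp
    if head.startswith(b"\x7fELF"):
        # Byte 4 of the ELF header (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
        classes = {b"\x01": "32-bit", b"\x02": "64-bit"}
        return "ELF binary, " + classes.get(head[4:5], "unknown class")
    return "unrecognized format"


if __name__ == "__main__":
    # Usage: python check_exec.py /usr/lib/postgresql/8.2/bin/pg_standby
    for p in sys.argv[1:]:
        print(p + ": " + describe_executable(p))
```

If the result is a 32-bit ELF binary on a 64-bit install (or the reverse), a missing loader would be a plausible explanation for the "not found" message.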