does a file named 0000000100000000000000CD exist anywhere on your disk?
-lee
On Fri, Feb 27, 2009 at 2:47 AM, Brad Wiemerslage <wiemersl@yahoo.com> wrote:
>
> I'm attempting to get warm standby up and running with a pair of servers running ubuntu 8.04 and postgresql 8.3.
Beenfollowing the docs:
>
> http://www.postgresql.org/docs/8.3/static/warm-standby.html
> http://www.postgresql.org/docs/current/static/pgstandby.html
>
> Also, basically following the ideas here in this blog post:
>
> http://scale-out-blog.blogspot.com/2009/02/simple-ha-with-postgresql-point-in-time.html
>
> I've customized the original script he refers to in the article, which is here in its entirety for reference:
>
> https://s3.amazonaws.com/extras.continuent.com/standby.sh
>
> Here is the meat of my customized script, which runs on the standby. The postgresql server on the standby is stopped
first.
>
> start_backup="SELECT pg_start_backup('my_backup');"
> stop_backup="SELECT pg_stop_backup();"
> echo "$start_backup" | $psql -h$PRIMARY -U myuser -d mydb -e
> rsync --delete -avz -e "ssh -i /path/to/key" myuser@$PRIMARY:$PG_DATA/ $PG_DATA
> echo "$stop_backup" | $psql -h $PRIMARY -U myuser -d mydb -e
>
> The files seem to copied over to the standby machine just fine. Success is reported with respect to the backup
commands. Permissions seem fine.
>
> Next, there are some steps which blow out some files. As I understand it, you no longer need the files on the
standbythat were in pg_xlog on the primary.
>
> rm -f $PG_DATA/recovery.*
> rm -f $PG_DATA/8.3/main/logfile
> rm -f $PG_DATA/8.3/main/postmaster.pid
> rm -f $PG_DATA/8.3/main/pg_xlog/0*
> rm -f $PG_DATA/8.3/main/pg_xlog/archive_status/0*
>
> This step seems to work fine.
>
> Then, the archives are pulled. They are pulled to /mnt/postgresql_archives with this command:
>
> rsync --delete -avz -e "ssh -i /path/to/key" myuser@$PRIMARY:$PG_ARCHIVES/ $PG_ARCHIVES
>
> Everything looks good. I end up with an up to date list of WAL files in /mnt/postgresql_archives on the standby.
Hereis a listing:
>
> root@standby:/mnt/postgresql_archives# ls
> total 688996
> drwxr-xr-x 2 postgres postgres 4096 2009-02-27 01:13 .
> drwxr-xr-x 14 root root 4096 2009-02-27 01:17 ..
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 00:19 0000000100000000000000CB
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 00:29 0000000100000000000000CC
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 00:38 0000000100000000000000CD
> -rw-rw---- 1 postgres postgres 245 2009-02-27 00:38 0000000100000000000000CD.00000020.backup
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 00:48 0000000100000000000000CE
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 00:54 0000000100000000000000CF
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 00:58 0000000100000000000000D0
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 01:01 0000000100000000000000D1
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 01:03 0000000100000000000000D2
> -rw-rw---- 1 postgres postgres 16777216 2009-02-27 01:13 0000000100000000000000D3
>
> Then, the recovery.conf is put in place. I've tried two different versions, which end up giving me the same error.
Hereare the two different versions.
>
> #1: restore_command = '/usr/lib/postgresql/8.3/bin/pg_standby -c -d -s 2 -t /mnt/postgresql_archives/pgsql.trigger
/mnt/postgresql_archives%f %p >> /mnt/postgresql_archives/standby.log 1>&2'
>
> #2: restore_command = 'cp /mnt/server/archivedir/%f "%p"'
>
> I don't believe that #2 is suitable for warm standby, but just tried it to debug after #1 wouldn't work. Now, I try
tostart up the server. For it to work in standby mode, additional archive files will be pulled from the primary
machineon a periodic basis. I'm using this command, which deletes them on the primary when they are no longer
necessary. It also seems to work fine.
>
> rsync -avz -e "ssh -i /path/to/key" myuser@$PRIMARY:$PG_ARCHIVES/ $PG_ARCHIVES
>
> I guess I'm a little confused about exactly what is happening here when the server comes up, but here is the error
messageI'm getting. It seems to be looking for the files in pg_pxlog, which is cleared out. So, the error makes
sense. But isn't it supposed to be looking in /mnt/postgresql_archives per the restore_command(s)? The files are
availablethere.
>
> 2009-02-27 01:26:52.867 EST,,,7422,,49a787ac.1cfe,2,,2009-02-27 01:26:52 EST,,0,LOG,58P01,"could not open file
""pg_xlog/0000000100000000000000CD""(log file 0, segment 20
> 5): No such file or directory",,,,,,,,
> 2009-02-27 01:26:52.867 EST,,,7422,,49a787ac.1cfe,3,,2009-02-27 01:26:52 EST,,0,LOG,00000,"invalid checkpoint
record",,,,,,,,
> 2009-02-27 01:26:52.867 EST,,,7422,,49a787ac.1cfe,4,,2009-02-27 01:26:52 EST,,0,PANIC,XX000,"could not locate
requiredcheckpoint record",,"If you are not restoring from a
> backup, try removing the file ""/var/lib/postgresql/8.3/main/backup_label"".",,,,,,
> 2009-02-27 01:26:52.868 EST,,,7419,,49a787ab.1cfb,1,,2009-02-27 01:26:51 EST,,0,LOG,00000,"startup process (PID 7422)
wasterminated by signal 6: Aborted",,,,,,,,
> 2009-02-27 01:26:52.868 EST,,,7419,,49a787ab.1cfb,2,,2009-02-27 01:26:51 EST,,0,LOG,00000,"aborting startup due to
startupprocess failure",,,,,,,,
>
> So, I tried copying the files in the /mnt/postgresql_archives over to pg_xlog. This seemed to work, and the updates
wereapplied. Never at any point did I get a recovery.done file. Also, for whatever reason, I was never able to get
anydebug info from pg_standby in standby.log.
>
> Anyhow, I've burned up a couple days trying to figure this out. Any help would be much appreciated.
>
> Thanks,
> Brad
>
>
>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>