On 12.01.2013 04:32, Phil Monroe wrote:
> Hi Everyone,
>
> So we had to fail over and do a full base backup to get our slave database back
> online, and ran into an interesting scenario. After copying the data directory,
> setting up recovery.conf, and starting the slave database, the database
> crashed while replaying xlogs. However, when we started the database again, it
> was able to replay xlogs farther than it initially got, but ultimately
> failed again. After we started the DB a third time, PostgreSQL
> replayed even further and caught up to the master to start streaming
> replication. Is this common and/or acceptable?

How did you perform the base backup? Did you use pg_basebackup? Or, if
you did a filesystem-level copy, did you call pg_start_backup() and
pg_stop_backup() around it correctly? Did you take the base backup from
the master server, or from another slave?

This looks similar to the bug discussed here:
http://www.postgresql.org/message-id/CAMkU=1wpvYJVEDo6Qvq4QbosZ+AV6BMVCf+XVCG=mJqFRjQ8Pg@mail.gmail.com

That was fixed in 9.2.2, so if you're using 9.2.1 or 9.2.0, try upgrading.
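
As a rough illustration of the version check suggested above, here is a minimal
Python sketch. It assumes you paste in the string returned by SELECT version()
on the slave; the function names are hypothetical, and the only fact it encodes
is the one stated above (9.2.0 and 9.2.1 predate the fix in 9.2.2). It cannot
tell you whether this particular bug is what you are hitting.

```python
import re

# Assumption taken from the message above: the fix shipped in 9.2.2,
# so only 9.2.0 and 9.2.1 are flagged by this check.
FIXED_IN = (9, 2, 2)


def parse_pg_version(text):
    """Return (major, minor, patch) from text such as 'PostgreSQL 9.2.1 on ...'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no three-part version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def needs_upgrade(version_text):
    """True if the version is in the 9.2 series but older than 9.2.2."""
    version = parse_pg_version(version_text)
    # Other release series are outside the scope of this check.
    return version[:2] == FIXED_IN[:2] and version < FIXED_IN
```

For example, needs_upgrade("PostgreSQL 9.2.1 on x86_64-unknown-linux-gnu")
is true, while needs_upgrade("9.2.2") is false.
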
- Heikki