The timeline of events (all dates MM/DD/YYYY):
   06/09/2009 1310 EDT - Hardware fault on primary database server
db01pri
   06/09/2009 1325 EDT - Failover to warm standby db01sec
   06/12/2009 1615 EDT - db01pri server fixed and OS booted
   06/15/2009 1115 EDT - started recovery of hotbackup from 06/15/2009
0205 EDT from db01sec onto db01pri
   06/15/2009 1320 EDT - Attempted to start postgres on db01pri in
warm standby mode
   06/15/2009 1325 EDT - Failed to apply WAL logs; error was
"unexpected timeline ID"
   06/15/2009 1340 EDT - Started a new hotbackup on db01sec
   06/15/2009 1545 EDT - Started recovery of hotbackup from 06/15/2009
1340 EDT from db01sec onto db01pri
   06/15/2009 1430 EDT - db01pri recovered and running in warm standby
Here are the contents of the pg_xlog directory and the 00000004.history
file:
[postgres@db01pri ~]$ cat 00000004.history
1   0000000100000736000000A1   before transaction 0 at 1999-12-31 19:00:00-05
[postgres@db01pri ~]$ ls -l
total 98468
-rw------- 1 postgres postgres      74 Jul 10 2008 00000002.history
-rw------- 1 postgres postgres      74 Jun 9 13:29 00000003.history
-rw------- 1 postgres postgres 16777216 Jun 16 08:45 0000000400000749000000C9
-rw------- 1 postgres postgres 16777216 Jun 16 08:46 0000000400000749000000CA
-rw------- 1 postgres postgres 16777216 Jun 16 08:47 0000000400000749000000CB
-rw------- 1 postgres postgres      74 Jun 9 13:33 00000004.history
drwxr-xr-x 2 postgres postgres   32768 Jun 16 08:46 archive_status
-rw------- 1 postgres postgres 16777216 Jun 9 13:45 xlogtemp.17243
-rw------- 1 postgres postgres 16777216 Jun 9 13:45 xlogtemp.17244
-rw------- 1 postgres postgres 16777216 Jun 9 13:52 xlogtemp.17397
[postgres@db01pri ~]$
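For anyone reading the listing above, the timeline ID can be read straight off the segment file names. Here is a minimal Python sketch, assuming the usual 24-hex-digit WAL segment naming (8 digits of timeline ID, 8 of log file number, 8 of segment number) and the whitespace-separated layout of the history line shown above. The helper names parse_wal_name and parse_history_line are hypothetical, not part of PostgreSQL:

```python
import re

# Assumed layout: TTTTTTTTLLLLLLLLSSSSSSSS (timeline, log file, segment), all hex.
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")


def parse_wal_name(name):
    """Split a WAL segment file name into (timeline, log, segment) integers."""
    m = WAL_NAME.match(name)
    if m is None:
        raise ValueError("not a WAL segment file name: %r" % name)
    timeline, log, seg = (int(group, 16) for group in m.groups())
    return timeline, log, seg


def parse_history_line(line):
    """Split one history-file line into (parent timeline, switch segment, reason)."""
    parent_tli, switch_segment, reason = line.split(None, 2)
    return int(parent_tli), switch_segment, reason.strip()


if __name__ == "__main__":
    for name in ("0000000400000749000000C9",
                 "0000000400000749000000CA",
                 "0000000400000749000000CB"):
        print(name, parse_wal_name(name))
    print(parse_history_line(
        "1   0000000100000736000000A1   before transaction 0 at 1999-12-31 19:00:00-05"))
```

Run against the three segment names in the listing, this reports timeline 4 for each, which matches the 00000004 prefix.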
Thanks again,
Keith
Tom Lane wrote:
> Keith Pierno <kpierno@lulu.com> writes:
>> The backup used was from well after the failover time which is why I
>> was concerned. Interestingly enough the logs are still all prefixed
>> with 00000004... That just makes this problem extremely bizarre.
>
> Hmm, that *is* weird. It seems like the new primary must have reverted
> its decision to go from timeline 4 to timeline 6. (Which in itself is
> a bit odd; why not timeline 5?)
>
> Can you give us an exact sequence of events on the slave server/new
> primary around the time of the failover? Also, what was in the .history
> file when you found it, and are there any other history files?
>
> regards, tom lane