Re: Failing start-up archive recovery at Standby mode in PG9.2.4

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: Failing start-up archive recovery at Standby mode in PG9.2.4
Дата
Msg-id 1366870433934-5753221.post@n5.nabble.com
обсуждение исходный текст
Ответ на Re: Failing start-up archive recovery at Standby mode in PG9.2.4  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: Failing start-up archive recovery at Standby mode in PG9.2.4  (Kyotaro HORIGUCHI <kyota.horiguchi@gmail.com>)
Список pgsql-hackers
I also had a similar observation when I could reproduce this.
I tried to find why restartpoint causes the recycled segment to be named
after timeline 3, but have not been able to determine that.

When I looked at the source, I found that, the function XLogFileReadAnyTLI
which returns a segment file for reading a XLog page iterates over a list
expectedTLIs which starts with 3 in such a case (that is, in case where this
error happens).  XLogFileReadAnyTLI checks the segment in both archive and
pg_xlog. So, even if such a segment (that is with timeline 3) may not be in
the archive , it may be in pg_xlog, due to recycling as we have observed.
So, such a recycled segment may be returned by XLogFileReadAnyTLI as though
it were the next segment to recover from, resulting in the "unexpected
pageaddr ..." error. 

I could not understand (in case this error happens) how expectedTLIs list
had 3 at its head (for example, when XLogFileReadAnyTLI used it as we
observed) whereas at other times (when this error does not happen), it has 2
at its head until all the segments of timeline 2 are recovered from?
Also, how does recycled segment gets timeline 3 whereas 2 is expected in
this case?

Is this the right way to look at the problem and its possible fix?

I have tried going through the source regarding this but have not been able
to determine where this could accidentally happen, partly because I do not
understand recovery process (and its code) very well. Will post if find
anything useful. 

regards,
Amit Langote





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Failing-start-up-archive-recovery-at-Standby-mode-in-PG9-2-4-tp5753110p5753221.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Please add discussion topics for cluster-hackers meeting
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Redundancy in comment within lock.c