Re: Recovery inconsistencies, standby much larger than primary

From Greg Stark
Subject Re: Recovery inconsistencies, standby much larger than primary
Date
Msg-id CAM-w4HOE7ZwMTWttt5rYeQxhURdR5sSm1JLQOv0S4qC8P6-7MQ@mail.gmail.com
In reply to Re: Recovery inconsistencies, standby much larger than primary  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Recovery inconsistencies, standby much larger than primary  (Andres Freund <andres@2ndquadrant.com>)
Re: Recovery inconsistencies, standby much larger than primary  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Sun, Jan 26, 2014 at 5:45 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>
>> We're also seeing log entries about "wal contains reference to invalid
>> pages" but these errors seem only vaguely correlated. Sometimes we get
>> the errors but the tables don't grow noticeably and sometimes we don't
>> get the errors and the tables are much larger.
>
> Uhm. I am a bit confused. You see those in the standby's log? At !debug
> log levels? That'd imply that the standby is dead and needed to be
> recloned, no? How do you continue after that?


So in chatting with Heikki last night we came up with a scenario where
this check is insufficient.

If you have multiple checkpoints during the base backup then there
will be restartpoints during recovery. If the reference to the invalid
page is before the restartpoint then, after recovery crashes and comes
back up, it restarts from the restartpoint, never replays that record
again, and goes forward fine.
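
A toy model of that scenario, not PostgreSQL code. Everything here (the
`replay` function, the record tuples, the page numbers) is made up for
illustration, and it assumes only that the invalid-page table lives in
memory and starts out empty whenever replay restarts.

```python
# Toy model of WAL replay. All names are hypothetical, not PostgreSQL APIs.

def replay(wal, start, missing_pages):
    """Replay records from index `start` and return the invalid-page
    references that are still unresolved at the end of replay."""
    invalid_pages = set()  # in-memory only, so it is lost on a restart
    for kind, page in wal[start:]:
        if kind == "update" and page in missing_pages:
            invalid_pages.add(page)      # record touches a page that does not exist
        elif kind == "drop":
            invalid_pages.discard(page)  # a later drop/truncate explains the reference
    return invalid_pages

wal = [("update", 7), ("restartpoint", None), ("update", 8)]
missing = {7}

full_run = replay(wal, 0, missing)     # uninterrupted recovery
after_crash = replay(wal, 2, missing)  # restarted from just after the restartpoint
```

In the uninterrupted run the end-of-recovery check still has page 7 to
complain about. In the restarted run the table is empty, so the same
check has nothing to report.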

Fixing this check doesn't look trivial. I'm inclined to say to
suppress any restartpoints while there are references to invalid pages
in the hash. The problem with this is that it will prevent trimming
the xlog during recovery. It seems frightening that most days recovery
will take little extra space but if you happen to have a drop table or
truncate during the base backup then your recovery might require a lot
of extra space.
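
A rough sketch of the suppression idea and its cost, again as a toy
model rather than PostgreSQL code. The function name, the record tuples,
and the notion that "WAL before the last restartpoint may be recycled"
are simplifying assumptions for illustration.

```python
# Toy model: skip restartpoints while invalid-page references are outstanding.
# All names are hypothetical, not PostgreSQL APIs.

def replay_with_suppression(wal, missing_pages):
    """Return (index of the last restartpoint taken, unresolved invalid pages).
    WAL before the returned index is what could be recycled."""
    invalid_pages = set()
    last_restartpoint = 0
    for i, (kind, page) in enumerate(wal):
        if kind == "update" and page in missing_pages:
            invalid_pages.add(page)
        elif kind == "drop":
            invalid_pages.discard(page)
        elif kind == "restartpoint" and not invalid_pages:
            last_restartpoint = i  # only taken when the table is empty
    return last_restartpoint, invalid_pages

# The reference is later explained by a drop: restartpoints resume afterwards.
resolved = replay_with_suppression(
    [("update", 7), ("restartpoint", None), ("update", 8),
     ("restartpoint", None), ("drop", 7), ("restartpoint", None)],
    {7},
)

# The reference is never explained: no restartpoint is ever taken.
unresolved = replay_with_suppression(
    [("update", 7), ("restartpoint", None), ("update", 8),
     ("restartpoint", None)],
    {7},
)
```

In the second case every record since the start has to be kept, which is
the extra-space cost in question, but a restart would replay the bad
record again instead of silently skipping it.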

The alternative of spilling the hash table to disk at every
restartpoint seems kind of hokey. Then we need to worry about fsyncing
this file, cleaning it up, dealing with the file after crashes, etc.

-- 
greg


