Re: Recovery inconsistencies, standby much larger than primary

From: Greg Stark
Subject: Re: Recovery inconsistencies, standby much larger than primary
Message-ID: CAM-w4HM0N1qd3CVusYXogYQxVV43WO4qAy+J0fjjCZ5SmutX0A@mail.gmail.com
In reply to: Re: Recovery inconsistencies, standby much larger than primary  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Recovery inconsistencies, standby much larger than primary  (Andres Freund <andres@2ndquadrant.com>)
Re: Recovery inconsistencies, standby much larger than primary  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
On Thu, Feb 6, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I had noticed that the WAL records that were mis-replayed seemed to
> be bunched pretty close together (two of them even adjacent).  Could
> you confirm that?  If so, it seems like we're looking for some condition
> that makes mis-replay fairly probable for a period of time, but in
> itself might be quite improbable.  Not that that helps much at
> nailing it down.

Well, one thing that argues for is hardware problems. It could be that
the right memory mapping just happened to line up with that variable
on the stack for that short time and then it was mapped to something
else entirely. Or the machine was overheating for a short time and
then the temperature became more reasonable. Or the person with the
x-ray source walked by in that short time window.

That doesn't explain the other instance or the other copies of this
database. I think the most productive thing I can do is switch my
attention to the other database to see if it really looks like the
same problem.

> You might well be on to something with the bgwriter idea, considering
> that none of the WAL replay code was originally written with any
> concurrent execution in mind.  We might've missed some place where
> additional locking is needed.

Except that the bgwriter has been in there for a few years already.
Unless there's been some other change, possibly involving copying some
code that was safe in some context but not where it was copied to.

The problem with the bgwriter being at fault is that, from what I can
see, the bgwriter will never extend a file. That means the xlog
recovery code must have done it. That means even if the bgwriter came
along and looked at the buffer we just read in, it would already be too
late to cause mischief. The xlog code extends the file *first*, then
reads the backup block into a buffer. I can't see how the bgwriter could
corrupt the stack or the WAL recovery buffer in the window between
reading in the WAL buffer and deciding to extend the relation.
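
To make the ordering argument concrete, here is a toy model in Python.
It is only a sketch and not PostgreSQL source: ToyRelation,
replay_backup_block and the event log are hypothetical names invented
for illustration, and the only thing borrowed from PostgreSQL is the
default 8192-byte block size. It shows that if replay extends the file
before restoring the page, then a bogus block number has already grown
the file by the time anything else could look at the buffer.

```python
BLCKSZ = 8192                 # PostgreSQL's default block size
ZERO_PAGE = bytes(BLCKSZ)     # an all-zeros page used to pad the file


class ToyRelation:
    """Hypothetical stand-in for a relation file: a list of pages."""

    def __init__(self, nblocks):
        self.pages = [ZERO_PAGE] * nblocks

    def size_bytes(self):
        return len(self.pages) * BLCKSZ


def replay_backup_block(rel, blkno, image, log):
    """Replay one full-page image (toy version, assumed ordering).

    Step 1 extends the file until block `blkno` exists.
    Step 2 restores the page image into that block.
    `log` records the order in which the two steps happened.
    """
    if len(image) != BLCKSZ:
        raise ValueError("page image must be exactly BLCKSZ bytes")

    # Step 1: extend first. In this model only replay ever does this.
    if blkno >= len(rel.pages):
        log.append("extend")
        while blkno >= len(rel.pages):
            rel.pages.append(ZERO_PAGE)

    # Step 2: only now is the page content put in place. Anything that
    # inspects the buffer after this point is too late to have caused
    # the extension above.
    rel.pages[blkno] = image
    log.append("restore")


if __name__ == "__main__":
    rel = ToyRelation(10)
    events = []
    page = b"\x01" * BLCKSZ
    replay_backup_block(rel, 3, page, events)    # in range: no extension
    replay_backup_block(rel, 999, page, events)  # bogus block number
    print(events, rel.size_bytes())
```

In this model a single record carrying a wrong block number is enough
to leave the file far larger than it should be, with zero pages in
between, which is the symptom under discussion. It says nothing about
where the wrong block number came from.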


-- 
greg


