Re: BUG #4879: bgwriter fails to fsync the file in recovery mode

From: Heikki Linnakangas
Subject: Re: BUG #4879: bgwriter fails to fsync the file in recovery mode
Msg-id: 4A43DAF6.90203@enterprisedb.com
In response to: Re: BUG #4879: bgwriter fails to fsync the file in recovery mode  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #4879: bgwriter fails to fsync the file in recovery mode  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Tom Lane wrote:
>>> ... I think it might be better to fix
>>> things so that InRecovery is maintained correctly in the bgwriter too.
>
>> We could set InRecovery=true in CreateCheckPoint if it's a startup
>> checkpoint, and reset it afterwards. I'm not 100% sure it's safe to have
>> bgwriter running with InRecovery=true at other times. Grepping for
>> InRecovery doesn't show anything that bgwriter calls, but it feels safer
>> that way.
>
> Actually, my thought was exactly that it would be better if it was set
> correctly earlier in the run --- if there ever are any places where it
> matters, this way is more likely to be right.

Well, we have RecoveryInProgress() now that answers the question "is
recovery still in progress in the system". InRecovery now means "am I a
process that's performing WAL replay?".

>  (I'm not convinced that
> it doesn't matter today, anyhow --- are we sure these places are not
> called in a restartpoint?)

Hmm, good point, I didn't think of restartpoints. But skimming through
all the references to InRecovery, I can't see any that would be called
in a restartpoint.

>> Hmm, I see another small issue. We now keep track of the "minimum
>> recovery point". Whenever a data page is flushed, we set minimum
>> recovery point to the LSN of the page in XLogFlush(), instead of
>> fsyncing WAL like we do in normal operation. During the end-of-recovery
>> checkpoint, however, RecoveryInProgress() returns false, so we don't
>> update minimum recovery point in XLogFlush(). You're unlikely to be
>> bitten by that in practice; you would need to crash during the
>> end-of-recovery checkpoint, and then set the recovery target to an
>> earlier point. It should be fixed nevertheless.
>
> We would want the end-of-recovery checkpoint to act like it's not in
> recovery anymore for this purpose, no?

For the purpose of updating min recovery point, we want it to act like
it *is* still in recovery. But in the XLogFlush() call in
CreateCheckPoint(), we really want it to flush the WAL, not update min
recovery point.

A simple fix is to call UpdateMinRecoveryPoint() after the WAL replay is
finished, but before creating the checkpoint. exitArchiveRecovery()
seems like a good place.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
