Обсуждение: WAL format changes break the suppression of do-nothing checkpoints.
commit 2c03216d831160bedd72d45f7 has invalidated the part of the docs saying "If no WAL has been written since the previous checkpoint, new checkpoints will be skipped even if checkpoint_timeout has passed", presumably by accident.
It seems that this part is no longer true when it should be true:
if (curInsert == ControlFile->checkPoint +
MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint))
MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint) is now 96, but the amount by which curInsert gets advanced is still 104, like it was before the commit.
Cheers,
Jeff
On 03/30/2015 09:01 PM, Jeff Janes wrote: > commit 2c03216d831160bedd72d45f7 has invalidated the part of the docs > saying "If no WAL has been written since the previous checkpoint, new > checkpoints will be skipped even if checkpoint_timeout has passed", > presumably by accident. > > It seems that this part is no longer true when it should be true: > > if (curInsert == ControlFile->checkPoint + > MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) > > MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint) is now 96, but the amount by > which curInsert gets advanced is still 104, like it was before the commit. Hmm. Wasn't this a bit broken before too, when the checkpoint record crosses a page boundary? Instead of trying to calculate where the checkpoint record ends, I think we could check that the prev-pointer points to the last checkpoint record. - Heikki
On 03/31/2015 09:09 PM, Jeff Janes wrote: > On Tue, Mar 31, 2015 at 6:19 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > >> On 03/30/2015 09:01 PM, Jeff Janes wrote: >> >>> commit 2c03216d831160bedd72d45f7 has invalidated the part of the docs >>> saying "If no WAL has been written since the previous checkpoint, new >>> checkpoints will be skipped even if checkpoint_timeout has passed", >>> presumably by accident. >>> >>> It seems that this part is no longer true when it should be true: >>> >>> if (curInsert == ControlFile->checkPoint + >>> MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) >>> >>> MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint) is now 96, but the amount >>> by >>> which curInsert gets advanced is still 104, like it was before the commit. >> >> Hmm. Wasn't this a bit broken before too, when the checkpoint record >> crosses a page boundary? > > But once the next one it was written, it wouldn't across a boundary > anymore. So you would get one extra checkpoint, rather than an infinite > stream of them. So a bit broken, but just a bit. Yeah. It's not serious. It occurs to me that that's actually nice from a robustness point of view: if something goes badly wrong and corrupt, you're more likely to be able to recover the last checkpoint record if it doesn't cross a page boundary. And more importantly, a segment boundary. OTOH, when you create a spurious checkpoint, you lose the "previous" checkpoint location. A good compromise is probably to not skip the checkpoint if the last one crossed a segment boundary. (Committed that way) >> Instead of trying to calculate where the checkpoint record ends, I think >> we could check that the prev-pointer points to the last checkpoint record. > > Wouldn't doing that involve reading WAL while not in recovery? No. We keep track of the beginning position of the last inserted record at all times, because when the next record is inserted, we have to put that in the xl_prev field. Committed a patch to do that. - Heikki