On 2015-11-04 16:01:28 +0900, Michael Paquier wrote:
> On Wed, Nov 4, 2015 at 8:39 AM, Andres Freund <andres@anarazel.de> wrote:
> > On November 4, 2015 12:37:02 AM GMT+01:00, Michael Paquier wrote:
> >>On a completely idle system, I don't think we should log any standby
> >>records. This is what ~9.3 does.
> >
> > Are you sure? I think it'll log around checkpoints, no? I thought Heikki
> > had fixed that, but looking around that doesn't seem to be the case.
>
> Er, yes, sorry. I should have used clearer words: I meant idle system
> with something running nothing including internal checkpoints.

Uh, but you'll always have checkpoints happen on wal_level =
hot_standby, even in 9.3? Maybe I'm not parsing your sentence right.

As soon as a single checkpoint has ever happened, the early-return logic
in CreateCheckPoint() will fail to take the LogStandbySnapshot() in
CreateCheckPoint() into account. The test is:

    if (curInsert == ControlFile->checkPoint +
        MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
        ControlFile->checkPoint == ControlFile->checkPointCopy.redo)

which obviously doesn't work if there's been a WAL record logged after
the redo pointer has been determined etc.

The reason that a single checkpoint is needed to "jumpstart" the
pointless checkpoints is that otherwise we'll never have issued a
LogStandbySnapshot() and thus the above code block works if we started
from a proper shutdown checkpoint.
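
To make the failure mode concrete, here is a toy model of that test. It
is only a sketch, not PostgreSQL code: the struct, the record sizes and
all toy_* names are made up for illustration. It assumes the standby
snapshot record lands between the redo pointer and the checkpoint
record, and only the shape of the comparison follows the test above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyLsn;

/* Arbitrary sizes for the model, not the real record sizes. */
#define TOY_CHECKPOINT_REC_SIZE 104
#define TOY_STANDBY_REC_SIZE     56

typedef struct ToyControl
{
    ToyLsn checkPoint;  /* start of the last checkpoint record */
    ToyLsn redo;        /* redo pointer of the last checkpoint */
} ToyControl;

/*
 * Same shape as the idle test: nothing was inserted after the last
 * checkpoint record, and that record is also its own redo pointer.
 */
bool
toy_system_looks_idle(ToyLsn curInsert, const ToyControl *ctl)
{
    return curInsert == ctl->checkPoint + TOY_CHECKPOINT_REC_SIZE &&
           ctl->checkPoint == ctl->redo;
}

/*
 * Write a checkpoint starting at curInsert and return the new insert
 * position. With log_standby set, a snapshot record is written after
 * the redo pointer is fixed, so checkPoint ends up past redo.
 */
ToyLsn
toy_write_checkpoint(ToyLsn curInsert, ToyControl *ctl, bool log_standby)
{
    assert(ctl != NULL);
    ctl->redo = curInsert;
    if (log_standby)
        curInsert += TOY_STANDBY_REC_SIZE;
    ctl->checkPoint = curInsert;
    return curInsert + TOY_CHECKPOINT_REC_SIZE;
}

/* Timed checkpoint: skipped only if the idle test passes. */
ToyLsn
toy_timed_checkpoint(ToyLsn curInsert, ToyControl *ctl)
{
    if (toy_system_looks_idle(curInsert, ctl))
        return curInsert;
    return toy_write_checkpoint(curInsert, ctl, true);
}
```

In this model, after toy_write_checkpoint(pos, &ctl, false) (the
shutdown case) toy_timed_checkpoint() keeps returning the same position.
Once a single checkpoint with log_standby = true has been written,
checkPoint != redo, and every later toy_timed_checkpoint() call advances
the insert position even though nothing else wrote WAL.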

Independent of the idle issue, it seems to me that the location of the
LogStandbySnapshot() is actually rather suboptimal - it really should
be before the CheckPointGuts(), not afterwards. The closer it is to
the redo pointer of the checkpoint a hot standby node starts up from,
the sooner that node can reach consistency. There's no difference for
the first time a node starts from a basebackup (since we gotta replay
that checkpoint anyway before we're consistent), but if we start from a
restartpoint...
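
As a rough illustration of the placement argument, again a toy model
rather than PostgreSQL code: all toy_* names are hypothetical, and
wal_during_guts is an assumed amount of WAL written by other backends
while CheckPointGuts() runs. The model only counts how much WAL sits
between the redo pointer and the snapshot record; it ignores everything
else that goes into reaching consistency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyLsn;

typedef struct ToyCheckpointLayout
{
    ToyLsn redo;      /* where replay starts */
    ToyLsn snapshot;  /* where the standby snapshot record sits */
} ToyCheckpointLayout;

/*
 * Lay out one checkpoint. If the snapshot is logged before the guts,
 * it sits right at the redo pointer; otherwise all WAL generated while
 * the guts ran ends up in front of it.
 */
ToyCheckpointLayout
toy_layout(ToyLsn redo, uint64_t wal_during_guts, bool snapshot_before_guts)
{
    ToyCheckpointLayout l;

    l.redo = redo;
    l.snapshot = snapshot_before_guts ? redo : redo + wal_during_guts;
    return l;
}

/* WAL bytes to replay from the redo pointer before the snapshot is seen. */
uint64_t
toy_replay_until_snapshot(const ToyCheckpointLayout *l)
{
    assert(l != NULL && l->snapshot >= l->redo);
    return l->snapshot - l->redo;
}
```

With, say, 16MB of WAL written during the guts, the "after" layout has
16MB to replay before the snapshot record shows up, the "before" layout
has none.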
Greetings,
Andres Freund