On 2014-01-13 16:36:41 -0500, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> > Good point. Normally, we expect the checksum to match on all pages that
> > we read during WAL replay, because full page writes will initialize any
> > page that is modified to an untorn state, before it's ever read. But we
> > can't rely on that in the extra read that btree_xlog_vacuum() does.
>
> But it's not an "extra" read. It's replicating a read that was done
> on the master in the btvacuumpage() scan. AFAICS the only way to fail
> on the slave and not the master is if the slave has inconsistent data,
> in which case you're at hazard of failing anyway.

I tried to explain nearby which scenario I see as dangerous.

> >> Now, you could argue that that shouldn't be the case because we're only
> >> entering that codepath once STANDBY_SNAPSHOT_READY and you might be
> >> right...
>
> > I don't think that saves us. standbyMode can be STANDBY_SNAPSHOT_READY,
> > before we reach consistency. Adding a check for reachedConsistency,
> > though, ought to fix it.
>
> Huh? Surely we're not letting queries in until we're consistent.

We don't, but STANDBY_SNAPSHOT_READY isn't the only variable controlling
that. It just determines whether we'd have the necessary visibility
information. The full check is:

    if (standbyState == STANDBY_SNAPSHOT_READY &&
        !LocalHotStandbyActive &&
        reachedConsistency &&
        IsUnderPostmaster)
    {
        ...
        xlogctl->SharedHotStandbyActive = true;
        ...
        SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
    }

So we need to mimic that.
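
To make the point concrete, here is a standalone sketch of that condition.
It is not the actual xlog.c code: the RecoveryFlags struct and the
can_begin_hot_standby() name are made up for illustration, and the enum is
a simplified stand-in. It only shows that STANDBY_SNAPSHOT_READY by itself
does not imply that queries are allowed in.

```c
#include <stdbool.h>

/* Simplified stand-in for the standby state; illustration only. */
typedef enum
{
    STANDBY_DISABLED,
    STANDBY_INITIALIZED,
    STANDBY_SNAPSHOT_PENDING,
    STANDBY_SNAPSHOT_READY
} HotStandbyState;

/* Hypothetical bundle of the four variables used in the quoted check. */
typedef struct
{
    HotStandbyState standbyState;
    bool            localHotStandbyActive;
    bool            reachedConsistency;
    bool            isUnderPostmaster;
} RecoveryFlags;

/*
 * Returns true only when all four conditions of the quoted check hold,
 * i.e. when the startup process would signal the postmaster to begin
 * hot standby. A ready snapshot without reachedConsistency is not enough.
 */
bool
can_begin_hot_standby(const RecoveryFlags *f)
{
    return f->standbyState == STANDBY_SNAPSHOT_READY &&
           !f->localHotStandbyActive &&
           f->reachedConsistency &&
           f->isUnderPostmaster;
}
```

With standbyState already at STANDBY_SNAPSHOT_READY but reachedConsistency
still false, the function returns false, which is the window I'm worried
about.
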
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services