On Thursday, September 13, 2012 10:32 PM Fujii Masao wrote:
On Thu, Sep 13, 2012 at 9:21 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12.09.2012 22:03, Fujii Masao wrote:
>>
>> On Wed, Sep 12, 2012 at 8:47 PM, <amit.kapila@huawei.com> wrote:
>>>
>>> The following bug has been logged on the website:
>>>
>>> Bug reference: 7533
>>> Logged by: Amit Kapila
>>> Email address: amit.kapila@huawei.com
>>> PostgreSQL version: 9.2.0
>>> Operating system: Suse
>>> Description:
>>>
>>> M host is primary, S host is standby and CS host is cascaded standby.
>>>
>
>> Hmm, I think the CheckRecoveryConsistency() call in the redo loop is
>> misplaced. It's called after we got a record from ReadRecord, but *before*
>> replaying it (rm_redo). Even if replaying record X makes the system
>> consistent, we won't check and notice that until we have fetched record X+1.
>> In this particular test case, record X is a shutdown checkpoint record, but
>> it could as well be a running-xacts record, or the record that reaches
>> minRecoveryPoint.
>
>> Does the problem go away if you just move the CheckRecoveryConsistency()
>> call *after* rm_redo (attached)?
> No, at least in my case. When recovery starts at a shutdown checkpoint record
> and there is no record following the shutdown checkpoint, recovery gets into
> a wait state before entering the main redo apply loop. That is, recovery
> starts waiting for a new WAL record to arrive, in ReadRecord just before the
> redo loop. So moving the CheckRecoveryConsistency() call after rm_redo cannot
> fix the problem which I reported. To fix the problem, we need to make
> recovery reach the consistent point before the redo loop, i.e., in the
> CheckRecoveryConsistency() call just before the redo loop.

I think that in that case we may need both fixes, since the problem I
reported can be fixed with Heikki's patch.
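
To make the two cases concrete, here is a toy C model of the ordering issue.
It is only a sketch and not the xlog.c code: every name in it is hypothetical,
"running out of records" stands in for ReadRecord blocking while it waits for
new WAL, and the preloop_check_effective flag simply assumes that the check
just before the redo loop can (or cannot) detect consistency, since the reason
it fails in Fujii's case is not stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef enum {
    TOY_CHECK_BEFORE_REDO,  /* check after fetching a record, before replay */
    TOY_CHECK_AFTER_REDO    /* check after replaying the record */
} ToyCheckPlacement;

/*
 * Returns true if consistency is noticed before recovery blocks waiting
 * for more WAL. makes_consistent[i] says whether replaying record i
 * brings the (modelled) system to a consistent state.
 */
bool toy_notices_consistency(const bool *makes_consistent, size_t n,
                             bool consistent_at_start,
                             bool preloop_check_effective,
                             ToyCheckPlacement placement)
{
    assert(n == 0 || makes_consistent != NULL);

    bool state_consistent = consistent_at_start;
    bool noticed = false;

    /* Stand-in for the consistency check just before the redo loop. */
    if (preloop_check_effective && state_consistent)
        noticed = true;

    /* Each iteration models one successfully fetched record. */
    for (size_t i = 0; i < n; i++) {
        if (placement == TOY_CHECK_BEFORE_REDO && state_consistent)
            noticed = true;

        /* Stand-in for rm_redo: replay may make the state consistent. */
        if (makes_consistent[i])
            state_consistent = true;

        if (placement == TOY_CHECK_AFTER_REDO && state_consistent)
            noticed = true;
    }

    /* Leaving the loop models blocking in the next fetch. */
    return noticed;
}
```

In this model, one record that makes the state consistent goes unnoticed with
TOY_CHECK_BEFORE_REDO and is noticed with TOY_CHECK_AFTER_REDO, which matches
my case. With zero records the loop body never runs, so only the check before
the loop can notice consistency, which matches Fujii's case.
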
With Regards,
Amit Kapila.