Re: BUG #7710: Xid epoch is not updated properly during checkpoint

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Дата
Msg-id CA+U5nM+q5nwzCyRa-n_vzFNLLPep1W+vtAt1ysWKTe0-hMscpw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #7710: Xid epoch is not updated properly during checkpoint  (Simon Riggs <simon@2ndQuadrant.com>)
Ответы Re: BUG #7710: Xid epoch is not updated properly during checkpoint  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
On 2 December 2012 12:51, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 1 December 2012 22:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> tarvip@gmail.com writes:
>>> [ txid_current can show a bogus value near XID wraparound ]
>>> This happens only if wal_level=hot_standby.
>>
>> I believe what is happening here is
>>
>> (1) CreateCheckPoint sets up checkPoint.nextXid and
>> checkPoint.nextXidEpoch, near xlog.c line 7070 in HEAD.  At this point,
>> nextXid is still a bit less than the wrap point.
>>
>> (2) After performing the checkpoint, at line 7113, CreateCheckPoint
>> calls LogStandbySnapshot() which "helpfully" updates checkPoint.nextXid
>> to the latest value.  Which by now has wrapped around.  But it doesn't
>> fix checkPoint.nextXidEpoch, so the checkpoint that gets written out has
>> effectively lost the epoch bump that should have happened.
>>
>> While we could add some more logic to try to correct the epoch value
>> in this scenario, I think it's a much better idea to just stop having
>> LogStandbySnapshot update the nextXid.  That seems to me to be useless
>> complication.  I also quite dislike the fact that we're effectively
>> redefining the checkpoint nextXid from being taken before the main
>> body of the checkpoint to being taken afterwards, but *only* in
>> XLogStandbyInfoActive mode.  If that inconsistency isn't already causing
>> bugs (besides this one) today, it'll probably cause them in the future.
>
> I agree that the coding looks weird and agree it shouldn't be there.
> The meaning of the checkpoint values should not differ because
> wal_level has changed.
>
>> So barring objections, I'm going to remove LogStandbySnapshot's behavior
>> of returning the updated nextXid.
>
> Removing it may cause other bugs, but if so, those other bugs need to
> be solved in the right way, not by having a too-far-forwards nextxid
> on the checkpoint record. Having said that, I can't see any bugs that
> would be caused by this.

Andres'  patch is invasive and is not for my liking in backbranches.

Tom's recommended change goes further still and not one I'm
comfortable risking, even though I agree with the root cause/change.

I've applied an absolutely minimal fix on this, which introduces no
other changes that could cause unforeseen consequences.

Others may wish to go further, overriding my patches, as they choose.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Следующее
От: Tom Lane
Дата:
Сообщение: Re: BUG #7710: Xid epoch is not updated properly during checkpoint