Re: VM map freeze corruption

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: VM map freeze corruption
Дата
Msg-id CAD21AoCqXqQdPTTRt_bUikpZFS89+6KO8AyuvhRdaq55x7LaKQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: VM map freeze corruption  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-hackers
On Wed, Apr 18, 2018 at 10:36 PM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Pavan Deolasee wrote:
>> On Wed, Apr 18, 2018 at 7:37 AM, Wood, Dan <hexpert@amazon.com> wrote:
>
>> > My analysis is that heap_prepare_freeze_tuple->FreezeMultiXactId()
>> > returns FRM_NOOP if the MultiXACT locked rows haven't committed.  This
>> > results in changed=false and totally_frozen=true(as initialized).  When
>> > this returns to lazy_scan_heap(), no rows are added to the frozen[] array.
>> > Yet, tuple_totally_frozen is true.  This means the page is marked frozen in
>> > the VM, even though the MultiXACT row wasn't left untouch.
>> >
>> > A fix to heap_prepare_freeze_tuple() that seems to do the trick is:
>> >         else
>> >         {
>> >             Assert(flags & FRM_NOOP);
>> > +          totally_frozen = false;
>> >         }
>> >
>>
>> That's a great find!
>
> Indeed.
>
> This family of bugs (introduced by freeze map changes in 9.6) was
> initially fixed in 38e9f90a227d but this spot was missed in that fix.
>
> IMO the cause is the totally_frozen variable, which starts life in a
> bogus state (true) and the different code paths can set it to the right
> state, or by inaction end up deciding that the initial bogus state was
> correct in the first place.  Errors of omission are far too easy in that
> kind of model, ISTM, so I propose this slightly different patch, which
> albeit yet untested seems easier to verify and easier to get right.
>

Thank you for the patch!
Agreed. The patch looks to fix this issue correctly.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Speedup of relation deletes during recovery
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: Adding an LWLockHeldByMe()-like function that reports if anybuffer content lock is held