On Wed, 15 Apr 2020 at 04:04, Teja Mupparti <tejeswarm@hotmail.com> wrote:
>
> Thanks Kyotaro and Masahiko for the feedback. I think there is a consensus on the critical section around truncate,
> but I just want to emphasize the need for reversing the order of dropping the buffers and the truncation.
>
> Repro details (with full_page_writes = off)
>
> 1) Page on disk has empty LP 1, Insert into page LP 1
> 2) checkpoint START (Recovery REDO eventually starts here)
> 3) Delete all rows on the page (page is empty now)
> 4) Autovacuum kicks in and truncates the pages
>    DropRelFileNodeBuffers - Dirty page NOT written, LP 1 on disk still empty
> 5) Checkpoint completes
> 6) Crash
> 7) smgrtruncate - Not reached (this is where we do the physical truncate)
>
> Now the crash-recovery starts
>
> Delete-log-replay (above step-3) reads page with empty LP 1 and the delete fails with PANIC (old page on
> disk with no insert)
>
I agree that when replaying the deletion from step (3), LP 1 on the page is
empty, but does that replay really fail with PANIC? My guess is that we
record the page in invalid_page_tab but don't raise a PANIC in this
case.
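
To make sure we are reading the sequence the same way, here is a toy model
of the quoted steps. It is only a sketch and not PostgreSQL code: the names
simulate_repro, disk, shared_buffer, and wal are made up for illustration,
and a page is reduced to a single line pointer with a state string. It shows
which line-pointer state the delete record finds at replay; it does not
settle whether the replay raises PANIC or records the page in
invalid_page_tab.

```python
def simulate_repro():
    """Toy model of steps 1-7; returns what each replayed WAL record sees on disk."""
    # A page is modelled as {line_pointer_number: state}; states are
    # "unused", "live", or "deleted". All names here are illustrative.
    disk = {1: "unused"}          # page on disk has empty LP 1
    wal = []                      # WAL records in order: (action, lp)

    # 1) Insert into LP 1: only the in-memory copy changes.
    shared_buffer = dict(disk)
    shared_buffer[1] = "live"
    wal.append(("insert", 1))

    # 2) Checkpoint starts: recovery REDO will begin at this WAL position.
    redo_start = len(wal)

    # 3) Delete all rows on the page.
    shared_buffer[1] = "deleted"
    wal.append(("delete", 1))

    # 4) Truncation drops the dirty buffer WITHOUT writing it to disk.
    shared_buffer = None

    # 5) Checkpoint completes: the dropped buffer is gone, so nothing is flushed.
    if shared_buffer is not None:
        disk.update(shared_buffer)

    # 6) Crash. 7) The physical truncate is never reached, so the stale
    #    page is still on disk when recovery starts.

    # Recovery: replay from the REDO point. With full-page writes off there
    # is no page image, so each record is applied to whatever is on disk.
    seen = []
    for action, lp in wal[redo_start:]:
        seen.append((action, lp, disk[lp]))
    return seen


print(simulate_repro())
```

In this model the insert record lies before the REDO point, so the only
record replayed is the delete, and it finds LP 1 still unused on disk.
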
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services