Re: FSM Corruption (was: Could not read block at end of the relation)
From | Ronan Dunklau |
---|---|
Subject | Re: FSM Corruption (was: Could not read block at end of the relation) |
Date | |
Msg-id | 5959995.31r3eYUQgx@aivenlaptop |
In response to | Re: FSM Corruption (was: Could not read block at end of the relation) (Noah Misch <noah@leadboat.com>) |
Responses | Re: FSM Corruption (was: Could not read block at end of the relation) |
List | pgsql-bugs |
On Sunday, 7 April 2024, 00:30:37 CEST, Noah Misch wrote:

> Your v3 has the right functionality. As further confirmation of the fix, I
> tried reverting the non-test parts of commit 917dc7d "Fix WAL-logging of FSM
> and VM truncation". That commit's 008_fsm_truncation.pl fails with 917dc7d
> reverted from master, and adding this patch makes it pass again. I ran
> pgindent and edited comments. I think the attached version is ready to go.

Thank you Noah, the updated comments are much better.

I think it should be backported at least to 16, since the chances of tripping on that behaviour are quite high here, but what about previous versions?

> While updating comments in FreeSpaceMapPrepareTruncateRel(), I entered a
> rabbit hole about the comments 917dc7d left about torn pages. I'm sharing
> these findings just in case it helps a reader of the $SUBJECT patch avoid
> the same rabbit hole. Both fsm and vm read with RBM_ZERO_ON_ERROR, so I
> think they're fine with torn pages. Per the README sentences I'm adding,
> FSM could stop writing WAL. I'm not proposing that, but I do bet it's the
> right thing. visibilitymap_prepare_truncate() has mirrored fsm truncate
> since 917dc7d. The case for removing WAL there is clearer still, because
> parallel function visibilitymap_clear() does not write WAL. I'm attaching
> a WIP patch to remove visibilitymap_prepare_truncate() WAL. I'll abandon
> that or pursue it for v18, in a different thread.

That's an interesting finding.

> If I were continuing the benchmark study, I would try SSD, a newer kernel,
> and/or shared_buffers=48GB. Instead, since your perf results show only
> +0.01% CPU from new lseek() calls, I'm going to stop there and say it's
> worth taking the remaining risk that some realistic scenario gets a
> material regression from those new lseek() calls.

Agree with you here.

Many thanks,

--
Ronan Dunklau