Re: FSM Corruption (was: Could not read block at end of the relation)
From | Ronan Dunklau |
---|---|
Subject | Re: FSM Corruption (was: Could not read block at end of the relation) |
Date | |
Msg-id | 12418161.O9o76ZdvQC@aivenlaptop |
In reply to | Re: FSM Corruption (was: Could not read block at end of the relation) (Noah Misch <noah@leadboat.com>) |
List | pgsql-bugs |
On Saturday, 13 April 2024, 19:15:28 CEST, Noah Misch wrote:
> On Thu, Apr 11, 2024 at 08:38:43AM -0700, Noah Misch wrote:
> > On Thu, Apr 11, 2024 at 09:36:50AM +0200, Ronan Dunklau wrote:
> > > On Sunday, 7 April 2024, 00:30:37 CEST, Noah Misch wrote:
> > > > Your v3 has the right functionality. As further confirmation of the
> > > > fix, I tried reverting the non-test parts of commit 917dc7d "Fix
> > > > WAL-logging of FSM and VM truncation". That commit's
> > > > 008_fsm_truncation.pl fails with 917dc7d reverted from master, and
> > > > adding this patch makes it pass again. I ran pgindent and edited
> > > > comments. I think the attached version is ready to go.
> > >
> > > Thank you Noah, the updated comments are much better. I think it should
> > > be backported at least to 16 since the chances of tripping on that
> > > behaviour are quite high here, but what about previous versions?
> >
> > It should be reachable in all branches, just needing concurrent extension
> > lock waiters to reach before v16. Hence, my plan is to back-patch it all
> > the way. It applies with negligible conflicts back to v12.
>
> While it applied, it doesn't build in v12 or v13, due to smgr_cached_nblocks
> first appearing in c5315f4. Options:
>
> 1. Back-patch the addition of smgr_cached_nblocks or equivalent.
> 2. Stop the back-patch of $SUBJECT at v14.
> 3. Incur more lseek() in v13 and v12.
>
> Given the lack of reports before v16, (3) seems too likely to be a cure
> worse than the disease. I'm picking (2) for today. We could do (1)
> tomorrow, but I lean toward (2) until someone reports the problem on v13 or
> v12. The problem's impact is limited to DML giving ERROR when it should
> have succeeded, and I expect VACUUM FULL is a workaround. Without those
> mitigating factors, I would choose (1).
>
> Pushed that way, as 9358297.

I agree with you that option 2 seems to be the safest course of action.
For the record, the other options available when that happens are to stop PG and manually remove the FSM from disk (see https://wiki.postgresql.org/wiki/Free_Space_Map_Problems), or to adapt the patch submitted at https://www.postgresql.org/message-id/flat/5446938.Sb9uPGUboI%40aivenlaptop to do it online.

Thank you for all your work refining, testing, and finally committing the fix.

Best regards,

--
Ronan Dunklau