avoid multiple hard links to same WAL file after a crash
From | Nathan Bossart |
---|---|
Subject | avoid multiple hard links to same WAL file after a crash |
Date | |
Msg-id | 20220407182954.GA1231544@nathanxps13 |
Responses |
Re: avoid multiple hard links to same WAL file after a crash
Re: avoid multiple hard links to same WAL file after a crash |
List | pgsql-hackers |
Hi hackers,

I am splitting this off of a previous thread aimed at reducing archiving overhead [0], as I believe this fix might deserve back-patching.

Presently, WAL recycling uses durable_rename_excl(), which notes that a crash at an unfortunate moment can result in two links to the same file. My testing [1] demonstrated that it was possible to end up with two links to the same file in pg_wal after a crash just before unlink() during WAL recycling. Specifically, the test produced links to the same file for the current WAL file and the next one because the half-recycled WAL file was re-recycled upon restarting. This seems likely to lead to WAL corruption.

The attached patch prevents this problem by using durable_rename() instead of durable_rename_excl() for WAL recycling. This removes the protection against accidentally overwriting an existing WAL file, but there shouldn't be one.

This patch also sets the stage for reducing archiving overhead (as discussed in the other thread [0]). The proposed change to reduce archiving overhead will make it more likely that the server will attempt to re-archive segments after a crash. This might lead to archive corruption if the server concurrently writes to the same file via the aforementioned bug.

[0] https://www.postgresql.org/message-id/20220222011948.GA3850532%40nathanxps13
[1] https://www.postgresql.org/message-id/20220222173711.GA3852671%40nathanxps13

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
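To make the failure window concrete, here is a minimal standalone sketch. It is not PostgreSQL code. It assumes the exclusive rename is done as link() followed by unlink(), which matches the "crash just before unlink()" described above, and it leaves out all fsync handling. The function names, file names, and the directory argument are made up for illustration. The first function stops after link() to mimic the crash and reports how many directory entries point at the file; the second uses plain rename(), which replaces the name in a single step.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "<dir>/<name>.<pid>" (hypothetical naming, to avoid collisions). */
static void make_path(char *buf, size_t size, const char *dir, const char *name)
{
    int n = snprintf(buf, size, "%s/%s.%ld", dir, name, (long) getpid());

    assert(n > 0 && (size_t) n < size);
    (void) n;
}

/* Create an empty file; returns 0 on success, -1 on failure. */
static int create_empty_file(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    return close(fd);
}

/* Number of hard links to the file at path, or -1 if stat() fails. */
static long link_count(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long) st.st_nlink;
}

/*
 * link() + unlink() with a simulated crash in between: the unlink() of the
 * old name never runs, so both names still refer to the same file.
 * Returns the link count seen at that point, or -1 on error.
 */
long links_after_interrupted_link_unlink(const char *dir)
{
    char oldpath[1024], newpath[1024];
    long nlinks = -1;

    make_path(oldpath, sizeof(oldpath), dir, "recycle_src");
    make_path(newpath, sizeof(newpath), dir, "recycle_dst");
    unlink(newpath);            /* ignore errors: clear any stale file */

    if (create_empty_file(oldpath) != 0)
        return -1;
    if (link(oldpath, newpath) == 0)
    {
        /* "Crash" here: unlink(oldpath) is intentionally not executed. */
        nlinks = link_count(newpath);
        unlink(newpath);        /* cleanup for the demo only */
    }
    unlink(oldpath);            /* cleanup for the demo only */
    return nlinks;
}

/*
 * Plain rename(): the old name disappears and the new name appears in one
 * step, so no state with two names is observable. Returns the link count of
 * the new name, or -1 on error or if the old name still exists.
 */
long links_after_rename(const char *dir)
{
    char oldpath[1024], newpath[1024];
    long nlinks = -1;

    make_path(oldpath, sizeof(oldpath), dir, "recycle_src");
    make_path(newpath, sizeof(newpath), dir, "recycle_dst");

    if (create_empty_file(oldpath) != 0)
        return -1;
    if (rename(oldpath, newpath) == 0 && access(oldpath, F_OK) != 0)
        nlinks = link_count(newpath);
    unlink(oldpath);            /* no-op if the rename succeeded */
    unlink(newpath);            /* cleanup for the demo only */
    return nlinks;
}
```

Called with a scratch directory on a filesystem that supports hard links (for example "/tmp"), the first function should report two links and the second one. The trade-off is the one already noted in the message: rename() silently replaces an existing target, so the overwrite protection is lost.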