Re: Online verification of checksums
От | Fabien COELHO |
---|---|
Тема | Re: Online verification of checksums |
Дата | |
Msg-id | alpine.DEB.2.21.1903030743240.8095@lancre обсуждение исходный текст |
Ответ на | Re: Online verification of checksums (Michael Paquier <michael@paquier.xyz>) |
Ответы |
Re: Online verification of checksums
|
Список | pgsql-hackers |
Bonjour Michaël, >> I gotta say, my conclusion from this debate is that it's simply a >> mistake to do this without involvement of the server that can use >> locking to prevent these kind of issues. It seems pretty absurd to me >> to have hacky workarounds around partial writes of a live server, around >> truncation, etc, even though the server has ways to deal with that. > > I agree with Andres on this one. We are never going to make this stuff > safe if we don't handle page reads with the proper locks because of torn > pages. What I think we should do is provide a SQL function which reads a > page in shared mode, and then checks its checksum if its LSN is older > than the previous redo point. This discards cases with rather hot > pages, but if the page is hot enough then the backend re-reading the > page would just do the same by verifying the page checksum by itself. -- > Michael My 0.02€ about that, as one of the reviewer of the patch: I agree that having a server function (extension?) to do a full checksum verification, possibly bandwidth-controlled, would be a good thing. However it would have side effects, such as interfering deeply with the server page cache, which may or may not be desirable. On the other hand I also see value in an independent system-level external tool capable of a best effort checksum verification: the current check that the cluster is offline to prevent pg_verify_checksum from running is kind of artificial, and when online simply counting online-database-related checksum issues looks like a reasonable compromise. So basically I think that allowing pg_verify_checksum to run on an online cluster is still a good thing, provided that expected errors are correctly handled. -- Fabien.
В списке pgsql-hackers по дате отправления: