On Tue, Mar 05, 2019 at 02:08:03PM +0100, Tomas Vondra wrote:
> Based on quickly skimming that thread the main issue seems to be
> deciding which files in the data directory are expected to have
> checksums. Which is a valid issue, of course, but I was expecting
> something about partial read/writes etc.
I remember complaining about partial write handling as well for the base backup checks... There should be an email about it on the list, cannot find it now ;p
> My understanding is that:
>
> (a) The checksum verification should not generate false positives (same
> as for basebackup).
>
> (b) The partial reads do emit warnings, which might be considered false
> positives I guess. Which is why I'm arguing for changing it to do the
> same thing basebackup does, i.e. ignore this.
Well, at least that's consistent... Argh, I really think that we ought to report failures harder (as errors rather than warnings), because that's easier to detect from within a tool, and some deployments set log_min_messages > WARNING, so checksum failures would just be lost. For base backups we don't care much about that, as files are just blindly copied, so they could have torn pages, which is fine as that's fixed at replay. Now we are talking about a set of tools which could have reliable detection mechanisms for those problems.
I’m traveling but will try to comment more in the coming days; in general I agree with Tomas on these items. Also, pg_basebackup has to handle torn pages when it comes to checksums just like the verify tool does, and having them be consistent (along with external tools) would really be for the best, imv. I still feel like retrying a short read (try reading more to get the whole page..), reading until we hit EOF, and then moving on would be alright. I’m not sure it’s possible, but I do worry a bit that we might get a short read from a network file system or something that isn’t actually at EOF, and then we would skip a significant remaining portion of the file... Another thought might be to stat the file after we have opened it to see its length...
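To make the short-read idea a bit more concrete, here is a rough sketch of what I have in mind. It is Python for brevity, not the actual pg_basebackup or verification code; the names read_block and scan_file are made up for illustration, and BLOCK_SIZE just assumes the default 8kB page size. A short read is retried until the page is complete or a zero-length read signals EOF, and the length from fstat() taken right after opening is returned so the caller can notice if we stopped early.

```python
import os

BLOCK_SIZE = 8192  # assumption: default PostgreSQL page size (BLCKSZ)


def read_block(fd, size=BLOCK_SIZE):
    """Read up to `size` bytes, retrying after short reads until EOF."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # zero-length read: treat as EOF
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def scan_file(path, size=BLOCK_SIZE):
    """Return (full_blocks, trailing_bytes, bytes_read, size_at_open)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size_at_open = os.fstat(fd).st_size  # file length as seen after open
        full_blocks = 0
        bytes_read = 0
        while True:
            block = read_block(fd, size)
            bytes_read += len(block)
            if len(block) < size:
                # Partial (or empty) page at EOF: stop here.
                return full_blocks, len(block), bytes_read, size_at_open
            full_blocks += 1  # a real tool would verify the page checksum here
    finally:
        os.close(fd)
```

If bytes_read comes back smaller than size_at_open, we hit an apparent EOF before the length the file had when we opened it, which is the case I'm worried about on network file systems. The file may also grow while we read it, so bytes_read being larger is not by itself a problem.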
Just a few thoughts since I’m on my phone. Will try to write up something more in a day or two.