Re: [PATCH] Verify Checksums during Basebackups
От | Michael Banck |
---|---|
Тема | Re: [PATCH] Verify Checksums during Basebackups |
Дата | |
Msg-id | 1521759951.15036.21.camel@credativ.de обсуждение исходный текст |
Ответ на | Re: [PATCH] Verify Checksums during Basebackups (Magnus Hagander <magnus@hagander.net>) |
Список | pgsql-hackers |
Hi Magnus, thanks a lot for looking at my patch! Am Donnerstag, den 22.03.2018, 15:07 +0100 schrieb Magnus Hagander: > On Sat, Mar 17, 2018 at 10:34 PM, Michael Banck <michael.banck@credativ.de> wrote: > > On Fri, Mar 09, 2018 at 10:35:33PM +0100, Michael Banck wrote: > > > Possibly open questions: > > > > > > 1. I have not so far changed the replication protocol to make verifying > > > checksums optional. I can go about that next if the consensus is that we > > > need such an option (and cannot just check it everytime)? > > > > I think most people (including those I had off-list discussions about > > this with) were of the opinion that such an option should be there, so I > > added an additional option NOVERIFY_CHECKSUMS to the BASE_BACKUP > > replication command and also an option -k / --no-verify-checksums to > > pg_basebackup to trigger this. > > > > Updated patch attached > > > > Notes: > > + if (checksum_failure == true) > > Really, just if(checksum_failure) > > + errmsg("checksum mismatch during basebackup"))); > > Should be "base backup" in messages. I've changed both. > +static const char *skip[] = { > > I think that needs a much better name than just "skip". Skip for what? > In particular since we are just skipping it for checksums, and not for > the actual basebackup, that name is actively misinforming. I have copied that verbatim from the online checksum patch, but of course this is in src/backend/replication and not src/bin so warrants more scrutiny. If you plan to commit both for v11, it might make sense to have that separated out to a more central place? But I guess what we mean is a test for "is a heap file". Do you have a good suggestion where it should end up so that pg_verify_checksums can use it as well? In the meantime, I've changed the skip[] array to no_heap_files[] and the skipfile() function to is_heap_file(), also reversing the logic. If it helps pg_verify_checksums, we could make is_not_a_heap_file() instead. > + filename = basename(pstrdup(readfilename)); > + if (!noverify_checksums && DataChecksumsEnabled() && > + !skipfile(filename) && > + (strncmp(readfilename, "./global/", 9) == 0 || > + strncmp(readfilename, "./base/", 7) == 0 || > + strncmp(readfilename, "/", 1) == 0)) > + verify_checksum = true; > > I would include the checks for global, base etc into the skipfile() > function as well (also renamed). Check. I had to change the way (the previous) skipfile() works a bit, because it was expecting a filename as argument, while we check pathnames in the above. > + * Only check pages which have not been modified since the > + * start of the base backup. > > I think this needs a description of why, as well (without having read > this thread, this is a pretty subtle case). I tried to expand on this some more. > +system_or_bail 'dd', 'conv=notrunc', 'oflag=seek_bytes', 'seek=4000', 'bs=9', 'count=1', 'if=/dev/zero', "of=$pgdata/$pg_class"; > > This part of the test will surely fail on Windows, not having a > /dev/zero. Can we easily implement this part natively in perl perhaps? Right, this was one of the open questions. I now came up with a perl 4- liner that seems to do the trick, but I can't test it on Windows. > Most of that stuff is trivial and can be cleaned up at commit time. Do > you want to send an updated patch with a few of those fixes, or should > I clean it? I've attached a new patch, but I have not addressed the question whether skipfile()/is_heap_file() should be moved somewhere else yet. I found one more cosmetic issue: if there is an external tablespace, and pg_basebackup encounters corruption, you would get the message "pg_basebackup: changes to tablespace directories will not be undone" from cleanup_directories_atexit(), which I now also suppress in case of checksum failures. > The test thing is a stopper until we figure that one out though. And > while at it -- it seems we don't have any tests for the checksum > feature in general. It would probably make sense to consider that at > the same time as figuring out the right way to do this one. I don't want to deflect work, but it seems to me the online checksums patch would be in a better position to generally test checksums while it's at it. Or did you mean something related to pg_basebackup? Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.banck@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Вложения
В списке pgsql-hackers по дате отправления: