Re: [PATCH] Verify Checksums during Basebackups
От | Magnus Hagander |
---|---|
Тема | Re: [PATCH] Verify Checksums during Basebackups |
Дата | |
Msg-id | CABUevEy-abw5HyjpQvJJQKEv3r46wGHesgfqxqpYrhucSKnbyA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Verify Checksums during Basebackups (Magnus Hagander <magnus@hagander.net>) |
Список | pgsql-hackers |
On Tue, Apr 3, 2018 at 1:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
On Mon, Apr 2, 2018 at 2:48 PM, Michael Banck <michael.banck@credativ.de> wrote:Hi!
On Sun, Apr 01, 2018 at 05:27:21PM +0200, Magnus Hagander wrote:
> It might be a micro-optimisation, but ISTM that we shouldn't do the
> basename(palloc(fn)) in is_heap_file() unless we actually need it -- so not
> at the top of the function. (And surely "." and ".." should not occur here?
> I think that's a result of copy/paste from the online checksum patch?
>
> We also do exactly the same basename(palloc(fn)) in sendFile(). Can we
> find a way to reuse that duplication? Perhaps we want to split it into the
> two pieces already out in sendFile() and pass it in as separate parameters?
I've done the latter now, although it looks a bit weird that the second
argument data is a subset of the first. I couldn't find another way to
not have it done twice, though.I agree, but I think it's still cleaner.On further look, there is actually no need to pstrdup() at all -- we never used the modified part of the string anyway, because we don't care about the oid (unlike pg_verify_checksums).So I adjusted the patch by that.> If not then this second one should at least only be done inside the if
> (verify_checksums).
We can't have both, as we need to call the is_heap_file() function in
order to determine whether we should verify the checksums.Right. I realize that -- thus the "if not". But I guess I was not clear in what I meant -- see attached file for it.> There is a bigger problem next to that though -- I believe basename() does
> not exist on Win32. I haven't tested it, but there is zero documentation of
> it existing, which usually indicates it doesn't. That's the part that
> definitely needs to get fixed.
>
> I think you need to look into the functionality in port/path.c, in
> particular last_dir_separator()?
Thanks for the pointer, I've used that now; I mentioned before that
basename() might be a portability hazard, but couldn't find a good
substitute myself.Yeah, I have a recollection somewhere of running into this before, but I couldn't find any references. But the complete lack of docs about it on msdn.microsoft.com is a clear red flag :)
V6 of the patch is attached.Excellent. I've done some mangling on it:* Changed the is_heap_file to is_checksummed_file (and the associtaed struct name), because this is really what it's about (we for example verify checksums on indexes, which are clearly not heaps)* Moved the list of files to the top of the file next to the other lists of files/directories* Added missing function prototype at the top, and changed the parameter names to be a bit more clear* Added some comments* Changed the logic around the checksum-check to avoid the pstrdup() and to not call the path functions unless necessary (per comment above)* "filen" -> "file" in message* xlog.h does not need to be included* pgindentRemaining question:The check for (cnt % BLCKSZ != 0) currently does "continue", which means that this block of data isn't actually sent to the client at all, which seems completely wrong. We only want to prevent checksum validations.I have moved the check up a bit, and refactored it so it continues to do the actual transmission of the file if this path is hit.
And of course I forgot that particular part in the first push, so I've pushed it as a separate commit.
В списке pgsql-hackers по дате отправления: