Re: backup manifests

Поиск
Список
Период
Сортировка
От Rushabh Lathia
Тема Re: backup manifests
Дата
Msg-id CAGPqQf2N9-xQYH=azPRzGUvqAMhMYemy3Gsy2sZnVjD2s+3fRw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: backup manifests  (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
Ответы Re: backup manifests  (Rushabh Lathia <rushabh.lathia@gmail.com>)
Список pgsql-hackers

Thanks Jeevan for reviewing the patch and offline discussion.

On Mon, Dec 9, 2019 at 11:15 AM Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:


On Fri, Dec 6, 2019 at 12:05 PM Rushabh Lathia <rushabh.lathia@gmail.com> wrote:


On Fri, Dec 6, 2019 at 1:44 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Dec 5, 2019 at 11:22 AM Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
> Here is the whole stack of patches.

I committed 0001, as that's just refactoring and I think (hope) it's
uncontroversial. I think 0002-0005 need to be squashed together
(crediting all authors properly and in the appropriate order) as it's
quite hard to understand right now,

Please find attached single patch and I tried to add the credit to all
the authors.

I had a look over the patch and here are my few review comments:

1.
+            if (pg_strcasecmp(manifest_checksum_algo, "SHA256") == 0)
+                manifest_checksums = MC_SHA256;
+            else if (pg_strcasecmp(manifest_checksum_algo, "CRC32C") == 0)
+                manifest_checksums = MC_CRC32C;
+            else if (pg_strcasecmp(manifest_checksum_algo, "NONE") == 0)
+                manifest_checksums = MC_NONE;
+            else
+                ereport(ERROR,

Is NONE is a valid input? I think the default is "NONE" only and thus no need
of this as an input. It will be better if we simply error out if input is
neither "SHA256" nor "CRC32C".

I believe you have done this way as from pg_basebackup you are always passing
MANIFEST_CHECKSUMS '%s' string which will have "NONE" if no user input is
given. But I think passing that conditional will be better like we have
maxrate_clause for example.

Well, this is what I think, feel free to ignore as I don't see any correctness
issue over here.


I would still keep this NONE as it's look more cleaner in the say of
given options to the checksums.


2.
+    if (manifest_checksums != MC_NONE)
+    {
+        checksumbuflen = finalize_manifest_checksum(cCtx, checksumbuf);
+        switch (manifest_checksums)
+        {
+            case MC_NONE:
+                break;
+        }

Since switch case is within "if (manifest_checksums != MC_NONE)" condition,
I don't think we need a case for MC_NONE here. Rather we can use a default
case to error out.


Yeah, with the new patch we don't have this part of code.


3.
+    if (manifest_checksums != MC_NONE)
+    {
+        initialize_manifest_checksum(&cCtx);
+        update_manifest_checksum(&cCtx, content, len);
+    }

@@ -1384,6 +1641,9 @@ sendFile(const char *readfilename, const char *tarfilename, struct stat *statbuf
     int            segmentno = 0;
     char       *segmentpath;
     bool        verify_checksum = false;
+    ChecksumCtx cCtx;
+
+    initialize_manifest_checksum(&cCtx);


I see that in a few cases you are calling initialize/update_manifest_checksum()
conditional and at some other places call is unconditional. It seems like
calling unconditional will not have any issues as switch cases inside them
return doing nothing when manifest_checksums is MC_NONE.


Fixed.


4.
initialize/update/finalize_manifest_checksum() functions may be needed by the
validation patch as well. And thus I think these functions should not depend
on a global variable as such. Also, it will be good if we keep them in a file
that is accessible to frontend-only code. Well, you can ignore these comments
with the argument saying that this refactoring can be done by the patch adding
validation support. I have no issues. Since both the patches are dependent and
posted on the same email chain, thought of putting that observation.


Make sense, I just changed those API to that it doesn't have to
access the global.


5.
+        switch (manifest_checksums)
+        {
+            case MC_SHA256:
+                checksumlabel = "SHA256:";
+                break;
+            case MC_CRC32C:
+                checksumlabel = "CRC32C:";
+                break;
+            case MC_NONE:
+                break;
+        }

This code in AddFileToManifest() is executed for every file for which we are
adding an entry. However, the checksumlabel will be going to remain the same
throughout. Can it be set just once and then used as is?


Yeah, with the attached patch we no more have this part of code.


6.
Can we avoid manifest_checksums from declaring it as a global variable?
I think for that, we need to pass that to every function and thus need to
change the function signature of various functions. Currently, we pass
"StringInfo manifest" to all the required function, will it better to pass
the struct variable instead? A struct may have members like,
"StringInfo manifest" in it, checksum type (manifest_checksums),
checksum label, etc.


I agree.  Earlier I was not sure about this because that require data structure
to expose.  But in the given attached patch that's what I tried, introduced new
data structure and defined in basebackup.h and passed the same through the
function so that doesn't require to pass an individual members.   Also removed
global manifest_checksum and added the same in the newly introduced structure.

Attaching the patch, which need to apply on the top of earlier 0001 patch.

Thanks,

--
Rushabh Lathia
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Questions/Observations related to Gist vacuum
Следующее
От: amul sul
Дата:
Сообщение: Re: [HACKERS] advanced partition matching algorithm forpartition-wise join