Re: block-level incremental backup

Поиск
Список
Период
Сортировка
От Jeevan Ladhe
Тема Re: block-level incremental backup
Дата
Msg-id CAOgcT0MF5RpWcTHrepYx84PQfHHsjzVSD-ZsJNahTNenekD3zw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: block-level incremental backup  (Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>)
Ответы Re: block-level incremental backup  (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
Список pgsql-hackers
Here are some comments:


+/* The reference XLOG position for the incremental backup. */                                    
+static XLogRecPtr refptr;    

As Robert already pointed we may want to pass this as parameter around instead
of a global variable. Also, can be renamed to something like: incr_backup_refptr.
I see in your earlier version of patch this was named startincrptr, which I
think was more meaningful.

---------

    /*                                                                                          
     * If incremental backup, see whether the filename is a relation filename                   
     * or not.                                                                                  
     */

Can be reworded something like:
"If incremental backup, check if it is relation file and can be sent partially."

---------

+           if (verify_checksum)
+           {
+               ereport(WARNING,
+                       (errmsg("cannot verify checksum in file \"%s\", block "
+                               "%d: read buffer size %d and page size %d "
+                               "differ",
+                               readfilename, blkno, (int) cnt, BLCKSZ)));
+               verify_checksum = false;
+           }

For do_incremental_backup() it does not make sense to show the block number in
warning as it is always going to be 0 when we throw this warning.
Further, I think this can be rephrased as:
"cannot verify checksum in file \"%s\", read file size %d is not multiple of
page size %d".

Or maybe we can just say:
"cannot verify checksum in file \"%s\"" if checksum requested, disable the
checksum and leave it to the following message:

+           ereport(WARNING,
+                   (errmsg("file size (%d) not in multiple of page size (%d), sending whole file",
+                           (int) cnt, BLCKSZ))); 

---------

If you agree on the above comment for blkno, then we can shift declaration of blkno
inside the condition "       if (!sendwholefile)" in do_incremental_backup(), or
avoid it altogether, and just pass "i" as blkindex, as well as blkno to
verify_page_checksum(). May be add a comment why they are same in case of
incremental backup.

---------

I think we should give the user hint from where he should be reading the input
lsn for incremental backup in the --help option as well as documentation?
Something like - "To take an incremental backup, please provide value of "--lsn"
as the "START WAL LOCATION" of previously taken full backup or incremental
backup from backup_lable file. 

---------

pg_combinebackup:

+static bool made_new_outputdata = false;
+static bool found_existing_outputdata = false;

Both of these are global, I understand that we need them global so that they are
accessible in cleanup_directories_atexit(). But they are passed to
verify_dir_is_empty_or_create() as parameters, which I think is not needed.
Instead verify_dir_is_empty_or_create() can directly change the globals.

---------

I see that checksum_failure is never set and always remains as false. May be
it is something that you wanted to set in combine_partial_files() when a
the corrupted partial file is detected?

---------

I think the logic for verifying the backup chain should be moved out from main()
function to a separate function.

---------

+ /*
+ * Verify the backup chain.  INCREMENTAL BACKUP REFERENCE WAL LOCATION of
+ * the incremental backup must match with the START WAL LOCATION of the
+ * previous backup, until we reach a full backup in which there is no
+ * INCREMENTAL BACKUP REFERENCE WAL LOCATION.
+ */

The current logic assumes the incremental backup directories are to be provided
as input in the serial order the backups were taken. This is bit confusing
unless clarified in pg_combinebackup help menu or documentation. I think we
should clarify it at both the places.

---------

I think scan_directory() should be rather renamed as do_combinebackup().

Regards,
Jeevan Ladhe

On Thu, Aug 29, 2019 at 8:11 PM Jeevan Ladhe <jeevan.ladhe@enterprisedb.com> wrote:
Due to the inherent nature of pg_basebackup, the incremental backup also
allows taking backup in tar and compressed format. But, pg_combinebackup
does not understand how to restore this. I think we should either make
pg_combinebackup support restoration of tar incremental backup or restrict
taking the incremental backup in tar format until pg_combinebackup
supports the restoration by making option '--lsn' and '-Ft' exclusive.

It is arguable that one can take the incremental backup in tar format, extract
that manually and then give the resultant directory as input to the
pg_combinebackup, but I think that kills the purpose of having
pg_combinebackup utility.

Thoughts?

Regards,
Jeevan Ladhe

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: WIP: Generic functions for Node types using generated metadata
Следующее
От: Nikhil Sontakke
Дата:
Сообщение: Re: [HACKERS] PATCH: Batch/pipelining support for libpq