Re: [PERFORM] Backup taking long time !!!

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: [PERFORM] Backup taking long time !!!
Дата
Msg-id 20170122173245.GH18360@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: [PERFORM] Backup taking long time !!!  (julyanto SUTANDANG <julyanto@equnix.co.id>)
Ответы Re: [PERFORM] Backup taking long time !!!  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список pgsql-performance
Greetings,

* julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> Thanks for elaborating this Information, this is new, so whatever it is the
> procedure is *Correct and Workable*.

Backups are extremely important, so I get quite concerned when people
provide incorrect information regarding them.

> > With all of the WAL
> > which was created during the backup, PG will be able to recover from the
> > changes made during the backup to the data directory, but you *must*
> > have all of that WAL, or the backup will be inconsistent because of
>
> That is rather out of question, because all what we discuss here is just
> doing full/snapshot backup.

It's unclear what you mean by 'out of question' or why you believe that
it matters if it's a full backup or not.

Any backup of PG *must* include all of the WAL that was created during
the backup.

> The backup is Full Backup or Snapshot and it will work whenever needed.
> We are not saying about Incremental Backup yet.
> Along with collecting the XLOG File, you can have incremental backup and
> having complete continuous data backup.
> in this case, Stephen is suggesting on using pg_receivexlog or
> archive_command
> (everything here is actually explained well on the docs))

No, that is not correct.  You must have the WAL for a full backup as
well.  If I understand what you're suggesting, it's that WAL is only for
point-in-time-recovery, but that is *not* correct, WAL is required for
restoring a full backup to a consistent state.

> those changes that were made to the data directory after
> > pg_start_backup() was called.
> >
> > In other words, if you aren't using pg_receivexlog or archive_command,
> > your backups are invalid.
> >
> I doubt that *invalid* here is a valid word
> In term of snapshot backup and as long as the snapshot can be run, that is
> valid, isn't it?

It's absolutely correct, you must have the WAL generated during your
backup or the backup is invalid.

If, what you mean by 'snapshot' is a *full-system atomic snapshot*,
provided by some layer lower than PostgreSQL that is *exactly* as if the
machine was physically turned off all at once, then, and *only* then,
can you be guaranteed that PG will be able to recover, but the reason
for that is because PG will go back to the last checkpoint that
happened, as recorded in pg_control, and replay all of the WAL in the
pg_xlog/pg_wal directory, which must all exist and be complete for all
committed transaction because the WAL was sync'd to disk before the
commit was acknowledged and the WAL is not removed until after a
checkpoint has completed which has sync'd the data in the data directory
out to the filesystem.

That's also known as 'crash recovery' and it works precisely because all
of the WAL is available at the time of the event and we have a known
point to go back to (the checkpoint).

During a backup, multiple checkpoints can occur and WAL will be removed
from the pg_xlog/pg_wal directory during the backup; WAL which is
critical to the consistency of the database and which must be retained
by the user because it must be used to perform WAL replay of the
database when restoring from the backup which was made.

> > if you wanted to backup in later day, you can use rsync then it will copy
> > > faster because rsync only copy the difference, rather than copy all the
> > > data.
> >
> > This is *also* incorrect.  rsync, by itself, is *not* safe to use for
> > doing that kind of incremental backup, unless you enable checksums.  The
> > reason for this is that rsync has only a 1-second level granularity and
> > it is possible (unlikely, though it has been demonstrated) to miss
> > changes made to a file within that 1-second window.
>
> As long as that is not XLOG file, anyway.. as you are saying that wouldn't
> be a problem since actually we can run the XLOG for recovery. .

No, that's also not correct, unless you keep all WAL since the *first*
full backup.

The 1-second window concern is regarding the validity of a subsequent
incremental backup.

This is what happens, more-or-less:

1- File datadir/A is copied by rsync
2- backup starts, user retains all WAL during backup #1
3- File datadir/A is copied by rsync in the same second as backup
   started
4- File datadir/A is *subsequently* modified by PG and the data is
   written out to the filesystem, still within the same second as when
   the backup started
5- The rsync finishes, the backup finishes, all WAL for backup #1 is
   retained, which includes the changes made to datadir/A during the
   backup.  Everything is fine at this point for backup #1.

6- A new, incremental, backup is started, called backup #2.
7- rsync does *not* copy the file datadir/A because it was not
   subsequently changed by the user and the timestamp is the same,
   according to rsync's 1-second-level granularity.
8- The WAL for backup #2 is retained, but it does not contain any of the
   changes which were made to datadir/A because *those* changes are in
   the WAL which was written out during backup #1
9- backup #2 completes, with its WAL retainined
10- At this point, backup #2 is an invalid backup.

This is not hypothetical, it's been shown to be possible to have this
happen.

(side-note: this is all from memory, so perhaps there's a detail or two
incorrect, but this is the gist of the issue)

> > > my latter explanation is: use pg_basebackup, it will do it automatically
> > > for you.
> >
> > Yes, if you are unsure about how to perform a safe backup properly,
> > using pg_basebackup or one of the existing backup tools is, by far, the
> > best approach.  Attempting to roll your own backup system based on rsync
> > is not something I am comfortable recommending any more because it is
> > *not* simple to do correctly.
>
> OK, that is fine, and actually we are using that.

You must be sure to use one of the methods with pg_basebackup that keeps
all of the WAL created during the full backup.  That would be one of:
pg_basebackup -x, pg_basebackup -X stream, or pg_basebackup +
pg_receivexlog.

> the reason why i explain about start_backup and stop_backup is to give a
> gradual understand, and hoping that people will get the mechanism in the
> back understandable.

I'm more than happy to have people explaining about
pg_start/stop_backup, but I do have an issue when the explanation is
incorrect and could cause a user to use a backup method which will
result in an invalid backup.

Thanks!

Stephen

Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: julyanto SUTANDANG
Дата:
Сообщение: Re: [PERFORM] Backup taking long time !!!
Следующее
От: julyanto SUTANDANG
Дата:
Сообщение: Re: [PERFORM] Backup taking long time !!!