Discussion: Standalone Hot Backups

Standalone Hot Backups

From
Sergey Arlashin
Date:
Hi!
I'm reading this article http://www.postgresql.org/docs/current/static/continuous-archiving.htm

In section 'Making a Base Backup Using the Low Level API' it is said that once WAL archiving is set up, it is OK to omit the pg_xlog folder from the backup dump:

"You can, however, omit from the backup dump the files within the cluster's pg_xlog/ subdirectory. This slight adjustment is worthwhile because it reduces the risk of mistakes when restoring. This is easy to arrange if pg_xlog/ is a symbolic link pointing to someplace outside the cluster directory, which is a common setup anyway for performance reasons. You might also want to exclude postmaster.pid and postmaster.opts, which record information about the running postmaster, not about the postmaster which will eventually use this backup. (These files can confuse pg_ctl.)"

But the section on making standalone hot backups says nothing about this.

There is only the following set of commands:

touch /var/lib/pgsql/backup_in_progress
psql -c "select pg_start_backup('hot_backup');"
tar -cf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/
psql -c "select pg_stop_backup();"
rm /var/lib/pgsql/backup_in_progress
tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/

I can only see that the data folder is being archived together with pg_xlog.


So the question is: is it OK to omit the pg_xlog folder from the backup dump when making standalone hot backups?


--
best regards,
Sergey

Re: Standalone Hot Backups

From
Richard Poole
Date:
On Wed, Aug 21, 2013 at 09:35:23PM +0400, Sergey Arlashin wrote:
> Hi!
> I'm reading this article http://www.postgresql.org/docs/current/static/continuous-archiving.htm
>
> In section 'Making a Base Backup Using the Low Level API' is said that once one has WAL archiving set up it is ok to omit pg_xlog folder from the backup dump:
>
> > "You can, however, omit from the backup dump the files within the cluster's pg_xlog/ subdirectory. This slight adjustment is worthwhile because it reduces the risk of mistakes when restoring. This is easy to arrange if pg_xlog/ is a symbolic link pointing to someplace outside the cluster directory, which is a common setup anyway for performance reasons. You might also want to exclude postmaster.pid and postmaster.opts, which record information about the running postmaster, not about the postmaster which will eventually use this backup. (These files can confuse pg_ctl.)"
>
> But in section related to making standalone hot backups there is no information about it.
>
> There is only the following set of commands:
>
> touch /var/lib/pgsql/backup_in_progress
> psql -c "select pg_start_backup('hot_backup');"
> tar -cf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/
> psql -c "select pg_stop_backup();"
> rm /var/lib/pgsql/backup_in_progress
> tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/
>
> I only can see that the backup folder is being archived with pg_xlog.
>
>
> So, the question is - is it ok to omit pg_xlog folder from backup dump while making standalone hot backups or not ?

In order to use a base backup as a standalone hot backup, you need
the WAL files which were generated between the time that you started
the backup and the time it finished. These files are generated in the
pg_xlog directory and are automatically removed when no longer required.

In the first case (the low-level base backup with WAL archiving already
set up), you will have them from your archive and therefore you don't
need the pg_xlog directory. In the second case (the standalone hot
backup), it's assumed that you've not got archiving set up, and the
paragraph before the one you quoted describes the use of the file
/var/lib/pgsql/backup_in_progress
to turn archiving on temporarily; the last command in the sequence adds
the files from the temporary archive to the base backup in the tar file.
So you don't actually need the pg_xlog directory and it would be OK for
the base backup to have excluded it.
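
To make the "switch file" idea concrete, here is a rough shell sketch.
The function mimics the logic of the archive_command from the
documentation's standalone hot backup example (archive a segment only
while the switch file exists, and never overwrite an existing copy).
The name archive_segment and the scratch directories are made up for
illustration; in a real setup the server itself runs archive_command,
not a script like this.

```shell
# Scratch directories stand in for /var/lib/pgsql (illustrative only).
base=$(mktemp -d)
mkdir -p "$base/archive" "$base/pg_xlog"
switch="$base/backup_in_progress"

# Hypothetical helper mirroring the documented archive_command logic:
#   test ! -f SWITCH || (test -f ARCHIVE/%f || cp %p ARCHIVE/%f)
# $1 plays the role of %p (path of the segment), $2 the role of %f (its name).
archive_segment() {
    test ! -f "$switch" || (test -f "$base/archive/$2" || cp "$1" "$base/archive/$2")
}

seg=000000010000000000000001
echo "fake WAL" > "$base/pg_xlog/$seg"

# Switch file absent: the call succeeds but copies nothing.
archive_segment "$base/pg_xlog/$seg" "$seg"
ls "$base/archive" | wc -l

# Switch file present ("backup in progress"): the segment is archived.
touch "$switch"
archive_segment "$base/pg_xlog/$seg" "$seg"
ls "$base/archive"
```

Note that the helper reports success even when it skips the copy, so
outside a backup window segments are simply recycled without being kept.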

Short answer: it is OK to omit pg_xlog, because as part of a standalone
hot backup you will always have the WAL files from another source.
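
As an illustration only, excluding pg_xlog (and the postmaster files
mentioned in the quoted paragraph) could look roughly like the sketch
below. It runs against a throwaway directory tree rather than a real
cluster, and it assumes a tar that supports --exclude (GNU tar does);
the documentation's own example does not use these options.

```shell
# Build a scratch tree that mimics a data directory (illustrative only).
demo=$(mktemp -d)
mkdir -p "$demo/data/pg_xlog" "$demo/data/base"
touch "$demo/data/pg_xlog/000000010000000000000001" \
      "$demo/data/base/1" \
      "$demo/data/PG_VERSION" \
      "$demo/data/postmaster.pid" \
      "$demo/data/postmaster.opts"

# Archive the tree, skipping the contents of pg_xlog and the postmaster files.
# The pg_xlog directory entry itself is kept, so it exists again after a restore.
tar -cf "$demo/backup.tar" \
    --exclude='data/pg_xlog/*' \
    --exclude='data/postmaster.pid' \
    --exclude='data/postmaster.opts' \
    -C "$demo" data

# List what ended up in the backup.
tar -tf "$demo/backup.tar"
```

For a real backup the same options would go on the first tar command,
between pg_start_backup() and pg_stop_backup(), with the patterns
adjusted to the actual data directory path.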

Richard

--
Richard Poole                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Standalone Hot Backups

From
Christian Ullrich
Date:
* Sergey Arlashin wrote:

> In section 'Making a Base Backup Using the Low Level API' is said that
> once one has WAL archiving set up it is ok to omit pg_xlog folder from
> the backup dump:
>
>> "You can, however, omit from the backup dump the files within the
>> cluster's pg_xlog/ subdirectory. This slight adjustment is worthwhile

> But in section related to making standalone hot backups there is no
> information about it.
>
> There is only the following set of commands:
[...]
> I only can see that the backup folder is being archived with pg_xlog.
>
> So, the question is - is it ok to omit pg_xlog folder from backup dump
> while making standalone hot backups or not ?

I'm not sure why the example does not exclude pg_xlog; IMHO it should.
Perhaps the author of the example was using the setup mentioned in
section 24.3.3, where pg_xlog is a symbolic link elsewhere? Restoring a
backup created using that procedure will never need WAL files that are
not in the (transient) archive directory.

If your PostgreSQL version includes pg_basebackup, you should use it
rather than the low-level API. If you include the -X option, you will
get a backup that you can use as "standalone" as well as for
point-in-time recovery, and if you use "plain" mode as well as -X, you
can even use the backup directory as PGDATA directly: pg_basebackup will
put the required WAL files into pg_xlog in the backup, so there is no
need even for a recovery.conf file. Starting the database with that
directory (and no recovery.conf) will only perform crash recovery. If
you leave the -X option out, you can only do PITR, for which you need a
WAL archive.
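
A minimal sketch of such an invocation follows, assuming a version
where -X takes a method argument (9.2 or later; 9.1 spells it -x) and a
server that accepts replication connections. The target directory is a
placeholder, and the script only prints the command unless RUN=1 is set.

```shell
# Placeholder target; pg_basebackup expects it to be empty or not yet exist.
target=/var/lib/pgsql/standalone_backup

# -F p     : plain format, i.e. a directory laid out like PGDATA
# -X fetch : also collect the WAL needed to make the backup consistent
cmd="pg_basebackup -D $target -F p -X fetch"

# Dry run by default; set RUN=1 to execute against a running server.
if [ "${RUN:-0}" = 1 ]; then
    $cmd
else
    echo "$cmd"
fi
```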


It is interesting to follow the example's logic and see why it works:

In the example, more WAL is archived than strictly necessary to restore
the backup: Some WAL files will appear twice in the backup, once in
pg_xlog, once in the "archive" directory. The first tar command will
copy them (in an unusable intermediate state) from pg_xlog.
pg_stop_backup() will archive all files used during the backup, and the
second tar command will then append the "archive" to the backup. When
restoring from the backup, you must have a recovery.conf file with a
restore_command that will retrieve files from the (restored) archive.

The important point is that PostgreSQL will prefer files from the
archive to those in pg_xlog. Therefore, it will use the finished
versions and ignore the files copied from pg_xlog.
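
For illustration, a recovery.conf for this kind of restore might be
written as in the sketch below. It assumes the archive from the backup
has been restored to /var/lib/pgsql/archive (adjust the path if it was
unpacked elsewhere) and writes the file into a scratch directory rather
than a real data directory.

```shell
# Scratch directory standing in for the restored data directory.
restored=$(mktemp -d)

cat > "$restored/recovery.conf" <<'EOF'
# Copy each WAL segment the server asks for (%f) from the restored archive
# to the location where the server wants it (%p).
restore_command = 'cp /var/lib/pgsql/archive/%f %p'
EOF

cat "$restored/recovery.conf"
```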

--
Christian