Обсуждение: base backup requirements for PITR

Поиск
Список
Период
Сортировка

base backup requirements for PITR

От
"Andrew W. Gibbs"
Дата:
Postgres Admin Wizards,

I am trying to manage an 8.4 database whose ingest rate is
sufficiently high that doing base backups on it causes it to fall
unacceptably behind.  To deal with this, I want to do WAL shipping to
another server that is in warm standby mode, and then on that server
perform the base backups in order to effect log change accumulation
without hosing the master server.

Once upon a time, months ago, before the load on the system made
creating a base backup completely untenable, I configured an
archive_command, did a pg_start_backup, kicked off a tar operation on
the data directory, and after the tar operation's completion ran
pg_stop_backup.  Last week, I was finally allocated a new server on
which to create a warm standby, so I unpacked the tarball I created,
installed a recovery.conf with a restore_command, and started the
server up.  It has been methodically chewing through the WAL segments,
and a little over a week later it has gotten through nearly all three
months' worth of WAL segments (I know, miserable, and hopefully never
an experience I will repeat).  I take this to mean that I did things
correctly and that shortly I will end up with a warm standby server,
on the assumption that if I had botched the creation of the base
backup things would have bombed out very quickly.

That said, I'm wondering whether I created the base backup with
reliably correct procedures or merely was lucky.  It's not exactly
clear to me when I turned on the warm standby server how it knew where
in the WAL to start processing data.  As I understand things, when
Postgres experiences a hard crash, it can rely on the contents of
global/pg_control to provide the value of the last checkpoint's REDO
location, from which it can start replaying WAL records.  But what
happens if you're in the middle of creating a base backup and Postgres
forces a checkpoint, possibly causing global/pg_control to be changed
at some indeterminate point during tar's execution?  And are the
concerns different for the scenario in which you're running
pg_start_backup/tar/pg_stop_backup on a master server and the scenario
in which you're just running tar on a warm standby server (because you
can't actually log into a warm standby server)?  In other words, is
pg_start_backup/pg_stop_backup doing something magical do give the
eventual recovery process a heads up about what to do?

When I first started puzzling through this, I figured that all I had
to do was run pg_controldata on the warm standby, parse out the latest
checkpoint's REDO address, calculate the corresponding file, and then
assume that I could purge my WAL archive of any files preceding that.
But then I started to worry about whether upon startup of a server in
recovery mode using a tarball that I had generated from a warm standby
Postgres would know where to start in the WAL when invoking my
restore_command.  Looking through emails from the archives spanning
the past decade, I've seen at least one saying that I need to make
sure that global/pg_control is archived first and one that says it
needs to be archived last when I'm creating a tarball on a warm
standby server, and I really can't tell what the requirement is, if
any, on 8.4, and I also can't tell, if there is such a requirement for
a warm standby server, whether such a requirement is obviated on a
master server by virtue of things that pg_start_backup/pg_stop_backup
might be doing.

So, can someone clarify the following?

1) When Postgres starts up in recovery mode (i.e. with a
recovery.conf), how does it know where in the WAL to start replaying
records?

2) Beyond just running a tar command and making sure that I keep
around the required WAL files, what constraints are there on creating
a tarball for the base backup?  Do I need to do anything in particular
to ensure that the resultant warm standby server, upon startup, knows
where in the WAL to start replay?  Is the point at which
global/pg_control is picked up by tar critical to the success of a
server in recovery mode coming online in a consistent state, or is
Postgres doing something else involving log sequence numbers stamped
onto data files to figure out what to do?

Many thanks for your guidance...

  -- AWG


Re: base backup requirements for PITR

От
Gabriele Bartolini
Дата:
Hi Andrew,

Il 29/11/12 02:52, Andrew W. Gibbs ha scritto:
> That said, I'm wondering whether I created the base backup with
> reliably correct procedures or merely was lucky.
PostgreSQL documentation on how continuous archiving works is clear, but
you need to follow exactly what it says and have a clear understanding
and how Postgres works.

In order to simplify the management of disaster recovery solutions for
PostgreSQL, we have recently developed and distributed an open source
project called Barman.
I am not sure whether you have looked at it or not. In this case, I
suggest you give it a go. More information at http://www.pgbarman.org/

It is also available as RPM and Debian/Ubuntu package and works from
PostgreSQL 8.4 onwards.

Barman takes care of backups, works remotely and assist DBAs during
recovery (both local to the backup server or remote).

Cheers,
Gabriele

--
  Gabriele Bartolini - 2ndQuadrant Italia
  PostgreSQL Training, Services and Support
  gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it



Re: base backup requirements for PITR

От
"Albe Laurenz"
Дата:
Gabriele Bartolini wrote:
> Il 29/11/12 02:52, Andrew W. Gibbs ha scritto:
>> That said, I'm wondering whether I created the base backup with
>> reliably correct procedures or merely was lucky.

> In order to simplify the management of disaster recovery solutions for
> PostgreSQL, we have recently developed and distributed an open source
> project called Barman.
[...]

Gabriele, I understand that you want to spread the word,
but the OP's problem was a different one:

He wants to create a base backup by just copying the
data directory of an 8.4 warm standby server.
There is no possibility to run pg_(start|stop)_backup,
but the documentation seems to suggest that it could work
anyway:
http://www.postgresql.org/docs/8.4/static/warm-standby.html#BACKUP-INCREMENTAL-UPDATED

I wonder if that's true, and I'd assume that you'd
have to backup pg_control first (to assure an old enough
checkpoint), but I don't know enough to answer the
question.  I'd like to know myself.

Yours,
Laurenz Albe

Re: base backup requirements for PITR

От
Gabriele Bartolini
Дата:
Hi Laurenz,

Il 01/12/12 12:18, Albe Laurenz ha scritto:
> Gabriele, I understand that you want to spread the word,
> but the OP's problem was a different one:
You are right. I totally missed that. I skimmed the email (very quickly)
and thought that in that particular context Andrew was referring to PITR
in general.
> He wants to create a base backup by just copying the
> data directory of an 8.4 warm standby server.
> There is no possibility to run pg_(start|stop)_backup,
> but the documentation seems to suggest that it could work
> anyway:
> http://www.postgresql.org/docs/8.4/static/warm-standby.html#BACKUP-INCREMENTAL-UPDATED
Yes, it is possible. I'd like to try and implement that in Barman, in
order to automate this. I'd try and follow this path though:
pg_start_backup() on the master, wait for the operation to be replayed
on the standby (with pg_controldata if < 9.0 or through hot standby),
rsync/tar from the standby, then pg_stop_backup() on the master. If you
think of following this path yourself, please make sure you test it
accurately (especially for production environment).

Sorry Andrew about misunderstanding the original question.

Cheers,
Gabriele

--
  Gabriele Bartolini - 2ndQuadrant Italia
  PostgreSQL Training, Services and Support
  gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it



Re: base backup requirements for PITR

От
"Albe Laurenz"
Дата:
Gabriele Bartolini wrote:
>> Gabriele, I understand that you want to spread the word,
>> but the OP's problem was a different one:

> You are right. I totally missed that. I skimmed the email (very quickly)
> and thought that in that particular context Andrew was referring to PITR
> in general.

I admit that it was a rather long e-mail.

>> He wants to create a base backup by just copying the
>> data directory of an 8.4 warm standby server.
>> There is no possibility to run pg_(start|stop)_backup,
>> but the documentation seems to suggest that it could work
>> anyway:
>> http://www.postgresql.org/docs/8.4/static/warm-standby.html#BACKUP-INCREMENTAL-UPDATED

> Yes, it is possible. I'd like to try and implement that in Barman, in
> order to automate this. I'd try and follow this path though:
> pg_start_backup() on the master, wait for the operation to be replayed
> on the standby (with pg_controldata if < 9.0 or through hot standby),
> rsync/tar from the standby, then pg_stop_backup() on the master. If you
> think of following this path yourself, please make sure you test it
> accurately (especially for production environment).

A viable alternative could be to shutdown the standby
and take an offline backup.  It can catch up with WAL
after restart.

Yours,
Laurenz Albe