Thread: Backup of live database

Backup of live database

From
"Brian Modra"
Date:
Hi,
If tar reports that a file was modified while it was being archived, does that mean that the file was archived correctly, or is it corrupted in the archive?
Does tar take a snapshot of the file so that even if it is modified, at least the archive is safe?
Thanks

--
Brian Modra   Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa

Re: Backup of live database

From
"Joshua D. Drake"
Date:
Brian Modra wrote:
> Hi,
> If tar reports that a file was modified while it was being archived,
> does that mean that the file was archived correctly, or is it corrupted
> in the archive?
> Does tar take a snapshot of the file so that even if it is modified, at
> least the archive is safe?

You cannot use tar to back up PostgreSQL while it is running.

http://www.postgresql.org/docs/8.2/static/backup.html

Sincerely,

Joshua D. Drake



Re: Backup of live database

From
"Brian Modra"
Date:
The documentation about WAL says that you can start a live backup, as long as you use WAL backup also.
I'm concerned about the integrity of the tar file. Can someone help me with that?

--
Brian Modra   Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa

Re: Backup of live database

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Brian Modra wrote:
>> If tar reports that a file was modified while it was being archived,
>> does that mean that the file was archived correctly, or is it corrupted
>> in the archive?

> You cannot use tar to back up PostgreSQL while it is running.

You can use it for a PITR base backup --- WAL replay will fix any
inconsistencies.  In that context it's just annoying that some tar
versions complain about this.

> http://www.postgresql.org/docs/8.2/static/backup.html

Yah.  Note that sections 23.2 and 23.3.2 are talking about entirely
different scenarios.  In the former case such a warning is scary,
in the latter not.

            regards, tom lane

Re: Backup of live database

From
"Joshua D. Drake"
Date:
Brian Modra wrote:
> The documentation about WAL says that you can start a live backup, as
> long as you use WAL backup also.
> I'm concerned about the integrity of the tar file. Can someone help me
> with that?

If you are using point in time recovery:

http://www.postgresql.org/docs/8.2/static/continuous-archiving.html

You do not have to worry about it.

Joshua D. Drake


Re: Backup of live database

From
"Brian Modra"
Date:
Sorry to keep hammering on this point, but I want to be totally sure it's OK, rather than attempting a recovery five months down the line and having it fail...

Are you absolutely certain that the tar backup of the file that changed is OK? Even if that file is huge, has tar managed to save the file as it was before it was changed? Otherwise I'm afraid that the first part of the file is saved to tar, then the file is modified, and the last part of the file is saved from the point it was modified, and so is not consistent with the first part. The file would then have lost its integrity, so even a WAL restore won't help, because the base files themselves are corrupt in the tar file.

--
Brian Modra   Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa

Re: Backup of live database

From
Steve Holdoway
Date:
You can be absolutely certain that the tar backup of a file that's changed is a complete waste of time, because it changed while you were copying it.

Steve.
On Wed, 16 Jan 2008 10:24:00 +0200
"Brian Modra" <epailty@googlemail.com> wrote:

> Sorry to be hammering this point, but I want to be totally sure its OK,
> rather than 5 months down the line attempt to recover, and it fails...
>
> Are you absolutely certain that the tar backup of the file that changed, is
> OK? (And that even if that file is huge, tar has managed to save the file as
> it was before it was changed - otherwise I'm afraid that the first part of
> the file is saved to tar, and then the file is modified, and the last part
> of the file is saved to tar from the point it was modified - and so
> therefore not consistent with the first part... And therefore the file has
> lost its integrity, so even a WAL restore won't help because the base files
> themselves are corrupt in the tar file?


--
Steve Holdoway <steve.holdoway@firetrust.com>

Re: Backup of live database

From
"Joshua D. Drake"
Date:
Brian Modra wrote:
> Sorry to be hammering this point, but I want to be totally sure its OK,
> rather than 5 months down the line attempt to recover, and it fails...
>
> Are you absolutely certain that the tar backup of the file that changed,
> is OK?

Have you considered testing it?

Sincerely,

Joshua D. Drake



Re: Backup of live database

From
Tom Lane
Date:
Steve Holdoway <steve.holdoway@firetrust.com> writes:
> You can be absolutely certain that the tar backup of a file that's changed
> is a complete waste of time. Because it changed while you were copying it.

That is, no doubt, the reasoning that prompted the gnu tar people to
make it do what it does, but it has zero to do with reality for
Postgres' usage in PITR base backups.  What we care about is consistency
on the page level: as long as each page of the backed-up file correctly
represents *some* state of that page while the backup was in progress,
everything is okay, because replay of the WAL log will correct any pages
that are out-of-date, missing, or shouldn't be there at all.  And
Postgres always writes whole pages.  So as long as write() and read()
are atomic --- which is the case on all Unixen I know of --- everything
works.

(Thinks for a bit...) Actually I guess there's one extra assumption in
there, which is that tar must issue its reads in multiples of our page
size.  But that doesn't seem like much of a stretch.

            regards, tom lane
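
The page-level argument above can be illustrated with a toy model. This is only a sketch under loud assumptions: the "file" is a Python list of pages, every WAL record is treated as a full image of one page (real WAL records are usually finer-grained), and the names `fuzzy_copy` and `replay` are invented for the illustration and are not PostgreSQL APIs.

```python
PAGE_SIZE = 8192  # PostgreSQL's default page size; assumed here


def make_page(version):
    """Return a whole page whose bytes all carry the given version number."""
    return bytes([version % 256]) * PAGE_SIZE


def fuzzy_copy(live, wal, writes_during_copy):
    """Copy `live` one whole page at a time while a writer keeps changing it.

    `writes_during_copy` maps a page number to the list of
    (target_page, new_version) writes that land just before the copier
    reads that page. Each write replaces a whole page and is logged to `wal`.
    """
    copy = []
    for page_no in range(len(live)):
        for target, version in writes_during_copy.get(page_no, []):
            live[target] = make_page(version)
            wal.append((target, live[target]))  # toy WAL record: full page image
        copy.append(live[page_no])  # the read sees one complete state of the page
    return copy


def replay(copy, wal):
    """Apply every logged page image, in order, on top of the fuzzy copy."""
    restored = list(copy)
    for target, image in wal:
        restored[target] = image
    return restored


live = [make_page(0) for _ in range(4)]
wal = []
# Pages 0 and 1 change after they were copied; page 3 changes before it is copied.
backup = fuzzy_copy(live, wal, {1: [(0, 1), (3, 1)], 2: [(1, 2)]})

print(backup == live)               # False: the raw copy mixes old and new pages
print(replay(backup, wal) == live)  # True: replay brings every page up to date
```

The copy is inconsistent as a whole file, yet each page in it is some complete state of that page, which is all that replay needs in this model.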

Re: Backup of live database

From
Peter Eisentraut
Date:
On Wednesday, 16 January 2008, Tom Lane wrote:
> (Thinks for a bit...) Actually I guess there's one extra assumption in
> there, which is that tar must issue its reads in multiples of our page
> size.  But that doesn't seem like much of a stretch.

There is something about that here:
http://www.gnu.org/software/tar/manual/html_node/tar_149.html#SEC149

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Backup of live database

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On Wednesday, 16 January 2008, Tom Lane wrote:
>> (Thinks for a bit...) Actually I guess there's one extra assumption in
>> there, which is that tar must issue its reads in multiples of our page
>> size.  But that doesn't seem like much of a stretch.

> There is something about that here:
> http://www.gnu.org/software/tar/manual/html_node/tar_149.html#SEC149

AFAICT that's talking about the I/O chunk size *on the archive file*.
It doesn't say anything specific about the chunk size on the file side.

            regards, tom lane

Re: Backup of live database

From
David Wall
Date:
Brian Modra wrote:
> Sorry to be hammering this point, but I want to be totally sure its
> OK, rather than 5 months down the line attempt to recover, and it fails...
>
> Are you absolutely certain that the tar backup of the file that
> changed, is OK? (And that even if that file is huge, tar has managed
> to save the file as it was before it was changed - otherwise I'm
> afraid that the first part of the file is saved to tar, and then the
> file is modified, and the last part of the file is saved to tar from
> the point it was modified - and so therefore not consistent with the
> first part... And therefore the file has lost its integrity, so even a
> WAL restore won't help because the base files themselves are corrupt
> in the tar file?
Not sure if the answers you got answered your question or not.  Here's
my take:

1) If the database is not running, tar works fine.

2) If the database is running, you can ONLY use tar if you also use WAL
archiving since the database will not only need the tar files, which
will be inconsistent, but also the WAL files (in your $PGDATA/pg_xlog)
in order to recover from those inconsistencies.  I find this is best if
you are creating a warm standby that is keeping a backup database in
sync with a primary.

3) If the database is running, use pg_dump to create a consistent backup.

4) No matter what, as previously mentioned, you should test your backup
procedures to ensure you can reliably restore.

Good luck,
David
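
The running-database case in point 2 can be sketched as a short script. This is a sketch, not a vetted backup tool: `pg_start_backup()` and `pg_stop_backup()` are the function names used by PostgreSQL releases of this era (newer releases renamed them), the paths and the label are placeholders, WAL archiving is assumed to be configured already, and the exit-status rule assumes GNU tar 1.16 or later, which exits with 1 when a file changed while being read and with 2 on fatal errors.

```python
import subprocess


def base_backup_commands(pgdata, archive, label):
    """Build the three commands of a PITR base backup, in order."""
    return [
        ["psql", "-c", "SELECT pg_start_backup('%s');" % label],
        ["tar", "-cf", archive, "-C", pgdata, "."],
        ["psql", "-c", "SELECT pg_stop_backup();"],
    ]


def tar_status_acceptable(returncode):
    """Accept 0 (clean) and 1 (some file changed while tar was reading it)."""
    return returncode in (0, 1)


def run_base_backup(pgdata, archive, label):
    """Run the sequence; always leave backup mode, even if tar fails."""
    start, tar, stop = base_backup_commands(pgdata, archive, label)
    subprocess.run(start, check=True)
    try:
        status = subprocess.run(tar).returncode
    finally:
        subprocess.run(stop, check=True)
    if not tar_status_acceptable(status):
        raise RuntimeError("tar failed with exit status %d" % status)


if __name__ == "__main__":
    # Placeholder paths; this prints the plan instead of running it.
    plan = base_backup_commands("/var/lib/pgsql/data", "/backup/base.tar", "nightly")
    for command in plan:
        print(" ".join(command))
```

Treating exit status 1 as success is what keeps the "file changed as we read it" warning from aborting the backup; a restore test, as point 4 says, is still the only real proof.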

Re: Backup of live database

From
Tom Arthurs
Date:
Hi, Brian

We have been doing PITR backups since the feature first became available
in PostgreSQL.  We first used tar.  Then, because of the dreadful warning
tar emits (which made us doubt that it was actually archiving that
particular file), we decided to try cpio, which emits much the same
warnings, though less verbosely.  So I think that tar will work as well.
We never bothered going back to tar, mostly through laziness, so I can
personally say that this approach works.  Actually I have reason to
believe you can use any series of OS commands that create copies or
archives of the files, as long as those commands don't exit prematurely
on warnings.

The important thing is to start archiving the WAL files *prior* to the
first OS backup, or you will end up with an unusable database.

We have actually tested and used recovered databases with this scheme.
We use WAL archiving to replicate a warm standby database which we
have failed over to (and failed back from) many times, and I've had to
do an actual PITR recovery to recover several tables that got
accidentally deleted by bad procedures/code/brain-burned DBAs :)


Re: Backup of live database

From
Steve Holdoway
Date:
On Wed, 16 Jan 2008 10:19:12 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Steve Holdoway <steve.holdoway@firetrust.com> writes:
> > You can be absolutely certain that the tar backup of a file that's changed
> > is a complete waste of time. Because it changed while you were copying it.
>
> That is, no doubt, the reasoning that prompted the gnu tar people to
> make it do what it does, but it has zero to do with reality for
> Postgres' usage in PITR base backups.  What we care about is consistency
> on the page level: as long as each page of the backed-up file correctly
> represents *some* state of that page while the backup was in progress,
> everything is okay, because replay of the WAL log will correct any pages
> that are out-of-date, missing, or shouldn't be there at all.  And
> Postgres always writes whole pages.  So as long as write() and read()
> are atomic --- which is the case on all Unixen I know of --- everything
> works.
>
> (Thinks for a bit...) Actually I guess there's one extra assumption in
> there, which is that tar must issue its reads in multiples of our page
> size.  But that doesn't seem like much of a stretch.
>
>             regards, tom lane

That's OK for the WAL logs, but what about the initial archive - the recovery's got to start somewhere...


Re: Backup of live database

From
Tom Davies
Date:
On 17/01/2008, at 4:42 AM, Tom Arthurs wrote:
> The important thing is to start archiving the WAL files *prior* to
> the first OS backup, or you will end up with an unusable data base.

Why does the recovery need WAL files from before the backup?

Tom

Re: Backup of live database

From
Tom Lane
Date:
Tom Davies <tgdavies@gmail.com> writes:
> On 17/01/2008, at 4:42 AM, Tom Arthurs wrote:
>> The important thing is to start archiving the WAL files *prior* to
>> the first OS backup, or you will end up with an unusable data base.

> Why does the recovery need WAL files from before the backup?

It doesn't, but there's no reasonable way to start both processes at
exactly the same instant, so the standard advice is to start archiving
first.

            regards, tom lane

Re: Backup of live database

From
"Scott Marlowe"
Date:
On Jan 16, 2008 4:56 PM, Tom Davies <tgdavies@gmail.com> wrote:
>
> On 17/01/2008, at 4:42 AM, Tom Arthurs wrote:
> > The important thing is to start archiving the WAL files *prior* to
> > the first OS backup, or you will end up with an unusable data base.
>
> Why does the recovery need WAL files from before the backup?

It's a timeline thing.  The database is coherent at time x1.  The WAL
stream started at point x0 and, moving forward, at some point matches
up.  You run pg_start_backup(), which tells PostgreSQL you're
starting your backup at point x1.  You start the backup.  You now have
a backup of the data directory that's a mix of what you had at x1
when you started and at x2 when you stopped.

You apply the WAL from x0 forward to, say, x3, and it conveniently
rewrites the data directory to be coherent.  If your WAL started at some
point between x1 and x2, you might have some data in the database that
the WAL wouldn't write over, but that was incoherent with
what you'd get at point x3.  So some pages are now out of date,
because your WAL doesn't go back far enough.
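
The timeline reasoning can be checked mechanically. A toy sketch, assuming WAL positions are plain integers and the archive is a set of segment numbers (both simplifications invented for illustration): a base backup taken between x1 and x2 is recoverable only if the archive holds every segment from x1 through x2, which is why archiving is started first.

```python
def missing_segments(archived, backup_start, backup_stop):
    """Return the WAL segments recovery would need but the archive lacks."""
    needed = range(backup_start, backup_stop + 1)
    return [seg for seg in needed if seg not in archived]


def backup_is_recoverable(archived, backup_start, backup_stop):
    """A base backup is usable only if no needed segment is missing."""
    return not missing_segments(archived, backup_start, backup_stop)


x0, x1, x2 = 10, 12, 15

# Archiving started at x0, before the backup began at x1.
early = set(range(x0, 20))
print(backup_is_recoverable(early, x1, x2))  # True

# Archiving started only at segment 13, somewhere between x1 and x2.
late = set(range(13, 20))
print(missing_segments(late, x1, x2))        # [12]
```

The segments between x0 and x1 are never consulted in this model; they are the harmless "extra log files" you get from starting archiving early.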

Re: Backup of live database

From
Tom Arthurs
Date:
If you don't start archiving log files first, your first backup won't be
valid.  I suppose you could do it the hard way and start the backup and
the log archiving at exactly the same time (I can't picture how to time
that), but the point is that you need the current log when you kick off
the backup.  If you kick off archiving first, you are assured of a valid
backup (once the recovery is done).  You may get some extra log files
that way, but better too many than too few.  (Been there, done that.)
