Discussion: Backup of live database
Hi,
If tar reports that a file was modified while it was being archived, does that mean that the file was archived correctly, or is it corrupted in the archive?
Does tar take a snapshot of the file so that even if it is modified, at least the archive is safe?
Thanks
--
Brian Modra Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
Brian Modra wrote:
> Hi,
> If tar reports that a file was modified while it was being archived,
> does that mean that the file was archived correctly, or is it corrupted
> in the archive?
> Does tar take a snapshot of the file so that even if it is modified, at
> least the archive is safe?

You can not use tar to backup postgresql if it is running.

http://www.postgresql.org/docs/8.2/static/backup.html

Sincerely,

Joshua D. Drake

> Thanks
>
> --
> Brian Modra Land line: +27 23 5411 462
> Mobile: +27 79 183 8059
> 6 Jan Louw Str, Prince Albert, 6930
> Postal: P.O. Box 2, Prince Albert 6930
> South Africa
The documentation about WAL says that you can start a live backup, as long as you use WAL backup also.
I'm concerned about the integrity of the tar file. Can someone help me with that?
--
Brian Modra Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
On 16/01/2008, Joshua D. Drake <jd@commandprompt.com> wrote:
Brian Modra wrote:
> Hi,
> If tar reports that a file was modified while it was being archived,
> does that mean that the file was archived correctly, or is it corrupted
> in the archive?
> Does tar take a snapshot of the file so that even if it is modified, at
> least the archive is safe?
You can not use tar to backup postgresql if it is running.
http://www.postgresql.org/docs/8.2/static/backup.html
Sincerely,
Joshua D. Drake
> Thanks
>
> --
> Brian Modra Land line: +27 23 5411 462
> Mobile: +27 79 183 8059
> 6 Jan Louw Str, Prince Albert, 6930
> Postal: P.O. Box 2, Prince Albert 6930
> South Africa
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Brian Modra wrote:
>> If tar reports that a file was modified while it was being archived,
>> does that mean that the file was archived correctly, or is it corrupted
>> in the archive?

> You can not use tar to backup postgresql if it is running.

You can use it for a PITR base backup --- WAL replay will fix any inconsistencies. In that context it's just annoying that some tar versions complain about this.

> http://www.postgresql.org/docs/8.2/static/backup.html

Yah. Note that sections 23.2 and 23.3.2 are talking about entirely different scenarios. In the former case such a warning is scary, in the latter not.

regards, tom lane
Brian Modra wrote:
> The documentation about WAL says that you can start a live backup, as
> long as you use WAL backup also.
> I'm concerned about the integrity of the tar file. Can someone help me
> with that?

If you are using point in time recovery:

http://www.postgresql.org/docs/8.2/static/continuous-archiving.html

You do not have to worry about it.

Joshua D. Drake
Sorry to be hammering this point, but I want to be totally sure it's OK, rather than attempt a recovery 5 months down the line and have it fail...
Are you absolutely certain that the tar backup of the file that changed is OK? And that, even if that file is huge, tar has managed to save the file as it was before it was changed? Otherwise I'm afraid that the first part of the file is saved to tar, then the file is modified, and the last part of the file is saved to tar as it was after the modification, so it is not consistent with the first part. In that case the file has lost its integrity, and even a WAL restore won't help, because the base files themselves are corrupt in the tar file.
--
Brian Modra Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
On 16/01/2008, Joshua D. Drake <jd@commandprompt.com> wrote:
Brian Modra wrote:
> The documentation about WAL says that you can start a live backup, as
> long as you use WAL backup also.
> I'm concerned about the integrity of the tar file. Can someone help me
> with that?
If you are using point in time recovery:
http://www.postgresql.org/docs/8.2/static/continuous-archiving.html
You do not have to worry about it.
Joshua D. Drake
>
> On 16/01/2008, *Joshua D. Drake* <jd@commandprompt.com
> <mailto: jd@commandprompt.com>> wrote:
>
> Brian Modra wrote:
> > Hi,
> > If tar reports that a file was modified while it was being archived,
> > does that mean that the file was archived correctly, or is it
> corrupted
> > in the archive?
> > Does tar take a snapshot of the file so that even if it is
> modified, at
> > least the archive is safe?
>
> You can not use tar to backup postgresql if it is running.
>
> http://www.postgresql.org/docs/8.2/static/backup.html
> <http://www.postgresql.org/docs/8.2/static/backup.html>
>
> Sincerely,
>
> Joshua D. Drake
>
> > Thanks
> >
> > --
> > Brian Modra Land line: +27 23 5411 462
> > Mobile: +27 79 183 8059
> > 6 Jan Louw Str, Prince Albert, 6930
> > Postal: P.O. Box 2, Prince Albert 6930
> > South Africa
>
>
>
>
> --
> Brian Modra Land line: +27 23 5411 462
> Mobile: +27 79 183 8059
> 6 Jan Louw Str, Prince Albert, 6930
> Postal: P.O. Box 2, Prince Albert 6930
> South Africa
You can be absolutely certain that the tar backup of a file that's changed is a complete waste of time, because it changed while you were copying it.

Steve.

On Wed, 16 Jan 2008 10:24:00 +0200 "Brian Modra" <epailty@googlemail.com> wrote:
> Sorry to be hammering this point, but I want to be totally sure its OK,
> rather than 5 months down the line attempt to recover, and it fails...
>
> Are you absolutely certain that the tar backup of the file that changed, is
> OK? (And that even if that file is huge, tar has managed to save the file as
> it was before it was changed - otherwise I'm afraid that the first part of
> the file is saved to tar, and then the file is modified, and the last part
> of the file is saved to tar from the point it was modified - and so
> therefore not consistent with the first part... And therefore the file has
> lost its integrity, so even a WAL restore won't help because the base files
> themselves are corrupt in the tar file?

--
Steve Holdoway <steve.holdoway@firetrust.com>
Brian Modra wrote:
> Sorry to be hammering this point, but I want to be totally sure its OK,
> rather than 5 months down the line attempt to recover, and it fails...
>
> Are you absolutely certain that the tar backup of the file that changed,
> is OK?

Have you considered testing it?

Sincerely,

Joshua D. Drake
Steve Holdoway <steve.holdoway@firetrust.com> writes:
> You can be absolutely certain that the tar backup of a file that's changed is a complete waste of time. Because it changed while you were copying it.

That is, no doubt, the reasoning that prompted the gnu tar people to make it do what it does, but it has zero to do with reality for Postgres' usage in PITR base backups. What we care about is consistency on the page level: as long as each page of the backed-up file correctly represents *some* state of that page while the backup was in progress, everything is okay, because replay of the WAL log will correct any pages that are out-of-date, missing, or shouldn't be there at all. And Postgres always writes whole pages. So as long as write() and read() are atomic --- which is the case on all Unixen I know of --- everything works.

(Thinks for a bit...) Actually I guess there's one extra assumption in there, which is that tar must issue its reads in multiples of our page size. But that doesn't seem like much of a stretch.

regards, tom lane
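The page-level argument above can be illustrated with a toy model. This is only a sketch under loud assumptions, not how PostgreSQL is implemented: a "page" is a single integer, every WAL record carries the complete new contents of one page (a simplification), and the names `simulate_online_backup`, `PAGE_COUNT` and the write pattern are invented for the illustration. The copy reads one whole page at a time while writes keep landing, so the backup is a mix of old and new pages, yet replaying the WAL recorded since the copy began reproduces the live file.

```python
import random

PAGE_COUNT = 8  # hypothetical size of the toy data file, in pages


def simulate_online_backup(seed=0):
    """Copy a file page by page while it is being written, then replay WAL.

    Returns (backup, restored, live) as lists of page contents.
    """
    rng = random.Random(seed)
    live = [0] * PAGE_COUNT  # the live data file; each entry is one whole page
    wal = []                 # (page_number, new_contents), logged before each write
    backup = []

    for page_no in range(PAGE_COUNT):
        # The copy reads one whole page atomically, so it captures *some*
        # valid state of that page, never half of one.
        backup.append(live[page_no])
        # Between reads, the "database" keeps writing whole pages anywhere
        # in the file, including pages that were already copied.
        for _ in range(rng.randint(0, 3)):
            target = rng.randrange(PAGE_COUNT)
            new_contents = live[target] + 1
            wal.append((target, new_contents))
            live[target] = new_contents

    # Recovery: start from the mixed-state backup and replay, in order,
    # every WAL record written since the backup started.
    restored = list(backup)
    for target, new_contents in wal:
        restored[target] = new_contents

    return backup, restored, live


if __name__ == "__main__":
    backup, restored, live = simulate_online_backup(seed=1)
    print("backup  :", backup)
    print("restored:", restored)
    print("live    :", live)
```

In this model the restored file always equals the live file: a page that was written during the copy is overwritten by its last WAL record, and a page that was never written is unchanged in the backup. The model breaks down exactly where the thread says it would: if a read could return half of a page, or if WAL from the start of the copy were missing.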
On Wednesday, 16 January 2008, Tom Lane wrote:
> (Thinks for a bit...) Actually I guess there's one extra assumption in
> there, which is that tar must issue its reads in multiples of our page
> size. But that doesn't seem like much of a stretch.

There is something about that here:

http://www.gnu.org/software/tar/manual/html_node/tar_149.html#SEC149

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes:
> On Wednesday, 16 January 2008, Tom Lane wrote:
>> (Thinks for a bit...) Actually I guess there's one extra assumption in
>> there, which is that tar must issue its reads in multiples of our page
>> size. But that doesn't seem like much of a stretch.

> There is something about that here:
> http://www.gnu.org/software/tar/manual/html_node/tar_149.html#SEC149

AFAICT that's talking about the I/O chunk size *on the archive file*. It doesn't say anything specific about the chunk size on the file side.

regards, tom lane
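For readers who want to see what "reads in multiples of the page size" would look like on the file side, here is a minimal sketch of a copy loop that does so. It says nothing about what tar itself does, which the thread leaves open. The function name `copy_in_page_multiples` and the `pages_per_read` parameter are invented for the example, and 8192 bytes is assumed as the page size (PostgreSQL's default block size; a server built with a different block size would need a different value).

```python
PAGE_SIZE = 8192  # assumed: PostgreSQL's default block size


def copy_in_page_multiples(src_path, dst_path, pages_per_read=16):
    """Copy src_path to dst_path, requesting a whole number of pages per read.

    Returns the number of bytes copied.
    """
    chunk = PAGE_SIZE * pages_per_read
    copied = 0
    # buffering=0 gives an unbuffered file, so each read() below maps to one
    # read request of `chunk` bytes rather than to Python's internal buffer size.
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk)
            if not block:
                break
            dst.write(block)
            copied += len(block)
    return copied
```

Because every request is a whole number of pages and starts at offset zero, each request begins on a page boundary as long as the earlier reads were not cut short, which for regular files normally happens only at end of file.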
Brian Modra wrote:
> Sorry to be hammering this point, but I want to be totally sure its
> OK, rather than 5 months down the line attempt to recover, and it fails...
>
> Are you absolutely certain that the tar backup of the file that
> changed, is OK? (And that even if that file is huge, tar has managed
> to save the file as it was before it was changed - otherwise I'm
> afraid that the first part of the file is saved to tar, and then the
> file is modified, and the last part of the file is saved to tar from
> the point it was modified - and so therefore not consistent with the
> first part... And therefore the file has lost its integrity, so even a
> WAL restore won't help because the base files themselves are corrupt
> in the tar file?

Not sure if the answers you got answered your question or not. Here's my take:

1) If the database is not running, tar works fine.

2) If the database is running, you can ONLY use tar if you also use WAL archiving since the database will not only need the tar files, which will be inconsistent, but also the WAL files (in your $PGDATA/pg_xlog) in order to recover from those inconsistencies. I find this is best if you are creating a warm standby that is keeping a backup database in sync with a primary.

3) If the database is running, use pg_dump to create a consistent backup.

4) No matter what, as previously mentioned, you should test your backup procedures to ensure you can reliably restore.

Good luck,
David
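Case 2 in the list above can be sketched as a small driver script. Treat it as an outline under assumptions rather than a tested procedure: it assumes WAL archiving (`archive_command`) is already configured and working, that `psql` can connect without extra options, and that tar is GNU tar 1.16 or later, which exits with status 1 when a file changed while being read and 2 on a fatal error. `pg_start_backup()` and `pg_stop_backup()` are the function names from the 8.2-era documentation linked in this thread. The Python function names, paths and the `label` value are invented for the example.

```python
import subprocess


def base_backup_commands(data_dir, archive_path, label="base_backup"):
    """Return the three commands for an online base backup, in order."""
    return [
        ["psql", "-c", "SELECT pg_start_backup('%s');" % label],
        ["tar", "-cf", archive_path, data_dir],
        ["psql", "-c", "SELECT pg_stop_backup();"],
    ]


def tar_result_is_acceptable(returncode):
    """Accept 0 (clean) and 1 (a file changed while tar was reading it).

    Assumption: GNU tar >= 1.16 exit codes. Other tar versions may not
    distinguish this warning from a real failure.
    """
    return returncode in (0, 1)


def run_base_backup(data_dir, archive_path, label="base_backup"):
    start_cmd, tar_cmd, stop_cmd = base_backup_commands(data_dir, archive_path, label)
    subprocess.run(start_cmd, check=True)
    try:
        result = subprocess.run(tar_cmd)
        if not tar_result_is_acceptable(result.returncode):
            raise RuntimeError("tar failed with exit status %d" % result.returncode)
    finally:
        # Always leave backup mode, even if tar failed.
        subprocess.run(stop_cmd, check=True)
```

A call such as `run_base_backup("/var/lib/pgsql/data", "/backup/base.tar")` (both paths hypothetical) would then produce the base backup. Per point 4, the result is only trustworthy once a restore from it, plus the archived WAL, has actually been tried.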
Hi, Brian

We have been doing PITR backups since the feature first became available in PostgreSQL. We first used tar; then, because of the dreadful warning emitted by tar (which made us doubt that it was actually archiving that particular file), we decided to try cpio, which emits much the same warnings, though less verbosely, so I think that tar will work as well (we never bothered going back to tar, mostly through laziness, so I can personally say that it works.)

Actually I have reason to believe you can use any series of OS commands that create copies or archives of the files, as long as those commands don't exit prematurely on warnings.

The important thing is to start archiving the WAL files *prior* to the first OS backup, or you will end up with an unusable database.

We have actually tested and used recovered databases with this scheme. We use WAL archiving to replicate a warm standby database, which we have failed over to (and failed back from) many times, and I've had to do an actual PITR recovery to recover several tables that got accidentally deleted by bad procedures/code/brain-burned DBAs :)

Brian Modra wrote:
> Sorry to be hammering this point, but I want to be totally sure its OK,
> rather than 5 months down the line attempt to recover, and it fails...
>
> Are you absolutely certain that the tar backup of the file that changed,
> is OK? (And that even if that file is huge, tar has managed to save the
> file as it was before it was changed - otherwise I'm afraid that the
> first part of the file is saved to tar, and then the file is modified,
> and the last part of the file is saved to tar from the point it was
> modified - and so therefore not consistent with the first part... And
> therefore the file has lost its integrity, so even a WAL restore won't
> help because the base files themselves are corrupt in the tar file?
On Wed, 16 Jan 2008 10:19:12 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Steve Holdoway <steve.holdoway@firetrust.com> writes:
> > You can be absolutely certain that the tar backup of a file that's changed is a complete waste of time. Because it changedwhile you were copying it.
>
> That is, no doubt, the reasoning that prompted the gnu tar people to
> make it do what it does, but it has zero to do with reality for
> Postgres' usage in PITR base backups. What we care about is consistency
> on the page level: as long as each page of the backed-up file correctly
> represents *some* state of that page while the backup was in progress,
> everything is okay, because replay of the WAL log will correct any pages
> that are out-of-date, missing, or shouldn't be there at all. And
> Postgres always writes whole pages. So as long as write() and read()
> are atomic --- which is the case on all Unixen I know of --- everything
> works.
>
> (Thinks for a bit...) Actually I guess there's one extra assumption in
> there, which is that tar must issue its reads in multiples of our page
> size. But that doesn't seem like much of a stretch.
>
> regards, tom lane

That's OK for the WAL logs, but what about the initial archive - the recovery's got to start somewhere...
On 17/01/2008, at 4:42 AM, Tom Arthurs wrote:
> The important thing is to start archiving the WAL files *prior* to
> the first OS backup, or you will end up with an unusable data base.

Why does the recovery need WAL files from before the backup?

Tom
Tom Davies <tgdavies@gmail.com> writes:
> On 17/01/2008, at 4:42 AM, Tom Arthurs wrote:
>> The important thing is to start archiving the WAL files *prior* to
>> the first OS backup, or you will end up with an unusable data base.

> Why does the recovery need WAL files from before the backup?

It doesn't, but there's no reasonable way to start both processes at exactly the same instant, so the standard advice is to start archiving first.

regards, tom lane
On Jan 16, 2008 4:56 PM, Tom Davies <tgdavies@gmail.com> wrote:
>
> On 17/01/2008, at 4:42 AM, Tom Arthurs wrote:
> > The important thing is to start archiving the WAL files *prior* to
> > the first OS backup, or you will end up with an unusable data base.
>
> Why does the recovery need WAL files from before the backup?

It's a timeline thing. The database is coherent at time x1. The WAL archive starts at point x0 and, moving forward, at some point matches up with it. You run pg_start_backup(), which tells pgsql you're starting your backup at point x1, and you start the backup. When it finishes at x2, you have a backup of the pgsql datastore that's a mix of what you had at x1 when you started and what you had at x2 when you stopped. You apply the WAL from x0 forward to, say, x3, and it conveniently rewrites the datastore to be coherent.

If your WAL started at some point between x1 and x2, you might have some data in the backup that the WAL wouldn't write over, but that is incoherent with what you'd get at point x3. So some pages would be out of date, because your WAL doesn't go back far enough.
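The timeline argument reduces to a simple coverage condition, sketched below. The positions are abstract numbers standing in for the x0..x3 points used in the message above, not real WAL locations, and the function name `wal_covers_backup` is invented for the illustration.

```python
def wal_covers_backup(archive_start, archive_end, backup_start, backup_stop):
    """True if the archived WAL spans the whole interval the copy was running.

    All four arguments are positions on one abstract timeline, like the
    x0..x3 points in the message above.
    """
    return archive_start <= backup_start and archive_end >= backup_stop


# Archiving started at x0=0, the backup ran from x1=1 to x2=2, WAL is kept to x3=3.
print(wal_covers_backup(0, 3, 1, 2))    # usable: WAL spans the whole copy
# Archiving only started at 1.5, part-way through the copy.
print(wal_covers_backup(1.5, 3, 1, 2))  # not usable: early changes are missing
```

Starting archiving before the backup simply guarantees `archive_start <= backup_start` without having to time the two events against each other, at the cost of a few extra WAL files.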
If you don't start archiving log files, your first backup won't be valid -- well I suppose you could do it the hard way and start the backup and the log archiving at exactly the same time (can't picture how to time that), but the point is you need the current log when you kick off the backup.

If you kick off archiving first, you are assured of a valid backup (when the recovery is done.) You may get some extra log files that way, but better too many than too few. (been there, done that.)

Tom Davies wrote:
>
> On 17/01/2008, at 4:42 AM, Tom Arthurs wrote:
>> The important thing is to start archiving the WAL files *prior* to the
>> first OS backup, or you will end up with an unusable data base.
>
> Why does the recovery need WAL files from before the backup?
>
> Tom