Обсуждение: Are file system level differential/incremental backups possible?
Is it possible to do a full file system level backup of the data directory, say once a week, and differentials or incrementals daily? I'm wondering if there are files that would normally be removed that a restore: Full then diff/inc would not remove and perhaps corrupt/confuse things. Process: Saturday: Full backup (reset archive bits) of data dir with database shutdown Sunday: Differential (don't reset archive bits) of data dir with database shutdown Monday: Differential (don't reset archive bits) of data dir with database shutdown Wednesday: Restore to test server using Saturday's Full and Monday's Differential. Obviously this works for regular files/file systems; however, I'm not sure this is a good method with postgresql as the resulting data dir *may* (?) contain extra files (or other issues)? Note: our database is 850GB (Windows 2008 R2 pg version 8.3.12)
As there's one file for each object, a single update on each would make you to copy the all the file again. I heard there was tool to make differentiel copy of a part of a file but I don't know if it's really efficient.
Anyway, a better way for you would be to do a regular backup (with pg_start_backup, copy and pg_stop_backup) and then use wal archive_command to keep the xlogs between 2 full backups.
Anyway, a better way for you would be to do a regular backup (with pg_start_backup, copy and pg_stop_backup) and then use wal archive_command to keep the xlogs between 2 full backups.
On Wed, Oct 12, 2011 at 11:30 PM, Bob Hatfield <bobhatfield@gmail.com> wrote:
Is it possible to do a full file system level backup of the data
directory, say once a week, and differentials or incrementals daily?
I'm wondering if there are files that would normally be removed that a
restore: Full then diff/inc would not remove and perhaps
corrupt/confuse things.
Process:
Saturday: Full backup (reset archive bits) of data dir with database shutdown
Sunday: Differential (don't reset archive bits) of data dir with
database shutdown
Monday: Differential (don't reset archive bits) of data dir with
database shutdown
Wednesday: Restore to test server using Saturday's Full and Monday's
Differential.
Obviously this works for regular files/file systems; however, I'm not
sure this is a good method with postgresql as the resulting data dir
*may* (?) contain extra files (or other issues)?
Note: our database is 850GB (Windows 2008 R2 pg version 8.3.12)
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
> Anyway, a better way for you would be to do a regular backup (with pg_start_backup, copy and pg_stop_backup) and then usewal archive_command to keep the xlogs between 2 full backups. Thanks Julien. Can pg_start/stop_backup() be used for regular full file system backups? All of the documentation I've read only refers to using those for warm standby/wal shipping methods.
On Thu, Oct 13, 2011 at 12:04 AM, Bob Hatfield <bobhatfield@gmail.com> wrote:
> Anyway, a better way for you would be to do a regular backup (with pg_start_backup, copy and pg_stop_backup) and then use wal archive_command to keep the xlogs between 2 full backups.Thanks Julien. Can pg_start/stop_backup() be used for regular full
file system backups? All of the documentation I've read only refers
to using those for warm standby/wal shipping methods.
Yes, and it's the only way to do a file backup without stopping the server. Careful, the command can last a while as it forces a checkpoint (see the doc for more details).
It's used for warm standby to create a copy of the server, before the wals that'll be generated can continue to restore it.
On October 12, 2011 03:04:30 PM Bob Hatfield wrote: > > Anyway, a better way for you would be to do a regular backup (with > > pg_start_backup, copy and pg_stop_backup) and then use wal > > archive_command to keep the xlogs between 2 full backups. > > Thanks Julien. Can pg_start/stop_backup() be used for regular full > file system backups? All of the documentation I've read only refers > to using those for warm standby/wal shipping methods. The base backup necessary to initialize a warm standby server is a full file system backup of the database, which can also be used for restores to any point in time after the base backup is completed, assuming you also have all the archived WAL files.
> The base backup necessary to initialize a warm standby server is a full file system backup of the database, which can alsobe used for restores to any point in time after the base backup is completed, assuming you also have all the archivedWAL files. Thanks to both of you. I currently use the base backup technique for use with a warm standby server but was not aware I could use that for full file system level backups as well. In fact, I currently run a warm standby server (created from a base backup every several weeks) as well as stop/start the pg process to do a full file system level copy each night. If I think about this a bit, I'm sure there's a more efficient way of doing this. (As well as may solve another problem I'm having (another post) with the standby server's database getting corrupt after stopping/starting the primary's pg process.)
Bob Hatfield wrote: > Is it possible to do a full file system level backup of the data > directory, say once a week, and differentials or incrementals daily? > > I'm wondering if there are files that would normally be removed that a > restore: Full then diff/inc would not remove and perhaps > corrupt/confuse things. If you drop or truncate a table between the full and the incremental backup, will that file be "resurrected"? Such resurrected files will not disturb PostgreSQL, but if you keep them around, you might end up with a lot of dead files if you have to restore a couple of times. Yours, Laurenz Albe
> If you drop or truncate a table between the full and the incremental backup, will that file be "resurrected"? > > Such resurrected files will not disturb PostgreSQL, but if you keep them around, you might end up with a lot of dead filesif you have to restore a couple of times. That makes sense, thank you. At one point, I was doing delta copies of the database: full one night, then only new/changed files the following nights -- I am pretty sure that didn't work and thought it was because the destination had extra files that pg had removed on the primary. I now use a robocopy /mir which removes files on the destination that aren't on the primary. This seems to work fine -- but it made me think that PostreSQL doesn't work with the former process.
On 10/13/2011 05:30 AM, Bob Hatfield wrote: > Is it possible to do a full file system level backup of the data > directory, say once a week, and differentials or incrementals daily? I'd love to be able to do this, but you can't do it usefully at a file-system level. There's too much churn in the data files for even a binary diff to be much use - and even if it were, the performance of it would be miserable. You *could* do a differential so long as it's a proper one that tracks file removals as well as additions/changes. It'd be pretty pointless though as you wouldn't save much if any storage. I've looked into database-level diffs, but from what I can find out it seems that PostgreSQL's MVCC system doesn't store enough information to produce a differential dump, either. There's no way to detect and record tuples that were deleted then vacuumed away since the last backup was taken, so a trigger-based or WAL-based system is necessary. I'd love a way to "collapse" or merge a set of WAL segments into a minimal diff that only contained just enough information to get from the start to end state of the series of WAL segments, rather than all the churn in-between. This would be great for storing longer PITR histories (but more coarsely) and being able to do faster restores. Unfortunately given how scattered writes are I doubt it'd actually be possible or any faster if it was. SQL-level differentials would be great, though. -- Craig Ringer
>> Is it possible to do a full file system level backup of the data >> directory, say once a week, and differentials or incrementals daily? > > I'd love to be able to do this, but you can't do it usefully at a > file-system level. There's too much churn in the data files for even a > binary diff to be much use - and even if it were, the performance of it > would be miserable. Actually, for us, a full backup is currently about 950GB which takes about 24 hours and a diff several days later is only 150GB and takes two hours. (It takes significantly less time not only becuase it's less data but also because the diff job doesn't have to compete with other backup jobs (since it's faster due to less data).) I currently stop pg, robocopy (rsync) the changes to our DRS server, restart pg -- then backup the data on the DRS server later. This allows us to have our database back up within 30 minutes. I've tested the database after doing a restore: full + diff and it seems fine (both viewing with the application as well as running a reindex on the entire db). I think we'll stick with this process with the understanding that if a disaster struck, and we needed to do a full + diff restore, that we may have a few files that are there that that aren't needed.
On 10/20/2011 02:00 AM, Bob Hatfield wrote: >>> Is it possible to do a full file system level backup of the data >>> directory, say once a week, and differentials or incrementals daily? >> >> I'd love to be able to do this, but you can't do it usefully at a >> file-system level. There's too much churn in the data files for even a >> binary diff to be much use - and even if it were, the performance of it >> would be miserable. > > Actually, for us, a full backup is currently about 950GB which takes > about 24 hours and a diff several days later is only 150GB and takes > two hours. (It takes significantly less time not only becuase it's > less data but also because the diff job doesn't have to compete with > other backup jobs (since it's faster due to less data).) Wow. I'm very surprised by that, but it's good to hear. I would've expected a much poorer result, and have never seen anything like that good a result when I've tested with my (much smaller) data sets. Perhaps a binary diff based backup mechanism that's aware of Pg's structure and can skip holes, etc is worth looking into after all. -- Craig Ringer
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 >> Is it possible to do a full file system level backup of the data >> directory, say once a week, and differentials or incrementals daily? ... > taken, so a trigger-based or WAL-based system is necessary. ... > SQL-level differentials would be great, though. FWIW, Bucardo 5 has a 'flatfile' mode that will output the deletion and insert statements to get a table from an old state to a new one. Not quite what is being asked for here, but could be useful for similar things as well as for auditing. - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201110221339 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAk6i/9oACgkQvJuQZxSWSsjbkgCfRIIstVLsw6OeMUJZ2iL2if4Z 4hQAoIpOZXK/OseM45FAUmi8vcESpRif =ed0H -----END PGP SIGNATURE-----