Обсуждение: backup database by cloning itself
Hello, my database is not very big so I want to adopt this backup strategy: I want to clone my database every 1 hour to another database 'currenttime_mydatabase' in order to have 24 backup a day, overwriting the yesterday backups by today-same-time backups. This is good for me because I have all the backups readily available to be read by my program (opening the backup read only). This is a very important for my needs. I'm writing a script run by cron each hour to do accomplish the backup task. My target is to have the backup operation not affecting the users, so I want to be able to copy a database even if the database is used by someone. Can I use CREATE DATABASE my_backup_database TEMPLATE current_database? Is there a better way to get what I need? Thanks, Filippo
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/13/07 07:54, filippo wrote: > Hello, > > my database is not very big so I want to adopt this backup strategy: > > I want to clone my database every 1 hour to another > database 'currenttime_mydatabase' in order to have 24 backup a day, > overwriting the yesterday backups by today-same-time backups. > > This is good for me because I have all the backups readily available > to be read by my program (opening the backup read only). This is a > very important for my needs. There's almost definitely a better way to do what you want to do. What benefit are you trying to obtain by creating 720 almost identical databases per month? > I'm writing a script run by cron each hour to do accomplish the backup > task. > > My target is to have the backup operation not affecting the users, so > I want to be able to copy a database even if the database is used by > someone. > > Can I use > CREATE DATABASE my_backup_database TEMPLATE current_database? > > Is there a better way to get what I need? Have you tried pg_dump? -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF0rtOS9HxQb37XmcRAtUxAKDWLK7x3uDGxwni47Y+o1yJsHXOzACg4XYu ik9TtDFb6DJ+uZllXxahSMs= =zcis -----END PGP SIGNATURE-----
On Tue, 2007-02-13 at 07:54, filippo wrote: > Hello, > > my database is not very big so I want to adopt this backup strategy: > > I want to clone my database every 1 hour to another > database 'currenttime_mydatabase' in order to have 24 backup a day, > overwriting the yesterday backups by today-same-time backups. > > This is good for me because I have all the backups readily available > to be read by my program (opening the backup read only). This is a > very important for my needs. > > I'm writing a script run by cron each hour to do accomplish the backup > task. > > My target is to have the backup operation not affecting the users, so > I want to be able to copy a database even if the database is used by > someone. > > Can I use > CREATE DATABASE my_backup_database TEMPLATE current_database? > > Is there a better way to get what I need? Create database ain't gonna work, cause it needs a database with no users connected. You could do: dropdb hour_13; createdb hour_13 pg_dump masterdb | psql hour_13 with the number after hour being a var you set every hour when you run it.
Scott Marlowe <smarlowe@g2switchworks.com> writes:
> On Tue, 2007-02-13 at 07:54, filippo wrote:
>> my database is not very big so I want to adopt this backup strategy:
>> I want to clone my database every 1 hour  to another
>> database 'currenttime_mydatabase' in order to have 24 backup a day,
>> overwriting the yesterday backups by today-same-time backups.
>> Can I use
>> CREATE DATABASE my_backup_database TEMPLATE current_database?
> Create database ain't gonna work, cause it needs a database with no
> users connected.
There's a more serious objection, which is that storing a duplicate
database under the same postmaster doesn't give you an independent copy.
If something bad happens to pg_clog or pg_global, *all* your backups may
be rendered useless.
Now if your purpose in making the backups is only to protect against
user errors, and not any sort of hardware failure or Postgres bug,
maybe this isn't an issue.  But it's not what I'd call a backup.
            regards, tom lane
			
		-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/14/07 12:41, Tom Lane wrote: > Scott Marlowe <smarlowe@g2switchworks.com> writes: >> On Tue, 2007-02-13 at 07:54, filippo wrote: >>> my database is not very big so I want to adopt this backup strategy: >>> I want to clone my database every 1 hour to another >>> database 'currenttime_mydatabase' in order to have 24 backup a day, >>> overwriting the yesterday backups by today-same-time backups. >>> Can I use >>> CREATE DATABASE my_backup_database TEMPLATE current_database? > >> Create database ain't gonna work, cause it needs a database with no >> users connected. > > There's a more serious objection, which is that storing a duplicate > database under the same postmaster doesn't give you an independent copy. > If something bad happens to pg_clog or pg_global, *all* your backups may > be rendered useless. > > Now if your purpose in making the backups is only to protect against > user errors, and not any sort of hardware failure or Postgres bug, > maybe this isn't an issue. But it's not what I'd call a backup. Maybe his real goal "all the backups readily available to be read by my program (opening the backup read only)" is to have a historical record of what certain records looked like in the past. There are other ways of doing that, though. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF01oIS9HxQb37XmcRAqYQAKDoSNb76asUadv9InNXroshleKZEQCgl6w6 SwWu3841RN4B+GBBkxoa/DQ= =bdEY -----END PGP SIGNATURE-----
> Maybe his real goal "all the backups readily available to be read by > my program (opening the backup read only)" is to have a historical > record of what certain records looked like in the past. > > There are other ways of doing that, though. > If your speculation is right, perhaps the OP ought to explain a little more fully why he needs 24 snapshots a day, or indeed any at all. It seems to me that if you really want a historical record of what certain tables looked like in the past, it would be smarter and more accurate to create triggers, for each possible operation, that store the relevant details in an audit table including especially who made the edits and when. This strikes me as being much less work than developing code that processes so many backups. Cheers Ted
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/14/07 13:40, Ted Byers wrote: >> Maybe his real goal "all the backups readily available to be read by >> my program (opening the backup read only)" is to have a historical >> record of what certain records looked like in the past. >> >> There are other ways of doing that, though. >> > > If your speculation is right, perhaps the OP ought to explain a little > more fully why he needs 24 snapshots a day, or indeed any at all. > > It seems to me that if you really want a historical record of what > certain tables looked like in the past, it would be smarter and more > accurate to create triggers, for each possible operation, that store the > relevant details in an audit table including especially who made the > edits and when. This strikes me as being much less work than developing > code that processes so many backups. I dunno about that. We use triggers to populate log tables that get extracted, truncated then loaded into a reporting db every night. Because of the night time batch cycle, there is no quiescent time to do this, so we have 2 log tables, and the triggers alternate which log table to insert into, depending on whether the day number is even or odd. That's in addition to the trigger logic to insert into history tables. It's a royal pain. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF03URS9HxQb37XmcRAq0RAJ4iAHtbst+Gq9QndTr36lErYUwSmgCg7dM/ luIRI+F9eqYqUoMz9VNNaNc= =NYOz -----END PGP SIGNATURE-----
> >> Maybe his real goal "all the backups readily available to be read by > >> my program (opening the backup read only)" is to have a historical > >> record of what certain records looked like in the past. What postgresql "time travel"? I have never used it, and it looks a little bit unmaintained, but it might be perfect with some tweaking: contrib/spi/README.timetravel I know that someone at SFPUG presented on using time travel last year, and she may have more info .
----- Original Message ----- From: "Ron Johnson" <ron.l.johnson@cox.net> To: <pgsql-general@postgresql.org> Sent: Wednesday, February 14, 2007 3:46 PM Subject: Re: [GENERAL] backup database by cloning itself >> >> It seems to me that if you really want a historical record of what >> certain tables looked like in the past, it would be smarter and more >> accurate to create triggers, for each possible operation, that store the >> relevant details in an audit table including especially who made the >> edits and when. This strikes me as being much less work than developing >> code that processes so many backups. > > I dunno about that. We use triggers to populate log tables that get > extracted, truncated then loaded into a reporting db every night. > Because of the night time batch cycle, there is no quiescent time to > do this, so we have 2 log tables, and the triggers alternate which > log table to insert into, depending on whether the day number is > even or odd. > > That's in addition to the trigger logic to insert into history tables. > > It's a royal pain. Sounds painful, but not as painful as trying to do the same thing with a suite of backups and client code to read the backups. Or maybe the latter task is easier than it looks at first glance. Doesn't the amount of pain depend on the amount of traffic your servers see, the capability of the hardware you can devote to the task, what you're doing with the history tables, &c.? When I've worked on this kind of problem, everything was in a single DB. The traffic, though, was low enough that for daily reports a simple view of the log selecting only the previous day's data, was sufficient, but only occasionally used. The principal reason for the log was to support accountability, to know who made changes, and when, and who knew what when. Historical reconstructions were required, e.g., only when something went awry and there was a need to know if a good decision was made based on bad data or a bad decision was made based on good data, or, e.g., during an audit of the business processes and decision support systems. Cheers, Ted
Here is the link to Elein's presentation: http://www.varlena.com/GeneralBits/Tidbits/tt.pdf > What [about] postgresql "time travel"? I have never used it, and it looks a > little bit unmaintained, but it might be perfect with some tweaking:
On 13 Feb, 14:54, "filippo" <filippo2...@virgilio.it> wrote: > My target is to have the backup operation not affecting the users, so > I want to be able to copy a database even if the database is used by > someone. I could use pg_dump/pg_restore. pg_dump doesn't have to have exclusive access to database to perform the operation. My only problem is that pg_dump create a backup on a file, the best to me whould be to have a perfect clone (users/ data etc) of original database ready to be used just after the cloning. Is it possible? Thanks, Filippo
On 14 Feb, 08:33, ron.l.john...@cox.net (Ron Johnson) wrote: > -----BEGIN PGP SIGNED MESSAGE----- > There's almost definitely a better way to do what you want to do. > > What benefit are you trying to obtain by creating 720 almost > identical databases per month? I only need the last 24, overwriting each day. In my case 8 are enough (8am to 24pm, one every 2 hours). The important thing is to be able to open each very simply (read only) without restore anything. Just tell my application (written be me), to point the basckup database instead of latest one. > Have you tried pg_dump? probably this is ok. pg_dump create a backup file, not a true clone db. I want a exact copy (data and users and everithing) with a different name (05-mydatabase 06-mydatabase ...)
13 Feb 2007 05:54:44 -0800, filippo <filippo2991@virgilio.it>: > Hello, > > my database is not very big so I want to adopt this backup strategy: > > I want to clone my database every 1 hour to another > database 'currenttime_mydatabase' in order to have 24 backup a day, > overwriting the yesterday backups by today-same-time backups. > > This is good for me because I have all the backups readily available > to be read by my program (opening the backup read only). This is a > very important for my needs. > > I'm writing a script run by cron each hour to do accomplish the backup > task. > > My target is to have the backup operation not affecting the users, so > I want to be able to copy a database even if the database is used by > someone. > > Can I use > CREATE DATABASE my_backup_database TEMPLATE current_database? no. database used as template must not be accessed during copy > > Is there a better way to get what I need? you can script this: pg_dump sourcedb | psql targetdb -- Filip Rembiałkowski
On Feb 14, 2:14 am, "filippo" <filippo2...@virgilio.it> wrote: > On 13 Feb, 14:54, "filippo" <filippo2...@virgilio.it> wrote: > > > My target is to have the backup operation not affecting the users, so > > I want to be able to copy a database even if the database is used by > > someone. > > I could use pg_dump/pg_restore. pg_dump doesn't have to have exclusive > access to database to perform the operation. My only problem is that > pg_dump create a backup on a file, the best to me whould be to have a > perfect clone (users/ data etc) of original database ready to be used > just after the cloning. Is it possible? > > Thanks, > > Filippo Well, I could see you writing a client application that creates a clone by first recreating all the schemas in your database and then copying the data to the clone, and probably quite a bit more, In such a case, since you have absolute control over your client code, you can do anything you want. I am not sure, though, that that is the best use of your time and hardware resources, especially if all you're after is a backup. Just think of all the overhead involved in creating a new clone, and everything that implies, every hour. But why not further explore your backup options if all you're concerned about is a reliable backup. You may find "23.3. On-line backup and point-in-time recovery (PITR)" in the postgresql documentation useful. You haven't given any information about why it might not be appropriate in your situation. If you're really doing what it looks to me like you're doing, then you may be in the beginning stages of reinventing Postgresql's PITR capability. The builtin support for PITR in Postgresql strikes me as sufficient for what you say you need. If you require more, which would imply you want more than the simple backup you say you're after, then defining a suitable suite of triggers and audit tables may serve. Neither should adversely affect your users. especially if your "database is not very big ". HTH Ted
filippo wrote: > On 14 Feb, 08:33, ron.l.john...@cox.net (Ron Johnson) wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > > There's almost definitely a better way to do what you want to do. > > > > What benefit are you trying to obtain by creating 720 almost > > identical databases per month? > > I only need the last 24, overwriting each day. In my case 8 are enough > (8am to 24pm, one every 2 hours). The important thing is to be able to > open each very simply (read only) without restore anything. Just tell > my application (written be me), to point the basckup database instead > of latest one. I wonder if you could have a PITR warm standby instead, and every hour stop it and backup that. This is very hand-wavy, you're expected to fill in the details :-) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/14/07 01:14, filippo wrote: > On 13 Feb, 14:54, "filippo" <filippo2...@virgilio.it> wrote: > >> My target is to have the backup operation not affecting the users, so >> I want to be able to copy a database even if the database is used by >> someone. > > I could use pg_dump/pg_restore. pg_dump doesn't have to have exclusive > access to database to perform the operation. My only problem is that > pg_dump create a backup on a file, the best to me whould be to have a > perfect clone (users/ data etc) of original database ready to be used > just after the cloning. Is it possible? pg_dump | pg_restore. But you still haven't told us why you need copies of the database every 2 hours. What is the "business need" you are trying to solve. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF1MGpS9HxQb37XmcRAlYvAJ92Hl9wI/7mb/zOh1xsZwRHR8uDvQCdFKE2 SIdsgnhecZKpEUMWYARLWWA= =lqeo -----END PGP SIGNATURE-----