Discussion: Planning hot/live backups?
The owners of the animal hospital where I work want to consider live/hot backups throughout the day so we're less likely to lose a whole day of transactions. We use PostgreSQL 8.0.15. We currently do 3 AM backups, using pg_dumpall, to a file when there is very little activity. The hospital is happy with the overall performance of the veterinary application running on PostgreSQL.

I know that doing a mid-day backup when up to 60 computers (consistently 35-40) are accessing client/patient information will cause some frustration, and I understand there needs to be a balance between performance and backing up current records. While I know that not all situations are the same, I am hoping others can share the performance latency they have experienced when doing backups during the day and/or planning for a cluster (or other types of redundancy). My animal hospital operates 24x7 and is in the south part of the San Francisco Bay Area.

Outside of sharing your experiences/input with me, I would not mind if you/your company do this type of consulting offline.

Thank you.

Steve
I back up around 10 GB of data every half hour using pg_dump. I don't back up the entire database at once; instead I back up at the schema namespace level, but I do all of them every half hour, each schema namespace consecutively. It takes four minutes, including the time to copy the files to the backup server. I also run VACUUM FULL ANALYZE once a day.

My system is up 24/7 as well. I don't back up in the middle of the night, since there is so little to back up then, though I could. I am able to keep more backups by not running them when there are only a handful of transactions.

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
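Lance's half-hourly, per-schema routine can be sketched as a small shell script. The database name, schema list, paths, and backup host below are illustrative placeholders, not his actual setup:

```shell
#!/bin/sh
# Dump each schema namespace consecutively, then ship the dumps to the
# backup server. All names, paths, and hosts here are placeholders.
DB=hospital
SCHEMAS="public billing patients"      # adjust to your schema namespaces
STAMP=$(date +%Y%m%d-%H%M)
OUTDIR=/var/backups/pg

for s in $SCHEMAS; do
    # -n restricts the dump to a single schema; -Fc produces the
    # compressed custom format, restorable with pg_restore
    pg_dump -n "$s" -Fc -f "$OUTDIR/${DB}_${s}_${STAMP}.dump" "$DB"
done

# Copy this run's dumps to the backup host
scp "$OUTDIR"/*_"$STAMP".dump backup-host:/srv/pg-backups/
```

Run from cron every 30 minutes. One caveat: each schema is dumped in its own transaction, so the dumps are individually consistent but not guaranteed consistent with each other.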
Steve Poe wrote:
> The owners of the animal hospital where I work want to consider live/hot
> backups throughout the day so we're less likely to lose a whole day of
> transactions. We use PostgreSQL 8.0.15. We do 3 AM backups, using
> pg_dumpall, to a file when there is very little activity.

You probably want to look into PITR (point-in-time recovery): you can have a constant ongoing backup of your data and never lose more than a few minutes of data. The overhead isn't all that big, especially if you are shipping the log files to a separate server.
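For reference, the WAL-shipping half of PITR is configured with a single `archive_command` setting in postgresql.conf; the host and path below are placeholders:

```
# postgresql.conf -- an illustrative WAL archiving setup.
# %p expands to the WAL segment's path, %f to its file name. The command
# must exit 0 only when the copy truly succeeded; otherwise PostgreSQL
# keeps the segment and retries it later.
archive_command = 'rsync -a %p backup-host:/srv/wal_archive/%f'
```

The server then hands each 16 MB WAL segment to this command as it fills, so the separate server accumulates a continuous stream of changes on top of your last base backup.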
Matthew T. O'Connor wrote:
> Steve Poe wrote:
>> The owners of the animal hospital where I work want to consider live/hot
>> backups throughout the day so we're less likely to lose a whole day of
>> transactions. We use PostgreSQL 8.0.15. We do 3 AM backups, using
>> pg_dumpall, to a file when there is very little activity.
>
> You probably want to look into PITR: you can have a constant ongoing
> backup of your data and never lose more than a few minutes of data.
> The overhead isn't all that big, especially if you are shipping the log
> files to a separate server.

I'll second that. PITR is IMHO the way to go, and I believe you'll be pleasantly surprised by how easy it is to set up. As always, test your backup strategy by restoring. Even better, make a point of periodically restoring production backups to a non-production system.

Paul
"Matthew T. O'Connor" <matthew@zeut.net> writes: > Steve Poe wrote: >> The owners of the animal hospital where I work at want to consider live/hot >> backups through out the day so we're less likely to lose a whole >> day of transaction. We use Postgresql 8.0.15. We do 3AM >> backups, using pg_dumpall, to a file when there is very little activity. > You probably want to look into PITR, you can have a constant ongoing > backup of your data and never lose more than a few minutes of data. The > overhead isn't all the big especially if you are shipping the log files > to a separate server. But note that you really need to update to a newer major release before depending on PITR. While 8.0 nominally has support for it, it's taken us several releases to really get the operational gotchas sorted out. regards, tom lane
Steve Poe wrote:
> At this point, I am just moving the pg_dumpall file to another server.
> Pardon my question: how would you 'ship the log files'?

[You should cc the mailing list so that everyone can benefit from the conversation.]

RTM: http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html
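The restore side described in that manual chapter boils down to: put the base backup in place, then tell the server where to find archived WAL via a recovery.conf file. A sketch under assumed paths (all placeholders):

```shell
#!/bin/sh
# Illustrative PITR restore on an 8.x-era server. With the server stopped,
# restore the base backup, then let startup replay the archived WAL.
rsync -a /srv/base-backup/ /var/lib/pgsql/data/

# The base backup typically excludes pg_xlog; recreate its skeleton
mkdir -p /var/lib/pgsql/data/pg_xlog/archive_status

# recovery.conf tells the startup process how to fetch archived segments;
# PostgreSQL renames it to recovery.done when replay finishes
cat > /var/lib/pgsql/data/recovery.conf <<'EOF'
restore_command = 'cp /srv/wal_archive/%f "%p"'
EOF

pg_ctl -D /var/lib/pgsql/data start
```

This is exactly why periodic restore drills matter: the first time you create a recovery.conf should not be during an outage.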
Tom,

So, are you saying we need to get to at least 8.1.x before considering PITR for a production environment? Unfortunately, the vendor/supplier of our veterinary application does not support higher versions, so we would be proceeding "at our own risk". Is there anything else we can do with the 8.0.15 version?

Steve

On Mon, Mar 24, 2008 at 3:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But note that you really need to update to a newer major release before
> depending on PITR. While 8.0 nominally has support for it, it's taken
> us several releases to really get the operational gotchas sorted out.
>
> 			regards, tom lane
Hi!

Going a bit off topic, but one quick question: to avoid storing GBs of WAL files that would probably take a lot of time to replay, how can the backup be "reset"? I suspect it's something like stopping WAL archiving, taking a new base backup, and restarting archiving, but I've never done it (I have been using SQL dumps), so...

Yours,

Miguel Arroz

On 2008/03/24, at 22:28, Matthew T. O'Connor wrote:
> Steve Poe wrote:
>> At this point, I am just moving the pg_dumpall file to another server.
>> Pardon my question: how would you 'ship the log files'?
>
> [You should cc the mailing list so that everyone can benefit from the
> conversation.]
>
> RTM: http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html

Miguel Arroz
http://www.terminalapp.net
http://www.ipragma.com
Miguel Arroz wrote:
> Going a bit off topic, but one quick question: to avoid storing GBs of
> WAL files that would probably take a lot of time to replay, how can the
> backup be "reset"? I suspect it's something like stopping WAL archiving,
> taking a new base backup, and restarting archiving, but I've never done
> it (I have been using SQL dumps), so...

Basically, you only need the WAL logs generated since the last time you made a base backup of your data directory. We do a nightly rsync of our data directory and then purge the old WAL files that are no longer needed.
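Miguel's intuition is close, except that archiving does not need to stop. A hedged sketch of the cycle Matthew describes, with placeholder paths (the purge rule at the end is my assumption, not his script):

```shell
#!/bin/sh
# Take a fresh base backup while the cluster stays live, then purge WAL
# segments that predate it. All paths here are illustrative placeholders.
psql -c "SELECT pg_start_backup('nightly_base');"
rsync -a --exclude=pg_xlog /var/lib/pgsql/data/ /srv/base-backup/
psql -c "SELECT pg_stop_backup();"

# WAL archived before the new base backup started is no longer needed to
# recover from it. GNU find shown; before deleting, verify the cutoff
# against the .backup history file pg_stop_backup writes into the archive.
find /srv/wal_archive -type f ! -newer /srv/base-backup/backup_label -delete
```

Recovery from yesterday's base backup then only ever has to replay at most one day's worth of WAL, which keeps restore times bounded.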