Discussion: PITR backup to Novell Netware file server
We have a database in each Wisconsin county which is part of the official court record for the circuit courts in that county. We are required to keep a backup in each county, such that we could recover a lost or corrupted database with what is in the county (as well as keeping at least four separate sources, each regularly confirmed as usable, at our central site). Currently, the only two boxes available besides the database server (in most counties) are a "utility server" (a Windows desktop-class machine used to push workstation images and other support tasks) and a Novell Netware file server.

We had to get something going quickly when we started rolling PostgreSQL out to the counties as a replacement for a commercial database product; the quick and (way too) dirty approach was to create a samba share on the utility server as the target of the archive_command and our base backups. We rsync from there back to the central site using the samba share, too. We frequently lose the mount points or they become unresponsive. We also frequently get corrupted files when they rsync across the WAN, so we want to move away from all use of samba.

The Netware server supports ssh, scp, and an rsync daemon. I don't see how the ssh implementation is helpful, though, since it just gets you to the Netware console -- you can't cat to a disk file through it, for example. (At least not as far as we have been able to see.) It appears that the scp and rsync techniques both require the initial copy of the file to be saved on the database server itself, which we were hoping to avoid for performance reasons.

We could create ncp mounts on the database servers; but, frankly, we haven't had much better luck keeping those connected than we have with samba.

Has anyone else been through this and found a robust and reliable way to do PITR backups from a PostgreSQL database on one machine to a Netware file server on another?
Apparently we will be moving to a Linux-based implementation of Netware at some unspecified future date, at which point we will apparently be able to deal directly with the Linux layer. At that point, there are obvious, clean solutions; but we've got to have something reasonable until such date, which is most likely a few years off. -Kevin
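For reference, the quick-and-dirty setup described above amounts to something like the following postgresql.conf fragment. The mount point and directory names are invented for illustration; the original message doesn't give them:

```
# Hypothetical rendition of the current setup: each WAL segment is
# copied straight onto a samba mount of the utility server's share.
archive_command = 'cp %p /mnt/utility_share/wal_archive/%f'
```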
On Tue, Aug 07, 2007 at 06:29:35AM -0500, Kevin Grittner wrote:
> [snip]
> Has anyone else been through this and found a robust and reliable way to
> do PITR backups from a PostgreSQL database on one machine to a Netware file
> server on another?
>
> Apparently we will be moving to a Linux-based implementation of Netware at
> some unspecified future date, at which point we will apparently be able to
> deal directly with the Linux layer. At that point, there are obvious, clean
> solutions; but we've got to have something reasonable until such date, which
> is most likely a few years off.

Given your situation, I certainly wouldn't trust WAL files from the Samba server going back to your central site.

I think you've got 2 options here:

1) Write a script that copies to the local backup as well as the remote backup and call that script from archive_command. Make certain that the script returns a non-zero exit code if *either* copy operation fails.

2) Have archive_command copy to someplace on the database server, and have another process copy from there to both the local backup as well as the central backup.

I'd be inclined to go with 2, since it means that if you can hit at least one backup location you'll keep that backup as up-to-date as possible, whereas with 1, a failure of a mount point or inability to connect to the central location would mean no backup.

As for performance, just how hard are you pushing these machines? Copying a 16MB file that's already in memory isn't exactly an intensive operation...

--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
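Option 1 might look something like this minimal sketch. Everything here is illustrative, not anyone's actual setup: DEST_LOCAL and DEST_REMOTE are placeholder directories, and a real remote copy would use scp or an rsync push rather than a plain cp to a mount.

```shell
#!/bin/sh
# Sketch of option 1: archive a WAL segment to both destinations and
# return non-zero if *either* copy fails, so PostgreSQL holds onto the
# segment and retries.  DEST_LOCAL and DEST_REMOTE are hypothetical.
archive_wal() {
    wal_path=$1    # %p from archive_command: path to the segment
    wal_name=$2    # %f from archive_command: segment file name
    cp "$wal_path" "$DEST_LOCAL/$wal_name" || return 1
    cp "$wal_path" "$DEST_REMOTE/$wal_name" || return 1
    return 0
}
```

postgresql.conf would then invoke the script with something like archive_command = '/usr/local/bin/archive_wal %p %f' (path hypothetical).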
> >>> Decibel! <decibel@decibel.org> 08/07/07 1:28 PM >>>
> On Tue, Aug 07, 2007 at 06:29:35AM -0500, Kevin Grittner wrote:
> [snip]
>> Has anyone else been through this and found a robust and reliable way to
>> do PITR backups from a PostgreSQL database on one machine to a Netware file
>> server on another?
>>
>> Apparently we will be moving to a Linux-based implementation of Netware at
>> some unspecified future date, at which point we will apparently be able to
>> deal directly with the Linux layer. At that point, there are obvious, clean
>> solutions; but we've got to have something reasonable until such date, which
>> is most likely a few years off.
>
> Given your situation, I certainly wouldn't trust WAL files from the
> Samba server going back to your central site.

We don't. We feed them all into warm standbys as they arrive, which fail over to production when a gzip is corrupted. Then we gunzip all the files for that county to /dev/null to identify any other corrupted files (the corruptions seem to come in clusters) and delete all the bad ones. We restart the warm standby, let rsync try again, and are back in business. Due to our monitoring software, we get both a jabber popup and an email within a minute of when the failover occurs, so it's not down for long.

Keep in mind that when we go to the file server instead of the utility server, we will eliminate the samba shares, and I don't think that we'll see this sort of problem with either an rsync daemon or scp -- at least not at this order of magnitude.

> I think you've got 2 options here:
>
> 1) Write a script that copies to the local backup as well as the remote
> backup and call that script from archive_command. Make certain that the
> script returns a non-zero exit code if *either* copy operation fails.

Absolutely not an option. The management mandate is clear that a WAN failure which blocks the transfer of files back to the central site must not block the movement of files off of the database server to another box on its LAN.
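The corruption sweep Kevin describes (decompress everything to /dev/null and collect the failures) can be sketched roughly as follows; the function name and the archive directory layout are invented for illustration:

```shell
#!/bin/sh
# Sketch of the described check: try to decompress every archived
# .gz segment, discarding the data, and print the names of the files
# gunzip rejects so they can be deleted and re-fetched.
find_corrupt_segments() {
    archive_dir=$1    # hypothetical directory of gzipped WAL segments
    for f in "$archive_dir"/*.gz; do
        [ -e "$f" ] || continue            # no .gz files at all
        gunzip -c "$f" > /dev/null 2>&1 || echo "$f"
    done
}
```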
> 2) Have archive_command copy to someplace on the database server, and
> have another process copy from there to both the local backup as well as
> the central backup.

A possible option; although if the rsync daemon on the file server proves reliable, I don't see the benefit over having the archive command make a copy on the database server which flows to the file server and from the file server back to the central site. I'd rather add the load to the file server than the database server.

> As for performance, just how hard are you pushing these machines?

That varies tremendously with the county and the activity. Milwaukee County keeps a pretty steady load on the database during the business day, and even a moderately sized county can cause people to wait for the computer a bit when they have, for example, a weekly traffic court session or the auditors run a receivables audit report covering the prior year. I'll grant that with PostgreSQL these issues aren't as acute as with the commercial product it's replacing. We've replaced most of our big central databases and we're about half-way done rolling it out to the counties. We've gotten a lot of comments from end users on how much faster things are with the new system. :-)

> Copying a 16MB file that's already in memory isn't exactly an intensive
> operation...

That's true for the WAL files. The base backups are another story. We will normally have a database vacuum analyze between the base backup and the users being in there to care about performance, but that's not always the case -- sometimes jury trials go late into the night and could overlap with a base backup. And some judges put in a lot of late hours; although they don't tend to bang on the database very heavily, they hate to be made to wait.

Anyway, we're sort of resigned to making the local copy and using an rsync daemon on the file server to move it onward.
I was just hoping that someone had picked up on an option within the Novell Netware environment that we have missed. Thanks for your comments, -Kevin
On Tue, Aug 07, 2007 at 02:12:29PM -0500, Kevin Grittner wrote:
> > 2) Have archive_command copy to someplace on the database server, and
> > have another process copy from there to both the local backup as well as
> > the central backup.
>
> A possible option; although if the rsync daemon on the file server proves
> reliable, I don't see the benefit over having the archive command make a copy
> on the database server which flows to the file server and from the file server
> back to the central site. I'd rather add the load to the file server than the
> database server.

Yeah, if you can make that work reliably, then I agree it's probably better.

> > Copying a 16MB file that's already in memory isn't exactly an intensive
> > operation...
>
> That's true for the WAL files. The base backups are another story. We will
> normally have a database vacuum analyze between the base backup and the users
> being in there to care about performance, but that's not always the case --
> sometimes jury trials go late into the night and could overlap with a base
> backup. And some judges put in a lot of late hours; although they don't
> tend to bang on the database very heavily, they hate to be made to wait.

Ahh... well, that's something where rsync could actually help you since it allows you to put a bandwidth cap on it. Another option is that some OSes (FreeBSD for one) will respect process priority when it comes to scheduling IO as well, so if you nice the backup process it hopefully wouldn't impact the database as much.

--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
> >>> Decibel! <decibel@decibel.org> 08/07/07 4:51 PM >>>
> On Tue, Aug 07, 2007 at 02:12:29PM -0500, Kevin Grittner wrote:
> > > Copying a 16MB file that's already in memory isn't exactly an intensive
> > > operation...
> >
> > That's true for the WAL files. The base backups are another story.
>
> Ahh... well, that's something where rsync could actually help you since
> it allows you to put a bandwidth cap on it. Another option is that some
> OSes (FreeBSD for one) will respect process priority when it comes to
> scheduling IO as well, so if you nice the backup process it hopefully
> wouldn't impact the database as much.

Thanks for the suggestions. A new OS is not in the cards any time soon, but I think the --bwlimit option makes sense -- there's not a lot of point moving it from the database server to the file server faster than the WAN can take it, anyway. I suppose I could "nice" the rsync requester on the database side, too.

-Kevin
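The throttled, low-priority push being discussed might look something like the sketch below. The function name, the "fileserver" host, the "pgbackup" rsync module, and the bandwidth figure are all invented; rsync's --bwlimit takes KB/s, and nice lowers the sending process's priority:

```shell
#!/bin/sh
# Hypothetical throttled copy of a base backup toward a file server's
# rsync daemon.  nice lowers CPU priority on the sending side;
# --bwlimit (KB/s) keeps the transfer below what the WAN can absorb.
ship_base_backup() {
    backup_file=$1
    dest=$2    # e.g. rsync://fileserver/pgbackup/ (made-up name)
    nice -n 19 rsync --bwlimit=512 "$backup_file" "$dest"
}
```

In real use the destination would be the rsync daemon URL; the function works just as well with a local path, which is how it can be exercised without a Netware box.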
Kevin Grittner wrote:
> The Netware server supports ssh, scp, and an rsync daemon. I don't see how
> the ssh implementation is helpful, though, since it just gets you to the
> Netware console -- you can't cat to a disk file through it, for example.
> (At least not as far as we have been able to see.) It appears that the scp
> and rsync techniques both require the initial copy of the file to be saved
> on the database server itself, which we were hoping to avoid for performance
> reasons.

scp & rsync can't really deal well with stdin. However, you can accomplish something with ssh like the following (on Linux):

cat source_file | ssh remote_host "cat >/path/to/file"

The ssh command will pass everything from its stdin to the cat command on the remote host. You can take the input from any command you want that generates its output on stdout. There are a few caveats though:

1. I have no idea how this will work with Netware. I'd assume it would have the equivalent of "cat" but I have absolutely zero Netware knowledge.

2. I have no idea how robust this is with respect to error detection. My simple tests do return an error back to the calling shell if I don't have permission to create the file on the target server, but I'm not sure if there are any corner cases that may slip by unnoticed. Again, I'm also unsure how Netware will handle any errors.

3. This approach unconditionally clobbers the target file on the remote server. If something dies part-way through, the original file is gone and you will only have part of the new file on the remote server. There are no built-in backup or all-or-nothing capabilities like you have with rsync. You may be able to build a script or use shell operators like && and || to build and pass a compound command to ssh, but again I don't know what Netware can do along those lines.

Hope this helps.

Andrew
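One way to address caveat 3 on a POSIX remote end (no claim this works on Netware) is to stream into a temporary name and rename only after the whole stream has arrived, so a transfer that dies part-way never clobbers the existing file. In this sketch RSH stands in for the remote shell invocation; it defaults to a local shell purely so the sketch is runnable, and in real use it would be set to something like "ssh remote_host":

```shell
#!/bin/sh
# All-or-nothing variant of the cat-over-ssh pipe: write to dest.tmp
# on the far side and mv into place only if the stream completed.
# RSH defaults to a local shell here; with RSH="ssh some_host" the
# same compound command runs on the remote machine instead.
: "${RSH:=sh -c}"
safe_remote_copy() {
    src=$1
    dest=$2
    # mv is atomic within a filesystem, so readers see either the old
    # file or the complete new one, never a partial copy.
    cat "$src" | $RSH "cat > '$dest.tmp' && mv '$dest.tmp' '$dest'"
}
```

Whether the remote side honors && and reports failures back through ssh's exit code is exactly Andrew's caveat 2, so this would need testing against the actual target system.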
> Apparently we will be moving to a Linux-based implementation of Netware at
> some unspecified future date, at which point we will apparently be able to
> deal directly with the Linux layer. At that point, there are obvious, clean
> solutions; but we've got to have something reasonable until such date, which
> is most likely a few years off.

Netware supports NFS.

Sincerely,

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
>>> On Tue, Aug 7, 2007 at 10:31 PM, in message <46B93907.3060401@sprocks.gotdns.com>, Andrew Kroeger <andrew@sprocks.gotdns.com> wrote:
> Kevin Grittner wrote:
>> The Netware server supports ssh, scp, and an rsync daemon. I don't see how
>> the ssh implementation is helpful, though, since it just gets you to the
>> Netware console -- you can't cat to a disk file through it, for example.
>
> scp & rsync can't really deal well with stdin. However, you can
> accomplish something with ssh like the following (on Linux):
>
> cat source_file | ssh remote_host "cat >/path/to/file"

Right, that's what I was saying I couldn't see how to do with Netware, because the ssh just gets you to the Netware console. We have been using that technique a lot with Linux. I trust that error checking is covered by the ssh layer and the TCP layer on which it rides.

-Kevin