Re: PITR backup to Novell Netware file server

From Kevin Grittner
Subject Re: PITR backup to Novell Netware file server
Date
Msg-id 46B87DCD.EE98.0025.0@wicourts.gov
In response to Re: PITR backup to Novell Netware file server  (Decibel! <decibel@decibel.org>)
Responses Re: PITR backup to Novell Netware file server  (Decibel! <decibel@decibel.org>)
List pgsql-admin
> >>> Decibel! <decibel@decibel.org> 08/07/07 1:28 PM >>>
> On Tue, Aug 07, 2007 at 06:29:35AM -0500, Kevin Grittner wrote:
>> We have a database in each Wisconsin county which is part of the official
>> court record for the circuit courts in that county.  We are required to keep
>> a backup in each county, such that we could recover a lost or corrupted
>> database with what is in the county (as well as keeping at least four
>> separate sources, each regularly confirmed as usable, at our central site).
>> Currently, the only two boxes available besides the database server (in most
>> counties) are a "utility server" (a Windows desktop class machine used to
>> push workstation images and other support tasks) and a Novell Netware file
>> server.
>>
>> We had to get something going quickly when we started rolling PostgreSQL out
>> to the counties as a replacement of a commercial database product; the quick
>> and (way too) dirty approach was to create a samba share on the utility
>> server for the target of the archive_command and our base backups.  We rsync
>> from there back to the central site using the samba share, too.  We
>> frequently lose the mount points or they become unresponsive.  We also
>> frequently have corrupted files when they rsync across the WAN, so we want
>> to move away from all use of samba.
>>
>> The Netware server supports ssh, scp, and an rsync daemon.  I don't see how
>> the ssh implementation is helpful, though, since it just gets you to the
>> Netware console -- you can't cat to a disk file through it, for example.
>> (At least not as far as we have been able to see.)  It appears that the scp
>> and rsync techniques both require the initial copy of the file to be saved
>> on the database server itself, which we were hoping to avoid for performance
>> reasons.
>>
>> We could create ncp mounts on the database servers; but, frankly, we haven't
>> had much better luck in keeping those connected than we have with samba.
>> Has anyone else been through this and found a robust and reliable way to
>> do PITR backups from a PostgreSQL database on one machine to a Netware file
>> server on another?
>>
>> Apparently we will be moving to a Linux-based implementation of Netware at
>> some unspecified future date, at which point we will apparently be able to
>> deal directly with the Linux layer.  At that point, there are obvious, clean
>> solutions; but we've got to have something reasonable until such date, which
>> is most likely a few years off.
>
> Given your situation, I certainly wouldn't trust WAL files from the
> Samba server going back to your central site.

We don't.  We feed them all into warm standbys as they arrive, which fail over
to production when a gzip is corrupted.  Then we gunzip all the files for that
county to /dev/null to identify any other corrupted files (the corruptions seem
to come in clusters) and delete all the bad ones.  We restart the warm standby,
let rsync try again, and are back in business.  Due to our monitoring software,
we get both a jabber popup and an email within a minute of when the failover
occurs, so it's not down for long.
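
The "gunzip everything to /dev/null" pass described above could look roughly
like the following. This is only a sketch under assumptions: the function
names, the *.gz naming pattern, and the idea of keeping one directory per
county are made up for illustration and are not our actual scripts.

```python
import gzip
import zlib
from pathlib import Path


def is_valid_gzip(path, chunk_size=1 << 20):
    """Decompress the whole file and discard the output (like gunzip > /dev/null)."""
    try:
        with gzip.open(path, "rb") as f:
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches, EOFError covers
        # truncated files, zlib.error covers a damaged deflate stream.
        return False


def find_corrupt_archives(directory, delete=False):
    """Return the corrupt *.gz files in `directory`, optionally deleting them."""
    bad = []
    for path in sorted(Path(directory).glob("*.gz")):
        if not is_valid_gzip(path):
            bad.append(path)
            if delete:
                path.unlink()
    return bad
```

Deleting the bad files lets the next rsync pass fetch them again, after which
the warm standby can be restarted.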

Keep in mind that when we go to the file server instead of the utility server,
we will eliminate the samba shares, and I don't think that we'll see this sort
of problem with either an rsync daemon or scp -- at least not at this order of
magnitude.

> I think you've got 2 options here:
>
> 1) Write a script that copies to the local backup as well as the remote
> backup and call that script from archive_command. Make certain that the
> script returns a non-zero exit code if *either* copy operation fails.

Absolutely not an option.  The management mandate is clear that a WAN failure
which blocks the transfer of files back to the central site must not block the
movement of files off of the database server to another box on its LAN.

> 2) Have archive_command copy to someplace on the database server, and
> have another process copy from there to both the local backup as well as
> the central backup.

A possible option; although if the rsync daemon on the file server proves
reliable, I don't see the benefit over having the archive command make a copy
on the database server which flows to the file server and from the file server
back to the central site.  I'd rather add the load to the file server than the
database server.

> As for performance, just how hard are you pushing these machines?

That varies tremendously with the county and the activity.  Milwaukee County
keeps a pretty steady load on the database during the business day, and even
a moderately sized county can cause people to wait for the computer a bit when
they have, for example, a weekly traffic court session or the auditors run a
receivables audit report covering the prior year.

I'll grant that with PostgreSQL these issues aren't as acute as with the
commercial product it's replacing.  We've replaced most of our big central
databases and we're about half-way done rolling it out to the counties.  We've
gotten a lot of comments from end users on how much faster things are with the
new system.  :-)

> Copying a 16MB file that's already in memory isn't exactly an intensive
> operation...

That's true for the WAL files.  The base backups are another story.  We will
normally have a database vacuum analyze between the base backup and the users
being in there to care about performance, but that's not always the case --
sometimes jury trials go late into the night and could overlap with a
base backup.  And some judges put in a lot of late hours; although they don't
tend to bang on the database very heavily, they hate to be made to wait.

Anyway, we're sort of resigned to making the local copy and using an rsync
daemon on the file server to move it onward.  I was just hoping that someone
had picked up on an option within the Novell Netware environment that we have
missed.

Thanks for your comments,

-Kevin


