Discussion: HA for high insert volume
I'm looking for some suggestions for setting up a remote failover Postgres instance (warm spare).

In my situation, the db to be failed over has a very high # of INSERTs (it does a lot of logging), and it seems like there might be too much overhead in using something like usogres or one of the replication options. We expect it to grow very quickly to the point at which a full dump and load is not practical (5-15G).

Are people having luck using something like rsync in this instance? This document:

http://www.taygeta.com/ha-postgresql.html

describes using rsync, but in the case where the # of inserts is low. Is that working for people, or is there a better way to do this?

We're willing to consider commercial solutions but would prefer OSS.

Thanks in advance,
John Sequeira
On Wed, Oct 30, 2002 at 04:27:33PM -0500, John Sequeira wrote:

> over has a very high # of INSERTS (it does a lot of logging), and it
> seems like there might be too much overhead using something like usogres
> or one of the replication options. We expect it to grow very quickly to

How many inserts? We have tested eRServer (commercial version) to more than 1,000 inserts per second, and it kept up reasonably well.

> Are people having luck using something like rsync in this instance?
> This document:
> http://www.taygeta.com/ha-postgresql.html
> describes using rsync, but in the case where the # of inserts is low.
>
> Is that working for people, or is there a better way to do this?

It will not work safely. This is exactly as dangerous as doing tar backups of your filesystem, except that tar has to get all of the files whereas rsync can get just the things that changed. But there is _certainly_ a window in which you could get an inconsistent snapshot, and then your backup would be completely useless. A high volume of write operations is a good way to cause the problem.

A

--
Andrew Sullivan                          204-4141 Yonge Street
Liberty RMS                            Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                          +1 416 646 3304 x110
On Mon, Nov 04, 2002 at 10:06:59AM -0500, Eric Comeau wrote:

> Andrew
>
> 1) Would you use erServer in a non-DBA environment?

I'm not sure what a non-DBA environment is -- _someone_ has to be running the database. But if your question is, "Does someone need to hold its hand all the time?" the answer is no. It is a bit of a bear to set up, but I gather they're fixing that for a client right now. I suppose we have a DBA environment here; and we have needed to be a little creative from time to time when we were first getting the system going (we funded part of the development of it, so it was pretty raw there for a while). But I don't think it's any more complicated than running, say, a properly-configured mail server.

> 2) Could you point me to further info on what type of failures could
> possibly occur using the solution below with rsync?

Sure:

http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/backup-file.html

You say you tested it, and it worked for you. That's nice, and it's certainly possible; but there is a severe risk of corruption of the second database doing things this way. Since part of the point here is safety, doing things in a manner that is _by design_ unsafe is a bad idea.

> Initially when I read the article below, I was skeptical it would work,
> as I said to myself how is that any different than taking an OS backup
> of the db while the db is still up and active -- something you normally
> don't want to do.

Right. For the same reasons it worked for you, tarring the whole data directory might just work, if you happened to be lucky. If you happen not to be lucky, you will discover that when you go to restore your backup -- a time at which I, at least, don't want to learn of yet another bit of bad luck.

A

--
Andrew Sullivan                          204-4141 Yonge Street
Liberty RMS                            Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                          +1 416 646 3304 x110
Andrew

1) Would you use erServer in a non-DBA environment?

2) Could you point me to further info on what type of failures could possibly occur using the solution below with rsync?

Initially when I read the article below, I was skeptical it would work, as I said to myself: how is that any different from taking an OS backup of the db while the db is still up and active -- something you normally don't want to do?

Like John, I am looking for an HA solution (standby database), but my solution must be _simple_. I was initially looking for something like Oracle's standby database, where they use the transaction log to continuously roll forward the standby database.

I have done some simple testing using the solution mentioned below (using scripted FTP instead, to move the dbs between servers, since they were small). I ran the test for 1 week: every 30 minutes I would copy the db across, start up postgres on the remote server, and dump the complete database to verify I could read all the records. On the master (original db) I had a very simple transaction running continuously (update, insert, delete). From the postgres log on the remote, I could see that it would detect the database was _not_ shut down cleanly and would go through its recovery process to verify everything was sane. In all tests it dumped the database without a problem.

I will be doing further testing with rsync and a heavier load on the system to see if I can break it. I haven't dug into the source code yet to see if I could shed any light on how it might break. I would like to find a test where I could make the solution fail, to get a better understanding of the issues with this approach.

My problem is I need a _simple_ solution, and I'm concerned that using one of the replication solutions is not simple enough.
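Eric's half-hourly check can be sketched as one round of a copy-then-verify cycle. Everything below is hypothetical: the three steps are stubs so the sketch is self-contained, whereas his real test used scripted FTP for the copy, started postgres on the remote (which triggers its recovery replay), and dumped the whole database:

```shell
#!/bin/sh
# One round of the copy-then-verify cycle described above. Stubs
# stand in for: copying the cluster over, starting postgres on the
# spare, and pg_dump-ing everything to prove all records are readable.
set -eu

copy_db_to_spare() { :; }  # stub for: rsync/FTP the cluster to the spare
start_spare()      { :; }  # stub for: pg_ctl start on the spare (runs recovery)
dump_spare()       { :; }  # stub for: full dump, e.g. pg_dumpall >/dev/null

if copy_db_to_spare && start_spare && dump_spare; then
    STATUS=verified
    echo "spare verified: full dump succeeded"
else
    STATUS=failed
    echo "spare verification FAILED" >&2
fi
# In the real test this ran every 30 minutes (sleep 1800 and repeat).
```

The key point in Eric's result is the middle step: each startup on the spare forced crash recovery, so every cycle also exercised the recovery path -- which is precisely where an inconsistent rsync snapshot would be expected to fail.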
(I have used Oracle's Advanced Replication in the past.)

Thanks,
Eric Comeau <ecomeau@signiant.com>

-----Original Message-----
From: Andrew Sullivan [mailto:andrew@libertyrms.info]
Sent: Friday, November 01, 2002 1:27 PM
To: pgsql-admin@postgresql.org
Subject: Re: HA for high insert volume