Обсуждение: HA for high insert volume

Поиск
Список
Период
Сортировка

HA for high insert volume

От
John Sequeira
Дата:
I'm looking for some suggestions for setting up a remote failover
Postgres instance (warm spare).  In my situation,  the db to be failed
over has a very high # of INSERTS (it does a lot of logging),  and it
seems like there might be too much overhead using something like usogres
or one of the replication options.  We expect it to grow very quickly to
the point at which a full dump and load is not practical (5-15G).

Are people having luck using something like rsync in this instance?
This document :
http://www.taygeta.com/ha-postgresql.html
describes using rsync,  but in the case where the #  of inserts is low.

Is that working for people,  or is there a better way to do this?
We're willing to consider commercial solutions but would prefer OSS.

Thanks in advance,

John Sequeira


Re: HA for high insert volume

От
Andrew Sullivan
Дата:
On Wed, Oct 30, 2002 at 04:27:33PM -0500, John Sequeira wrote:

> over has a very high # of INSERTS (it does a lot of logging),  and it
> seems like there might be too much overhead using something like usogres
> or one of the replication options.  We expect it to grow very quickly to

How many inserts?  We have tested eRServer (commercial version) to
> 1000 inserts per second, and it kept up reasonably well.

> Are people having luck using something like rsync in this instance?
> This document :
> http://www.taygeta.com/ha-postgresql.html
> describes using rsync,  but in the case where the #  of inserts is low.
>
> Is that working for people,  or is there a better way to do this?

It will not work safely.  This is exactly as dangerous as doing tar
backups of your filesystem, except that tar has to get all of the
files whereas rsync can get just the things that changed.  But there
is _certainly_ a window in which you could get an inconsistent
snapshot, and then your backup would be completely useless.  A high
volume of write operations is a good way to cause the problem.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: HA for high insert volume

От
Andrew Sullivan
Дата:
On Mon, Nov 04, 2002 at 10:06:59AM -0500, Eric Comeau wrote:
> Andrew
>
> 1) Would you use erServer in a non-DBA environment?

I'm not sure what a non-DBA environment is -- _someone_ has to be
running the database.  But if your question is, "Does someone need to
hold its hand all the time?" the answer is no.  It is a bit of a bear
to set up, but I gather they're fixing that for a client right now.
I suppose we have a DBA environment here; and we have needed to be a
little creative from time to time when we were first getting the
system going (we funded part of the development of it, so it was
pretty raw there for a while).  But I don't think it's any more
complicated than running, say, a properly-configured mail server.

> 2) Could you point me to further info on what type of failures could
> possibly occur using the solution below with Rsync?

Sure.

http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/backup-file.html

You say you tested it, and it worked for you.  That's nice, and it's
certainly possible; but there is a severe risk of corruption of the
second database doing things this way.  Since part of the point here
is safety, doing things in a manner that is _by design_ unsafe is a
bad idea.

> Initially when I read the article below, I was skeptical it would work,
> as I said to myself how is that any different than taking an OS backup
> of the db why the db is still up and active - something you normally
> don't want to do.

Right.  For the same reasons it worked for you, tarring the whole
data directory might just work, if you happened to be lucky.  If you
happen not to be lucky, you will discover that when you go to restore
your backup -- a time that I, at least, don't want to learn of yet
another bit of bad luck.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: HA for high insert volume

От
"Eric Comeau"
Дата:
Andrew

1) Would you use erServer in a non-DBA environment?
2) Could you point me to further info on what type of failures could
possibly occur using the solution below with Rsync?

Initially when I read the article below, I was skeptical it would work,
as I said to myself how is that any different than taking an OS backup
of the db why the db is still up and active - something you normally
don't want to do.

Like John, I am looking for a HA solution (standby database), but my
solution must be _Simple_ .I was initially looking for something like
Oracle's standby database, where they use the transaction log to
continuously roll-forward the standby database.

I have done some simple testing using the solution mentioned below
(using scripted FTP instead, to move the dbs between servers since they
were small).

I ran the test for 1 week, every 30 minutes I would copy the db across,
startup postgres on the remote server and dump the complete database to
verify I could read all the records. On the master (original db) I had a
very simple transaction running continuously (update, insert, delete).
From the postgres log on the remote, I could see that it would detect
the database was _not_ shutdown cleanly and would go through its
recovery process to verify everything was sane. In all tests it can
dumped the database without a problem.

I will be doing further testing with Rsync and a heavier load on the
system to see if I can break it. I haven't dug into the source code yet
to see if I could shed any light on how it might break.

I would like to find a test where I could make the solution fail to get
a better understanding of the issues with using this situation. My
problem is I need a _Simple_ solution, and I'm concerned that using one
of the replication solutions is not simple enough. (I have used Oracle's
Advanced replication in the past)

Thanks
Eric Comeau
<ecomeau@signiant.com>



-----Original Message-----
From: Andrew Sullivan [mailto:andrew@libertyrms.info]
Sent: Friday, November 01, 2002 1:27 PM
To: pgsql-admin@postgresql.org
Subject: Re: HA for high insert volume


On Wed, Oct 30, 2002 at 04:27:33PM -0500, John Sequeira wrote:

> over has a very high # of INSERTS (it does a lot of logging),  and it
> seems like there might be too much overhead using something like
usogres
> or one of the replication options.  We expect it to grow very quickly
to

How many inserts?  We have tested eRServer (commercial version) to
> 1000 inserts per second, and it kept up reasonably well.

> Are people having luck using something like rsync in this instance?
> This document :
> http://www.taygeta.com/ha-postgresql.html
> describes using rsync,  but in the case where the #  of inserts is
low.
>
> Is that working for people,  or is there a better way to do this?

It will not work safely.  This is exactly as dangerous as doing tar
backups of your filesystem, except that tar has to get all of the
files whereas rsync can get just the things that changed.  But there
is _certainly_ a window in which you could get an inconsistent
snapshot, and then your backup would be completely useless.  A high
volume of write operations is a good way to cause the problem.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110