Обсуждение: Request for replication advice

Поиск
Список
Период
Сортировка

Request for replication advice

От
"Brendan Jurd"
Дата:
Hi all,

Yes, it's yet another request for advice on replicating pgsql.  I have
been doing some research on this lately, and so far I am getting the
impression that what I want is Slony + log shipping.  The scenario is:

 * One master postgres database on the internal, which is being
frequently updated by the internal users.
 * One slave postgres database which is hosted remotely, available
from the Internet, and accessible via an IPsec tunnel from the
intranet.

The purpose of the public slave is to allow external users read-only
access to the data in the master database, without compromising the
security of the internal network.

To this end, we would like to "push" all inserts, updates and deletes
out to the slave database from the master database, with a purely
one-way connection (the slave is never allowed to connect in to the
internal network).

It is important that the data in the slave be as current as possible
with the live data.  A delay of several minutes for any given update
to propagate to the slave would be tolerable, but any more than that
and the application would lose much of its appeal.  So it's okay if
the replication is asynchronous, so long as it is fast.

The slave also needs to be highly available 24/7.

So, my question for the list is: is Slony + log shipping the direction
I should be investigating, or is there something else out that I ought
to consider?  My understanding of WAL-based replication is that the
slave is purely a standby, and cannot be used to service queries.

TIA,
BJ

Re: Request for replication advice

От
Brad Nicholson
Дата:
On Sat, 2006-11-11 at 06:34 +1100, Brendan Jurd wrote:

> So, my question for the list is: is Slony + log shipping the direction
> I should be investigating, or is there something else out that I ought
> to consider?  My understanding of WAL-based replication is that the

This is certainly the direction to look at.  The speed of which
replication keeps on on a log shipped node is going to depend on the
frequency of inserts/updates/deletes, you class of hardware and network
connectivity.

You also need to be absolutely certain that long running queries and
'idle in transaction' transactions are minimized as much as possible (or
completely eliminated).  Anything that prevents vacuums from doing there
job properly is going to hurt a slony replica.

Oh, and regarding log shipping itself, I recommend waiting for the
release of 1.1.6 or 1.2.1, there is a fairly serious bug in the current
version that causes log shipping to fall over if you have more than 2
nodes in your config (not just log shipped nodes).

If you have more questions, please sign up for the Slony list.

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.


Re: Request for replication advice

От
Tom Lane
Дата:
"Brendan Jurd" <direvus@gmail.com> writes:
> So, my question for the list is: is Slony + log shipping the direction
> I should be investigating, or is there something else out that I ought
> to consider?

Those are two different methods: you'd use one or the other, not both.

Slony-I is much the more battle-tested of the two at the moment.  In
theory WAL log shipping should be higher performance for heavy-update
scenarios, but its latency is variable (low update rate = higher
latency), and not easy to put a bound on pre-8.2.

Have you looked at the replication info recently added to the PG docs?
http://developer.postgresql.org/pgdocs/postgres/failover.html

            regards, tom lane

Re: Request for replication advice

От
Brad Nicholson
Дата:
On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
> "Brendan Jurd" <direvus@gmail.com> writes:
> > So, my question for the list is: is Slony + log shipping the direction
> > I should be investigating, or is there something else out that I ought
> > to consider?
>
> Those are two different methods: you'd use one or the other, not both.

Slony has its own log shipping, I think that was what he was referring
to.

> Slony-I is much the more battle-tested of the two at the moment.  In
> theory WAL log shipping should be higher performance for heavy-update
> scenarios, but its latency is variable (low update rate = higher
> latency), and not easy to put a bound on pre-8.2.

I'm not entirely sure how battle tested the Slony log shipping stuff
actually is.
--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.


Re: Request for replication advice

От
Tom Lane
Дата:
Brad Nicholson <bnichols@ca.afilias.info> writes:
> On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
>> Those are two different methods: you'd use one or the other, not both.

> Slony has its own log shipping, I think that was what he was referring
> to.

Oh, OK, I was thinking of the trigger-driven version.

            regards, tom lane

Re: Request for replication advice

От
Brad Nicholson
Дата:
On Fri, 2006-11-10 at 15:16 -0500, Tom Lane wrote:
> Brad Nicholson <bnichols@ca.afilias.info> writes:
> > On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
> >> Those are two different methods: you'd use one or the other, not both.
>
> > Slony has its own log shipping, I think that was what he was referring
> > to.
>
> Oh, OK, I was thinking of the trigger-driven version.

Same thing, actually.  There's an option that tells the slon daemon to
write the data syncs to files instead of applying directly to a
subscriber.
--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.


Re: Request for replication advice

От
Chris Browne
Дата:
tgl@sss.pgh.pa.us (Tom Lane) writes:
> Brad Nicholson <bnichols@ca.afilias.info> writes:
>> On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
>>> Those are two different methods: you'd use one or the other, not both.
>
>> Slony has its own log shipping, I think that was what he was referring
>> to.
>
> Oh, OK, I was thinking of the trigger-driven version.

Slony-I is /always/ trigger-driven...

The "extra" that "log shipping" offers is that you can configure a
subscriber to serialize queries into a file as well as applying them
to the subscriber.  (It would be nice to have the option of not
bothering with the subscriber, but that hasn't been tried yet.)

Thus, you have a series of "log files" that contain Plain Ordinary
SQL.  (There's a *little* bit of Slony-I-specific stuff in them, that
could be trivially trimmed out.)

Those log files get spooled in a directory; you could do just about
anything you like with them, such as FTPing them somewhere, burning
them onto a DVD, or copying them onto a little USB key like
<http://www.kingmaxdigi.com/product/superstick.htm> this one, and
attach that to the leg of a bird, to implement some "avian data
transfer protocol" :-).
--
let name="cbbrowne" and tld="acm.org" in String.concat "@" [name;tld];;
http://linuxdatabases.info/info/logshipping.html
Why  are  there  flotation   devices  under  plane  seats  instead  of
parachutes?

Re: Request for replication advice

От
"Brendan Jurd"
Дата:
On 11/11/06, Brad Nicholson <bnichols@ca.afilias.info> wrote:
> On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
> > "Brendan Jurd" <direvus@gmail.com> writes:
> > > So, my question for the list is: is Slony + log shipping the direction
> > > I should be investigating, or is there something else out that I ought
> > > to consider?
> >
> > Those are two different methods: you'd use one or the other, not both.
>
> Slony has its own log shipping, I think that was what he was referring
> to.

Indeed I was; sorry if my terminology caused confusion.

The reason I am looking at Slony with log shipping is that it can
operate across a one-way connection, whereas plain Slony requires
communication in both directions.  A bi-directional connection would
negate the purpose of having two separate databases, which is to
protect the internal database (and the internal network as a whole)
from a compromised external system.

If we were willing to have a bi-directional connection, I don't see
any further disadvantage in allowing the external application(s) to
connect straight into our internal postgres database over the IPsec
tunnel, and ignoring the replication issue entirely.

Re: Request for replication advice

От
Chris Browne
Дата:
direvus@gmail.com ("Brendan Jurd") writes:

> On 11/11/06, Brad Nicholson <bnichols@ca.afilias.info> wrote:
>> On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
>> > "Brendan Jurd" <direvus@gmail.com> writes:
>> > > So, my question for the list is: is Slony + log shipping the direction
>> > > I should be investigating, or is there something else out that I ought
>> > > to consider?
>> >
>> > Those are two different methods: you'd use one or the other, not both.
>>
>> Slony has its own log shipping, I think that was what he was referring
>> to.
>
> Indeed I was; sorry if my terminology caused confusion.
>
> The reason I am looking at Slony with log shipping is that it can
> operate across a one-way connection, whereas plain Slony requires
> communication in both directions.  A bi-directional connection would
> negate the purpose of having two separate databases, which is to
> protect the internal database (and the internal network as a whole)
> from a compromised external system.
>
> If we were willing to have a bi-directional connection, I don't see
> any further disadvantage in allowing the external application(s) to
> connect straight into our internal postgres database over the IPsec
> tunnel, and ignoring the replication issue entirely.

Let me point out one possible downside to using Slony-I log shipping;
it may not be an issue for you, but it's worth observing...

Log shipping works via serializing the subscription work done on a
subscriber to files.  Thus, you MUST have at least one subscriber in
order to have log shipping work.  If that's a problem, that's a
problem...
--
(format nil "~S@~S" "cbbrowne" "acm.org")
http://linuxfinances.info/info/oses.html
Microsoft Outlook: Deploying Viruses Has Never Been This Easy!

Re: Request for replication advice

От
"Brendan Jurd"
Дата:
On 11/11/06, Chris Browne <cbbrowne@acm.org> wrote:
> Let me point out one possible downside to using Slony-I log shipping;
> it may not be an issue for you, but it's worth observing...
>
> Log shipping works via serializing the subscription work done on a
> subscriber to files.  Thus, you MUST have at least one subscriber in
> order to have log shipping work.  If that's a problem, that's a
> problem...

So I would have a normal Slony subscriber sitting somewhere on the
internal network, which pushes its log files out to the remote server.
 And the remote server then has a process sitting on it which inhales
the log files into the database as they arrive.

Have I got the right idea?

Why *does* Slony require a bi-directional connection to the
subscriber?  The data is travelling in one direction only ... what
needs to come back the other way?

This seems to be getting rather messy.  I wonder if I might not be
better off just writing AFTER triggers on all the tables I'm
interested in, which replicate the query to the slave system with
psql.  It would probably be relatively labour intensive, and increase
the burden of administering the schema, but it would also be a much
more direct and simple approach.

BJ

Re: Request for replication advice

От
Shane Ambler
Дата:
Brendan Jurd wrote:

> Why *does* Slony require a bi-directional connection to the
> subscriber?  The data is travelling in one direction only ... what
> needs to come back the other way?

So the slave can say "yes I got that data you can remove it from my TODO
list" ?

> This seems to be getting rather messy.  I wonder if I might not be
> better off just writing AFTER triggers on all the tables I'm
> interested in, which replicate the query to the slave system with
> psql.  It would probably be relatively labour intensive, and increase
> the burden of administering the schema, but it would also be a much
> more direct and simple approach.
>

That sounds like contrib/dbmirror


--

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz

Re: Request for replication advice

От
Chris Browne
Дата:
direvus@gmail.com ("Brendan Jurd") writes:
> On 11/11/06, Chris Browne <cbbrowne@acm.org> wrote:
>> Let me point out one possible downside to using Slony-I log shipping;
>> it may not be an issue for you, but it's worth observing...
>>
>> Log shipping works via serializing the subscription work done on a
>> subscriber to files.  Thus, you MUST have at least one subscriber in
>> order to have log shipping work.  If that's a problem, that's a
>> problem...
>
> So I would have a normal Slony subscriber sitting somewhere on the
> internal network, which pushes its log files out to the remote server.
> And the remote server then has a process sitting on it which inhales
> the log files into the database as they arrive.
>
> Have I got the right idea?

Exactly.

> Why *does* Slony require a bi-directional connection to the
> subscriber?  The data is travelling in one direction only ... what
> needs to come back the other way?

- So that any node might be promoted to be origin in case of
  emergency.

- So that nodes know when logged data (sl_log_{1,2}) can be safely
  purged.

> This seems to be getting rather messy.  I wonder if I might not be
> better off just writing AFTER triggers on all the tables I'm
> interested in, which replicate the query to the slave system with
> psql.  It would probably be relatively labour intensive, and
> increase the burden of administering the schema, but it would also
> be a much more direct and simple approach.

There might be some improved elegance in that; it is quite possible
that Slony-I has more functionality than you need.
--
"cbbrowne","@","linuxdatabases.info"
http://linuxfinances.info/info/slony.html
"In man-machine symbiosis, it is man who must adjust: The machines
can't." -- Alan J. Perlis

Re: Request for replication advice

От
"Brendan Jurd"
Дата:
On 11/14/06, Chris Browne <cbbrowne@acm.org> wrote:
> direvus@gmail.com ("Brendan Jurd") writes:
> > This seems to be getting rather messy.  I wonder if I might not be
> > better off just writing AFTER triggers on all the tables I'm
> > interested in, which replicate the query to the slave system with
> > psql.  It would probably be relatively labour intensive, and
> > increase the burden of administering the schema, but it would also
> > be a much more direct and simple approach.
>
> There might be some improved elegance in that; it is quite possible
> that Slony-I has more functionality than you need.

I think that's the case.

Well, I believe I have all the information I need on this.  Thanks to
everyone who replied, I appreciate your help.

BJ