Thread: Streaming-only Remastering


Streaming-only Remastering

From
Joshua Berkus
Date:
So currently we have a major limitation in binary replication, where it is not possible to "remaster" your system (that
is, designate the most caught-up standby as the new master) based on streaming replication only.  This is a major
limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than
doubles the administrative overhead of managing replication.  This becomes even more of a problem if you're doing
cascading replication.
 

Therefore I think this is a high priority for 9.3.

As far as I can tell, the change required for remastering over streaming is relatively small; we just need to add a new
record type to the streaming protocol, and then start writing the timeline change to that.  Are there other steps
required which I'm not seeing?
 

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco


Re: Streaming-only Remastering

From
Rob Wultsch
Date:
On Sun, Jun 10, 2012 at 11:47 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> So currently we have a major limitation in binary replication, where it is not possible to "remaster" your system
> (that is, designate the most caught-up standby as the new master) based on streaming replication only.  This is a major
> limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than
> doubles the administrative overhead of managing replication.  This becomes even more of a problem if you're doing
> cascading replication.
>
> Therefore I think this is a high priority for 9.3.
>
> As far as I can tell, the change required for remastering over streaming is relatively small; we just need to add a
> new record type to the streaming protocol, and then start writing the timeline change to that.  Are there other steps
> required which I'm not seeing?
>

Problem that may exist and is likely out of scope:
It is possible for a master with multiple slave servers to have slaves
which have not read all of the logs off of the master. It is annoying
to have to rebuild a replica because it was 1 kB behind in reading logs
from the master. If the new master could deliver the last bit of the
old master's logs, that would be very nice.

--
Rob Wultsch
wultsch@gmail.com


Re: Streaming-only Remastering

From
Josh Berkus
Date:
On 6/10/12 11:47 AM, Joshua Berkus wrote:
> So currently we have a major limitation in binary replication, where it is not possible to "remaster" your system
> (that is, designate the most caught-up standby as the new master) based on streaming replication only.  This is a major
> limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than
> doubles the administrative overhead of managing replication.  This becomes even more of a problem if you're doing
> cascading replication.
 
> 
> Therefore I think this is a high priority for 9.3.
> 
> As far as I can tell, the change required for remastering over streaming is relatively small; we just need to add a
> new record type to the streaming protocol, and then start writing the timeline change to that.  Are there other steps
> required which I'm not seeing?
 

*sound of crickets chirping*

Is there other work involved which isn't immediately apparent?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Streaming-only Remastering

From
Simon Riggs
Date:
On 10 June 2012 19:47, Joshua Berkus <josh@agliodbs.com> wrote:

> So currently we have a major limitation in binary replication, where it is not possible to "remaster" your system
> (that is, designate the most caught-up standby as the new master) based on streaming replication only.  This is a major
> limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than
> doubles the administrative overhead of managing replication.  This becomes even more of a problem if you're doing
> cascading replication.

The "major limitation" was solved by repmgr close to 2 years ago now.
So while you're correct that the patch to fix that assumed that
archiving worked as well, it has been possible to operate happily
without it.

http://www.repmgr.org

New versions for 9.2 will be out soon.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Streaming-only Remastering

From
Magnus Hagander
Date:
On Sat, Jun 16, 2012 at 6:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 10 June 2012 19:47, Joshua Berkus <josh@agliodbs.com> wrote:
>
>> So currently we have a major limitation in binary replication, where it is not possible to "remaster" your system
>> (that is, designate the most caught-up standby as the new master) based on streaming replication only.  This is a major
>> limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than
>> doubles the administrative overhead of managing replication.  This becomes even more of a problem if you're doing
>> cascading replication.
>
> The "major limitation" was solved by repmgr close to 2 years ago now.

It was solved for limited (but important) cases.

For example, repmgr does (afaik, maybe I missed a major update at some
point?) still require you to have set up ssh with trusted keys between
the servers. There are many use cases where that's not an acceptable
solution. One of the more obvious ones being when you're on Windows.

repmgr hasn't really *solved* it, it has provided a well-working workaround...

IIRC repmgr is also GPLv3, which means that some companies just won't
look at it... Not many, but some. And it's a license that's
incompatible with PostgreSQL itself.


> So while you're correct that the patch to fix that assumed that
> archiving worked as well, it has been possible to operate happily
> without it.
>
> http://www.repmgr.org
>
> New versions for 9.2 will be out soon.

That's certainly good, but that doesn't actually solve the problem
either. It updates the good workaround.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Streaming-only Remastering

From
Daniel Farina
Date:
On Fri, Jun 15, 2012 at 3:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 10 June 2012 19:47, Joshua Berkus <josh@agliodbs.com> wrote:
>
>> So currently we have a major limitation in binary replication, where it is not possible to "remaster" your system
>> (that is, designate the most caught-up standby as the new master) based on streaming replication only.  This is a major
>> limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than
>> doubles the administrative overhead of managing replication.  This becomes even more of a problem if you're doing
>> cascading replication.
>
> The "major limitation" was solved by repmgr close to 2 years ago now.
> So while you're correct that the patch to fix that assumed that
> archiving worked as well, it has been possible to operate happily
> without it.

Remastering has been one of the biggest thorns in my side over the last
year.  I don't think it's a trivially mechanized issue yet, but I
do need to get there, and probably a few alterations in Postgres would
help, although I have not itemized what they are (rather, I was
intending to work around problems with what I have today).  But since
it is apropos to this discussion, here's what I've been thinking along
these lines:

Instead of using re-synchronization (e.g. repmgr in its relation to
rsync), I intend to proxy and also inspect the streaming replication
traffic and then quiesce all standbys and figure out what node is
farthest ahead.  Once I figure out the node that is farthest ahead, if
it is not a node that is eligible for promotion to the master, I need
to exchange its changes to nodes that are eligible for promotion[0],
and then promote one of those, repointing all other standbys to that
node. This must all take place nominally within a second or thirty.
Conceptually it is simple, but mechanically it's somewhat intense,
especially in relation to the inconvenience of doing this incorrectly.

I surmise someone could come up with supporting mechanisms to make it
less burdensome to write.

One snarl is the interaction with the archive and restore commands:
Postgres might, for example, have been in the middle of downloading and
replaying a WAL segment even when I wish to be quiesced, and there's
not a great way to stop it[1].

Ideally, I could replace those archive/dearchive commands with
software that speaks the streaming replication protocol and just have
less code involved overall.  I think that is technically possible
today, but maybe could be made easier, in particular being able to
more easily chunk and align the WAL stream into units of some kind
from the streaming protocol.  Maybe it's already possible, but it will
take a little thinking.  I had already written off getting this level
of cohesion in the next year (intending a detailed mix of
archive_command and streaming protocol software), but it's not
something that leaves me close to satisfied by any measure.

Furthermore, some use cases demand that, no matter what the user
setting with regard to syncrep is, Postgres not make progress
unless it has synchronously replicated to a special piece of proxy
software.  This is useful if one wants to offload the exact location
and storage strategy for crash recovery to another piece of software.
That's the obvious next step after a cohesive delegation of
(de-)archiving.

So, all in all, Postgres has no great way to cohesively delegate all
WAL-persistence and WAL-restoration, and I don't know if the streaming
protocol + sync rep facilities can conveniently subsume all those use
cases (but I think they probably can without enormous modification).  I
think it should learn what it needs to learn to make
that happen.  It might even allow the existing shell-command based
(de-)archiver to live as a contrib.


[0]: Use case: when a small standby used for some reporting happens to
be the farthest ahead.

[1]: Details: a simple touched file to no-op the restore_command is
unsatisfying, because the restore_command may have already been
started by postgres, so now you have to make your restore_command
coordinate with your streaming replication proxy software to be safe,
or wait "long enough" for a single segment to replay so that one can be
assured that the system is quiesced.  I see this as an anti-feature of
the current file-based archiving strategy.

--
fdr
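
As an illustration of the selection step described in the message above (this
is not code from the thread), here is a minimal Python sketch. It assumes each
quiesced standby has reported its last received WAL location in the textual
"hi/lo" hexadecimal form, and that an operator supplies the set of nodes
eligible for promotion. The names parse_lsn, plan_failover, and the node labels
are hypothetical, and treating the difference between two locations as a gap
size is a simplifying assumption.

```python
from typing import Dict, Optional, Set


def parse_lsn(text: str) -> int:
    """Convert a WAL location such as '0/3000060' into a comparable integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def plan_failover(positions: Dict[str, str], eligible: Set[str]) -> dict:
    """Decide which standby to promote once all standbys are quiesced.

    positions maps node name -> last received WAL location (hypothetical input).
    eligible is the set of node names that may become the new master.
    """
    candidates = sorted(n for n in positions if n in eligible)
    if not candidates:
        raise ValueError("no eligible standby reported a position")

    # The node that is farthest ahead overall (may be ineligible, e.g. a
    # small reporting standby).
    farthest = max(sorted(positions), key=lambda n: parse_lsn(positions[n]))
    # The most caught-up node among those that may be promoted.
    promote = max(candidates, key=lambda n: parse_lsn(positions[n]))

    gap = parse_lsn(positions[farthest]) - parse_lsn(positions[promote])
    catch_up_from: Optional[str] = farthest if gap > 0 else None
    return {
        "farthest": farthest,
        "promote": promote,
        "catch_up_from": catch_up_from,  # node whose changes must be exchanged first
        "lsn_gap": gap,
        "repoint": sorted(n for n in positions if n != promote),
    }


if __name__ == "__main__":
    plan = plan_failover(
        {"report": "0/3000200", "a": "0/3000100", "b": "0/30000F0"},
        eligible={"a", "b"},
    )
    print(plan)
```

The sketch only produces a plan; actually exchanging the missing changes and
repointing the other standbys is the part the message calls "mechanically
somewhat intense".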


Re: Streaming-only Remastering

From
Josh Berkus
Date:
Simon,

> The "major limitation" was solved by repmgr close to 2 years ago now.
> So while you're correct that the patch to fix that assumed that
> archiving worked as well, it has been possible to operate happily
> without it.

repmgr is not able to remaster using only streaming replication.  It
also requires an SSH connection, as well as a bunch of other
administrative setup (and compiling from source on most platforms, a not
at all insignificant obstacle).  So you haven't solved the problem,
you've just provided a somewhat less awkward packaged workaround.

It's certainly possible to devise all kinds of workarounds for the
problem; I have a few myself in Bash and Python.  What I want is to stop
using workarounds.

Without the requirement for archiving, PostgreSQL binary replication is
almost ideally simple to set up and administer.  Turn settings on in
server A and server B, run pg_basebackup and you're replicating.  It's
like 4 steps, all but one of which can be scripted through puppet.
However, the moment you add log-shipping to the mix things get an order
of magnitude more complicated, repmgr or not.

There are really only two things standing in the way of binary replication
being completely developer-friendly.  Remastering is the big one, and
the separate recovery.conf is the small one.  We can fix both.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: Streaming-only Remastering

From
Josh Berkus
Date:
> Instead of using re-synchronization (e.g. repmgr in its relation to
> rsync), I intend to proxy and also inspect the streaming replication
> traffic and then quiesce all standbys and figure out what node is
> farthest ahead.  Once I figure out the node that is farthest ahead, if
> it is not a node that is eligible for promotion to the master, I need
> to exchange its changes to nodes that are eligible for promotion[0],
> and then promote one of those, repointing all other standbys to that
> node. This must all take place nominally within a second or thirty.
> Conceptually it is simple, but mechanically it's somewhat intense,
> especially in relation to the inconvenience of doing this incorrectly.

So you're suggesting that it would be great to be able to
double-remaster?  i.e. given OM = Original Master, 1S = standby furthest
ahead, NM = desired new master, to do:

1S <--- OM ---> NM

OM dies, then:

1S -----------> NM

until NM is caught up, then

1S <----------- NM

Yes?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: Streaming-only Remastering

From
Daniel Farina
Date:
On Sun, Jun 17, 2012 at 1:11 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> Instead of using re-synchronization (e.g. repmgr in its relation to
>> rsync), I intend to proxy and also inspect the streaming replication
>> traffic and then quiesce all standbys and figure out what node is
>> farthest ahead.  Once I figure out the node that is farthest ahead, if
>> it is not a node that is eligible for promotion to the master, I need
>> to exchange its changes to nodes that are eligible for promotion[0],
>> and then promote one of those, repointing all other standbys to that
>> node. This must all take place nominally within a second or thirty.
>> Conceptually it is simple, but mechanically it's somewhat intense,
>> especially in relation to the inconvenience of doing this incorrectly.
>
> So you're suggesting that it would be great to be able to
> double-remaster?  i.e. given OM = Original Master, 1S = standby furthest
> ahead, NM = desired new master, to do:

Yeah. Although it seems like it would degenerate to single-remastering
applied a couple times, no?

--
fdr
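
To illustrate the point that double-remastering may reduce to single
remastering applied twice (again, not code from the thread), the following
Python sketch models the topology as a mapping from each node to the upstream
it streams from. The functions remaster and double_remaster are hypothetical,
and the model ignores timelines and the actual catch-up of WAL; it only tracks
who follows whom.

```python
from typing import Dict, List, Optional

# node name -> upstream node it streams from (None means "is the master")
Topology = Dict[str, Optional[str]]


def remaster(topology: Topology, new_master: str,
             dead: Optional[str] = None) -> Topology:
    """Single remaster: drop the dead node, point every survivor at new_master."""
    if new_master not in topology or new_master == dead:
        raise ValueError("new master must be a surviving node")
    return {
        node: (None if node == new_master else new_master)
        for node in topology
        if node != dead
    }


def double_remaster(topology: Topology, dead_master: str,
                    farthest: str, desired: str) -> List[Topology]:
    """Apply remaster twice: first to the farthest-ahead standby, then to the
    desired new master (assumed to have caught up in between)."""
    step1 = remaster(topology, farthest, dead=dead_master)  # 1S ---> NM
    step2 = remaster(step1, desired)                        # 1S <--- NM
    return [step1, step2]


if __name__ == "__main__":
    start = {"OM": None, "1S": "OM", "NM": "OM"}
    for state in double_remaster(start, "OM", farthest="1S", desired="NM"):
        print(state)
```

Using the labels from the diagram earlier in the thread (OM, 1S, NM), the
first state has NM streaming from 1S and the second has 1S streaming from NM.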