Обсуждение: Fully-automatic streaming replication failover when master dies?

Поиск

Список

Период

Сортировка

Fully-automatic streaming replication failover when master dies?

От

Dmitry Koterov

Дата:

23 января 2014 г., 00:35:51

Hello.

I googled 1 hour approximately, but have not found a ready solution for this. So maybe this feature is in PostgreSQL todo-list, or something similar exists somewhere...

Before the actual question, I'd like to give a small analogy. What I mostly love in MongoDB is that it supports a fully transparent scheme of replication failover. If you have >= 3 MongoDB notes (e.g. 1 master and 2 replicas), and the master dies, in a couple of seconds a replica is AUTOMATICALLY elected as a new master, and all other replicas are AUTOMATICALLY begin to follow it. If the dead master is back again suddenly, it first appears as a replica, but in a couple of seconds it becomes a new master back (because it initially had a highest weight assigned), and all replicas become to follow it. All these steps are done automatically and transparently. It just works.

So does something similar and more-or-less stable exist for PostgrSQL too?

P.S.

The links I've found already:

- http://www.databasesoup.com - that's it, but too young and, possibly, not yet stable

- repmgr: always referred at StackOverflow (thousands of times), but it does not provide an automatic failover :-)

- pgpool-2: it contains a couple of config options for "failover", but in practice they are too poor to be used in production and with large databases, I believe after reading the docs

- pacemaker's plugin: extremely complex (IMHO) overweighted for this purpose

- EnterpriseDB's solutions: they are not free

Re: Fully-automatic streaming replication failover when master dies?

От

John R Pierce

Дата:

23 января 2014 г., 00:55:07

On 1/22/2014 1:35 PM, Dmitry Koterov wrote:
>
> So does something similar and more-or-less stable exist for PostgrSQL
> too?
>

you'd need to implement that yourself using a cluster management
package, and something like repmgr to handle the transitions.

database failover is extremely tricky stuff.  you have to make very very
sure you don't get into a stoned cluster where both nodes THINK they are
master.   most well implemented failover clusters make use of hardware
'fencing' to block the presumed-dead former master from coming back
online without manual intervention.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: Fully-automatic streaming replication failover when master dies?

От

Dmitry Koterov

Дата:

23 января 2014 г., 02:12:34

It's even more tricky if it's not well-covered by automated tests, I suppose. :-) There could be a number of methods to forbid the master to rise up from the grave, e.g. voting quorum (or the simplest, but less elegant, solution - a centralized "witness" daemon who does all the work for all the machines). At least MongoDB does the work well, and with almost zero configuration.

My question was about a ready and well-tested solutions, do they exist.

On Thu, Jan 23, 2014 at 1:54 AM, John R Pierce <pierce@hogranch.com> wrote:

On 1/22/2014 1:35 PM, Dmitry Koterov wrote:

So does something similar and more-or-less stable exist for PostgrSQL too?

you'd need to implement that yourself using a cluster management package, and something like repmgr to handle the transitions.

database failover is extremely tricky stuff. you have to make very very sure you don't get into a stoned cluster where both nodes THINK they are master. most well implemented failover clusters make use of hardware 'fencing' to block the presumed-dead former master from coming back online without manual intervention.

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Fully-automatic streaming replication failover when master dies?

От

Andrew Sullivan

Дата:

23 января 2014 г., 06:18:34

On Thu, Jan 23, 2014 at 03:12:07AM +0400, Dmitry Koterov wrote:
> for all the machines). At least MongoDB does the work well, and with almost
> zero configuration.

Mongo's data guarantees are, um, somewhat less robust than
PostgreSQL's.  Failover is easy if you don't have to be exactly right.

A

--
Andrew Sullivan
ajs@crankycanuck.ca

Re: Fully-automatic streaming replication failover when master dies?

От

Susan Cassidy

Дата:

23 января 2014 г., 20:38:17

pgpool-II may do what you want. Lots of people use it.

On Wed, Jan 22, 2014 at 1:35 PM, Dmitry Koterov <dmitry.koterov@gmail.com> wrote:

Hello.

I googled 1 hour approximately, but have not found a ready solution for this. So maybe this feature is in PostgreSQL todo-list, or something similar exists somewhere...

Before the actual question, I'd like to give a small analogy. What I mostly love in MongoDB is that it supports a fully transparent scheme of replication failover. If you have >= 3 MongoDB notes (e.g. 1 master and 2 replicas), and the master dies, in a couple of seconds a replica is AUTOMATICALLY elected as a new master, and all other replicas are AUTOMATICALLY begin to follow it. If the dead master is back again suddenly, it first appears as a replica, but in a couple of seconds it becomes a new master back (because it initially had a highest weight assigned), and all replicas become to follow it. All these steps are done automatically and transparently. It just works.

So does something similar and more-or-less stable exist for PostgrSQL too?

P.S.
The links I've found already:
- http://www.databasesoup.com - that's it, but too young and, possibly, not yet stable
- repmgr: always referred at StackOverflow (thousands of times), but it does not provide an automatic failover :-)
- pgpool-2: it contains a couple of config options for "failover", but in practice they are too poor to be used in production and with large databases, I believe after reading the docs
- pacemaker's plugin: extremely complex (IMHO) overweighted for this purpose
- EnterpriseDB's solutions: they are not free

Re: Fully-automatic streaming replication failover when master dies?

От

Sameer Kumar

Дата:

24 января 2014 г., 05:17:37

On Fri, Jan 24, 2014 at 1:38 AM, Susan Cassidy <susan.cassidy@decisionsciencescorp.com> wrote:

pgpool-II may do what you want. Lots of people use it.

I don't think pgpool adds the lost node on its own (once the node is live or available again). Plus if you have a 3 node replication you need to have your own failover_command (as a shell script) which changes the master node for 2nd secondary when one of the secondary servers decides to be promoted to primary). I hope things will get easy with version 9.4 (I guess in 9.4 one won't have to rebuild a master node from backup. if the wal files are available it will just roll forward).

> for all the machines). At least MongoDB does the work well, and with almost
> zero configuration.
Mongo's data guarantees are, um, somewhat less robust than
PostgreSQL's.

I don't think this has anything to do with data reliability or ACID property (if that is what you are referring to).

Failover is easy if you don't have to be exactly right.

IMHO That's not a fair point. PostgreSQL supports sync replication (as well as async) and does that complicate the failover process or an async replication? I guess what he is asking for is automation of whatever feature PostgreSQL already supports.

Best Regards,
Sameer Kumar | Database Consultant
ASHNIK PTE. LTD.
101 Cecil Street, #11-11 Tong Eng Building, Singapore 069533
M : +65 8110 0350 T: +65 6438 3504 | www.ashnik.com
www.facebook.com/ashnikbiz | www.twitter.com/ashnikbiz

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).

Вложения

image002.jpg

Re: Fully-automatic streaming replication failover when master dies?

От

Scott Marlowe

Дата:

24 января 2014 г., 05:35:34

Currnently there is no automatic failover for postgresql.  If I were
to build something to do this I'd start with repmgr:
http://www.repmgr.org/

On Wed, Jan 22, 2014 at 2:35 PM, Dmitry Koterov
<dmitry.koterov@gmail.com> wrote:
> Hello.
>
> I googled 1 hour approximately, but have not found a ready solution for
> this. So maybe this feature is in PostgreSQL todo-list, or something similar
> exists somewhere...
>
> Before the actual question, I'd like to give a small analogy. What I mostly
> love in MongoDB is that it supports a fully transparent scheme of
> replication failover. If you have >= 3 MongoDB notes (e.g. 1 master and 2
> replicas), and the master dies, in a couple of seconds a replica is
> AUTOMATICALLY elected as a new master, and all other replicas are
> AUTOMATICALLY begin to follow it. If the dead master is back again suddenly,
> it first appears as a replica, but in a couple of seconds it becomes a new
> master back (because it initially had a highest weight assigned), and all
> replicas become to follow it. All these steps are done automatically and
> transparently. It just works.
>
> So does something similar and more-or-less stable exist for PostgrSQL too?
>
> P.S.
> The links I've found already:
> - http://www.databasesoup.com - that's it, but too young and, possibly, not
> yet stable
> - repmgr: always referred at StackOverflow (thousands of times), but it does
> not provide an automatic failover :-)
> - pgpool-2: it contains a couple of config options for "failover", but in
> practice they are too poor to be used in production and with large
> databases, I believe after reading the docs
> - pacemaker's plugin: extremely complex (IMHO) overweighted for this purpose
> - EnterpriseDB's solutions: they are not free



--
To understand recursion, one must first understand recursion.

Re: Fully-automatic streaming replication failover when master dies?

От

Scott Marlowe

Дата:

24 января 2014 г., 05:41:59

On Thu, Jan 23, 2014 at 7:16 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:
>
>
> On Fri, Jan 24, 2014 at 1:38 AM, Susan Cassidy <susan.cassidy@decisionsciencescorp.com> wrote:
>>
>> pgpool-II may do what you want.  Lots of people use it.
>
>
> I don't think pgpool adds the lost node on its own (once the node is live or available again). Plus if you have a 3
nodereplication you need to have your own failover_command (as a shell script) which changes the master node for 2nd
secondarywhen one of the secondary servers decides to be promoted to primary). I hope things will get easy with version
9.4(I guess in 9.4 one won't have to rebuild a master node from backup. if the wal files are available it will just
rollforward). 
>
>> > for all the machines). At least MongoDB does the work well, and with almost
>> > zero configuration.
>> Mongo's data guarantees are, um, somewhat less robust than
>> PostgreSQL's.
>
>
> I don't think this has anything to do with data reliability or ACID property (if that is what you are referring to).
>
>>  Failover is easy if you don't have to be exactly right.
>
>
> IMHO That's not a fair point. PostgreSQL supports sync replication (as well as async) and does that complicate the
failoverprocess or an async replication? I guess what he is asking for is automation of whatever feature PostgreSQL
alreadysupports. 

No it's a fair point. When you go from "we promise to try and not lose
your data" to "we promise to not lose any of your data" the situation
is much different.

There are many things to consider in the postgresql situation. Is it
more important to keep your application up and running, even if only
in read only mode? Is performance more important than data integrity?
How many nodes do you have? How man can auto-fail over before you
auto-fail over to the very last one? How do you rejoin failed nodes,
one at a time, all at once, by hand, automagically? And so on. There
are a LOT of questions to ask that mongo already decided for you, and
the decision was that if you lose some data that's OK as long as the
cluster stays up. With PostgreSQL the decision making process probably
has a big impact on how you answer these types of questions and how
you fail over.

Add to that that most postgresql database servers are VERY robust,
with multi-lane RAID array controllers and / or sturdy SANs underneath
them, and their failure rates are very low, you run the risk of your
auto-failover causing much of an outage as the server failing, since
most failovers are going to cause some short interruption in service.
It's not a simple push a button take a banana, one size fits all
problem and solution.

Re: Fully-automatic streaming replication failover when master dies?

От

Tatsuo Ishii

Дата:

24 января 2014 г., 06:29:23

> I don't think pgpool adds the lost node on its own (once the node is live
> or available again).

That's one of my TODOs. If we limit it to the standby nodes, it will
be safe. I hope this is added to next major version of pgpool-II.

> Plus if you have a 3 node replication you need to have
> your own failover_command (as a shell script) which changes the master node
> for 2nd secondary when one of the secondary servers decides to be promoted
> to primary). I hope things will get easy with version 9.4 (I guess in 9.4
> one won't have to rebuild a master node from backup. if the wal files are
> available it will just roll forward).

Yes, if you have multiple candicate standbys to be promoted, you
should make a discion on this. Pgpool-II cannot guess your will.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Re: Fully-automatic streaming replication failover when master dies?

От

Sameer Kumar

Дата:

24 января 2014 г., 07:16:15

On Fri, Jan 24, 2014 at 11:28 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:

> I don't think pgpool adds the lost node on its own (once the node is live
> or available again).

That's one of my TODOs. If we limit it to the standby nodes, it will
be safe. I hope this is added to next major version of pgpool-II.

Looking forward to it :-)

> Plus if you have a 3 node replication you need to have
> your own failover_command (as a shell script) which changes the master node
> for 2nd secondary when one of the secondary servers decides to be promoted
> to primary). I hope things will get easy with version 9.4 (I guess in 9.4
> one won't have to rebuild a master node from backup. if the wal files are
> available it will just roll forward).

Yes, if you have multiple candicate standbys to be promoted, you
should make a discion on this. Pgpool-II cannot guess your will.

Can't we decide based on the xlog location replayed on each standby? Whichever is ahead gets the priority (the idea is to lose least number of transactions).

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).

Вложения

image002.jpg

Re: Fully-automatic streaming replication failover when master dies?

От

Thomas Kellerer

Дата:

24 января 2014 г., 10:33:39

Dmitry Koterov, 22.01.2014 22:35:
> I googled 1 hour approximately, but have not found a ready solution
> for this. So maybe this feature is in PostgreSQL todo-list, or
> something similar exists somewhere...
>
> Before the actual question, I'd like to give a small analogy. What I
> mostly love in MongoDB is that it supports a fully transparent scheme
> of replication failover. If you have >= 3 MongoDB notes (e.g. 1
> master and 2 replicas), and the master dies, in a couple of seconds a
> replica is AUTOMATICALLY elected as a new master, and all other
> replicas are AUTOMATICALLY begin to follow it. If the dead master is
> back again suddenly, it first appears as a replica, but in a couple
> of seconds it becomes a new master back (because it initially had a
> highest weight assigned), and all replicas become to follow it. All
> these steps are done automatically and transparently. It just works.
>
> So does something similar and more-or-less stable exist for PostgrSQL
> too?

Someone just blogged about this:

http://evol-monkey.blogspot.de/2014/01/setting-up-postgres-automated-failover.html

Re: Fully-automatic streaming replication failover when master dies?

От

Dmitry Koterov

Дата:

25 января 2014 г., 21:29:39

Failover is NOT about the RAID or SAN robusness mostly. It's about datacenters connectivity and network issues. If you lose one datacenter (it happens, and there is no aid for it), you should redirect all traffic to another DC ASAP and failover the master DB to it. When the disconnected DC is up again, it should recover from this situation.

So +1 for the previous man, PostgreSQL ACID and MongoDB non-ACID have absolute no relevance to the failover problem.

On Friday, January 24, 2014, Scott Marlowe <scott.marlowe@gmail.com> wrote:

On Thu, Jan 23, 2014 at 7:16 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:
>
>
> On Fri, Jan 24, 2014 at 1:38 AM, Susan Cassidy <susan.cassidy@decisionsciencescorp.com> wrote:
>>
>> pgpool-II may do what you want. Lots of people use it.
>
>
> I don't think pgpool adds the lost node on its own (once the node is live or available again). Plus if you have a 3 node replication you need to have your own failover_command (as a shell script) which changes the master node for 2nd secondary when one of the secondary servers decides to be promoted to primary). I hope things will get easy with version 9.4 (I guess in 9.4 one won't have to rebuild a master node from backup. if the wal files are available it will just roll forward).
>
>> > for all the machines). At least MongoDB does the work well, and with almost
>> > zero configuration.
>> Mongo's data guarantees are, um, somewhat less robust than
>> PostgreSQL's.
>
>
> I don't think this has anything to do with data reliability or ACID property (if that is what you are referring to).
>
>> Failover is easy if you don't have to be exactly right.
>
>
> IMHO That's not a fair point. PostgreSQL supports sync replication (as well as async) and does that complicate the failover process or an async replication? I guess what he is asking for is automation of whatever feature PostgreSQL already supports.

No it's a fair point. When you go from "we promise to try and not lose
your data" to "we promise to not lose any of your data" the situation
is much different.

There are many things to consider in the postgresql situation. Is it
more important to keep your application up and running, even if only
in read only mode? Is performance more important than data integrity?
How many nodes do you have? How man can auto-fail over before you
auto-fail over to the very last one? How do you rejoin failed nodes,
one at a time, all at once, by hand, automagically? And so on. There
are a LOT of questions to ask that mongo already decided for you, and
the decision was that if you lose some data that's OK as long as the
cluster stays up. With PostgreSQL the decision making process probably
has a big impact on how you answer these types of questions and how
you fail over.

Add to that that most postgresql database servers are VERY robust,
with multi-lane RAID array controllers and / or sturdy SANs underneath
them, and their failure rates are very low, you run the risk of your
auto-failover causing much of an outage as the server failing, since
most failovers are going to cause some short interruption in service.
It's not a simple push a button take a banana, one size fits all
problem and solution.

Re: Fully-automatic streaming replication failover when master dies?

От

Scott Marlowe

Дата:

26 января 2014 г., 07:50:12

Please don't top post in technical discussions.

On Sat, Jan 25, 2014 at 11:29 AM, Dmitry Koterov
<dmitry.koterov@gmail.com> wrote:
>
> On Friday, January 24, 2014, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>
>> On Thu, Jan 23, 2014 at 7:16 PM, Sameer Kumar <sameer.kumar@ashnik.com>
>> wrote:
>> >
>> >
>> > On Fri, Jan 24, 2014 at 1:38 AM, Susan Cassidy
>> > <susan.cassidy@decisionsciencescorp.com> wrote:
>> >>
>> >> pgpool-II may do what you want.  Lots of people use it.
>> >
>> >
>> > I don't think pgpool adds the lost node on its own (once the node is
>> > live or available again). Plus if you have a 3 node replication you need to
>> > have your own failover_command (as a shell script) which changes the master
>> > node for 2nd secondary when one of the secondary servers decides to be
>> > promoted to primary). I hope things will get easy with version 9.4 (I guess
>> > in 9.4 one won't have to rebuild a master node from backup. if the wal files
>> > are available it will just roll forward).
>> >
>> >> > for all the machines). At least MongoDB does the work well, and with
>> >> > almost
>> >> > zero configuration.
>> >> Mongo's data guarantees are, um, somewhat less robust than
>> >> PostgreSQL's.
>> >
>> >
>> > I don't think this has anything to do with data reliability or ACID
>> > property (if that is what you are referring to).
>> >
>> >>  Failover is easy if you don't have to be exactly right.
>> >
>> >
>> > IMHO That's not a fair point. PostgreSQL supports sync replication (as
>> > well as async) and does that complicate the failover process or an async
>> > replication? I guess what he is asking for is automation of whatever feature
>> > PostgreSQL already supports.
>>
>> No it's a fair point. When you go from "we promise to try and not lose
>> your data" to "we promise to not lose any of your data" the situation
>> is much different.
>>
>> There are many things to consider in the postgresql situation. Is it
>> more important to keep your application up and running, even if only
>> in read only mode? Is performance more important than data integrity?
>> How many nodes do you have? How man can auto-fail over before you
>> auto-fail over to the very last one? How do you rejoin failed nodes,
>> one at a time, all at once, by hand, automagically? And so on. There
>> are a LOT of questions to ask that mongo already decided for you, and
>> the decision was that if you lose some data that's OK as long as the
>> cluster stays up. With PostgreSQL the decision making process probably
>> has a big impact on how you answer these types of questions and how
>> you fail over.
>>
>> Add to that that most postgresql database servers are VERY robust,
>> with multi-lane RAID array controllers and / or sturdy SANs underneath
>> them, and their failure rates are very low, you run the risk of your
>> auto-failover causing much of an outage as the server failing, since
>> most failovers are going to cause some short interruption in service.
>> It's not a simple push a button take a banana, one size fits all
>> problem and solution.

> Failover is NOT about the RAID or SAN robusness mostly. It's about
> datacenters connectivity and network issues. If you lose one datacenter (it
> happens, and there is no aid for it), you should redirect all traffic to
> another DC ASAP and failover the master DB to it. When the disconnected DC
> is up again, it should recover from this situation.
>
> So +1 for the previous man, PostgreSQL ACID and MongoDB non-ACID have
> absolute no relevance to the failover problem.

If you'll bother reading what I wrote AGAIN, you'll notice my mention
on ACID etc was more of an afterthought here. There are real questions
about data loss and recovery that matter when you are failing over.
Are you running your cluster in synchronous mode across geographically
diverse data centers? If not how long do you wait for the master to
come back before you fail over? A millisecond? A second? A minute? The
answer will likely be different for me than for you.

While ACID isn't the main or only reason for things being different,
it IS a valid reason because different people use PostgreSQL for
different things. If I'm running it as a session server, I treat it
one way, as a key-value store another, as a transactional database
handling monetary funds yet another. You're refusal to accept that
this is a complex issue with complex answers isn't helping you find
the right answer to your problem.

--
To understand recursion, one must first understand recursion.

Re: Fully-automatic streaming replication failover when master dies?

От

Sameer Kumar

Дата:

27 января 2014 г., 06:11:00

On Sun, Jan 26, 2014 at 12:50 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

Are you running your cluster in synchronous mode across geographically
diverse data centers? If not how long do you wait for the master to
come back before you fail over? A millisecond? A second? A minute? The
answer will likely be different for me than for you.

Questions like this and other failover criteria could be configurations. at the least simplest of clusters and failovers sceanrios could be facilitated out of box (either using PostgreSQL itself using a tool e.g. pgpool).

While ACID isn't the main or only reason for things being different,

it IS a valid reason because different people use PostgreSQL for
different things. If I'm running it as a session server, I treat it
one way, as a key-value store another, as a transactional database
handling monetary funds yet another.

If there is a failover and something is not replicated, I am gonna lose those transaction [unless I have wal and archives available from the primary node]. I would believe that same should apply to mongoDB too (instead of losing transactions you will be losing operations). I am not getting how being ACID or non-ACID changes this scenario. If the stand by was synchronous [which is what someone using it for montary funds would do], I do no lose any transaction [which is what someone using it for montary funds would need]. Still I see that an automatic failover can be performed (and scenario is more simple with sync-replication rather than it being complex).

The decision of where to failover in a multi-slave cluster can be based on xlogs replayed. To avoid split brain scenario one can always do what mongoDB does (which is again out of a very simple mathematical theory and nothing which one can attribute to it being non-ACID). Cluster always has odd nodes. lets say node1 replicates to node2 and node3 and because of n/w failure node2 and node1 and node2 can not see node3 but can see each other, so node1 will see that it can still remain master since it has majority of nodes in its visibility. it was node1 which was not visible to node2 and node3, one of them would become primary.

You're refusal to accept that
this is a complex issue with complex answers isn't helping you find
the right answer to your problem.

I am just trying to understand the complexity and see if I am overlooking anything :-)

Dmitry Koterov wrote:

PostgreSQL supports synchronous multi-master, MongoDB supports write concern, but this causes a performance penalty).

Well, if multi-master is what you were talking about all along then may be my arguments are all wrong (and I am sorry to everyone).

ACID (primarily isolation and serilization) would make a big deal here. Multi Master with Relational databases is an all together different thing. Anyways I doubt that "PostgreSQL supports synchronous multi-master"

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).

Вложения

image002.jpg

Re: Fully-automatic streaming replication failover when master dies?

От

Michael Paquier

Дата:

27 января 2014 г., 06:25:03

On Mon, Jan 27, 2014 at 12:10 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:

Dmitry Koterov wrote:
PostgreSQL supports synchronous multi-master, MongoDB supports write concern, but this causes a performance penalty).
Anyways I doubt that "PostgreSQL supports synchronous multi-master"

Postgres core taken as such does not support multi-master. The fork of PostgreSQL called Postgres-XC somewhat does, for OLTP applications.

Regards,

--
Michael

Re: Fully-automatic streaming replication failover when master dies?

От

Sameer Kumar

Дата:

27 января 2014 г., 06:35:59

On Mon, Jan 27, 2014 at 11:24 AM, Michael Paquier <michael.paquier@gmail.com> wrote:

PostgreSQL supports synchronous multi-master, MongoDB supports write concern, but this causes a performance penalty).
Anyways I doubt that "PostgreSQL supports synchronous multi-master"
Postgres core taken as such does not support multi-master. The fork of PostgreSQL called Postgres-XC somewhat does, for OLTP applications.

I know about PostgresXC, but I thought it is distributed database (similar to shards of mongoDB). [Though if I am correct there could be tables which are shared across different nodes, but that is not the best way of utilizing features of PostgresXC] I think it is not apt to call it "synchronous" (since there is no replication happening).

This email may contain confidential, privileged or copyright material and is solely for the use of the intended recipient(s).

Вложения

image002.jpg

Re: Fully-automatic streaming replication failover when master dies?

От

Tatsuo Ishii

Дата:

27 января 2014 г., 06:48:07

> I know about PostgresXC, but I thought it is distributed database (similar
> to shards of mongoDB). [Though if I am correct there could be tables which
> are shared across different nodes, but that is not the best way of
> utilizing features of PostgresXC] I think it is not apt to call it
> "synchronous" (since there is no replication happening).

I thought non "eventual consistency" type DB can be called
"synchronous", no?  If I am correct, Posgres-XC is definitely
synchronous.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Re: Fully-automatic streaming replication failover when master dies?

От

Michael Paquier

Дата:

27 января 2014 г., 07:57:06

On Mon, Jan 27, 2014 at 12:35 PM, Sameer Kumar <sameer.kumar@ashnik.com> wrote:

On Mon, Jan 27, 2014 at 11:24 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
PostgreSQL supports synchronous multi-master, MongoDB supports write concern, but this causes a performance penalty).
Anyways I doubt that "PostgreSQL supports synchronous multi-master"
Postgres core taken as such does not support multi-master. The fork of PostgreSQL called Postgres-XC somewhat does, for OLTP applications.

I know about PostgresXC, but I thought it is distributed database (similar to shards of mongoDB). [Though if I am correct there could be tables which are shared across different nodes, but that is not the best way of utilizing features of PostgresXC] I think it is not apt to call it "synchronous" (since there is no replication happening).

XC supports table sharding with distribution of data using a column key, or even data replication. Table definitions and schemas are the same across all the nodes, it is just that the data is present in a portion of the nodes (Datanodes more precisely) of the cluster. Operations are done *synchronously* through the cluster using what is called a GTM (global transaction manager) that is unique and feeds all the other nodes with globally-consistent transaction IDs and snapshots. GTM SPOF is solved with the presence of a Standby.

--
Michael

Re: Fully-automatic streaming replication failover when master dies?

От

Dmitry Koterov

Дата:

28 января 2014 г., 05:01:36

Complex in its implementation - maybe. Complex in its configuration and ideology - nope.

> Are you running your cluster in synchronous mode across geographically diverse data centers?

Config option #1 ("allow replicas to re-bind to the second synchronous master if the first one fails, and allow the second master to run separately").

> How long do you wait for the master to come back before you fail over?

Config option #2 ("how many missed heartbeats cause the automatic failover process").

There could be more options, of course. But all this is not a rocket science, it's just not yet implemented, I suppose. I thought your point was "The difference is that in MongoDB automatic failover is simple, in PostgreSQL it is much more complex" - I don't agree with that, I think the tasks have more or less same complexity. There could be no "silver bullet" with no data loss for PostgreSQL and MongoDB both (though both PostgreSQL and MongoDB support synchronous multi-node commits: PostgreSQL supports synchronous multi-master, MongoDB supports write concern, but this causes a performance penalty). I just mentioned MongoDB, because it has an excellent automatic failover mechanism, which originates not from its noSQLness and could be theoretically implemented in any other databases (including PostgreSQL), not for a holy war.

On Sun, Jan 26, 2014 at 8:50 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

Please don't top post in technical discussions.

On Sat, Jan 25, 2014 at 11:29 AM, Dmitry Koterov
<dmitry.koterov@gmail.com> wrote:
>
> On Friday, January 24, 2014, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>
>> On Thu, Jan 23, 2014 at 7:16 PM, Sameer Kumar <sameer.kumar@ashnik.com>
>> wrote:
>> >
>> >
>> > On Fri, Jan 24, 2014 at 1:38 AM, Susan Cassidy
>> > <susan.cassidy@decisionsciencescorp.com> wrote:
>> >>
>> >> pgpool-II may do what you want. Lots of people use it.
>> >
>> >
>> > I don't think pgpool adds the lost node on its own (once the node is
>> > live or available again). Plus if you have a 3 node replication you need to
>> > have your own failover_command (as a shell script) which changes the master
>> > node for 2nd secondary when one of the secondary servers decides to be
>> > promoted to primary). I hope things will get easy with version 9.4 (I guess
>> > in 9.4 one won't have to rebuild a master node from backup. if the wal files
>> > are available it will just roll forward).
>> >
>> >> > for all the machines). At least MongoDB does the work well, and with
>> >> > almost
>> >> > zero configuration.
>> >> Mongo's data guarantees are, um, somewhat less robust than
>> >> PostgreSQL's.
>> >
>> >
>> > I don't think this has anything to do with data reliability or ACID
>> > property (if that is what you are referring to).
>> >
>> >> Failover is easy if you don't have to be exactly right.
>> >
>> >
>> > IMHO That's not a fair point. PostgreSQL supports sync replication (as
>> > well as async) and does that complicate the failover process or an async
>> > replication? I guess what he is asking for is automation of whatever feature
>> > PostgreSQL already supports.
>>
>> No it's a fair point. When you go from "we promise to try and not lose
>> your data" to "we promise to not lose any of your data" the situation
>> is much different.
>>
>> There are many things to consider in the postgresql situation. Is it
>> more important to keep your application up and running, even if only
>> in read only mode? Is performance more important than data integrity?
>> How many nodes do you have? How man can auto-fail over before you
>> auto-fail over to the very last one? How do you rejoin failed nodes,
>> one at a time, all at once, by hand, automagically? And so on. There
>> are a LOT of questions to ask that mongo already decided for you, and
>> the decision was that if you lose some data that's OK as long as the
>> cluster stays up. With PostgreSQL the decision making process probably
>> has a big impact on how you answer these types of questions and how
>> you fail over.
>>
>> Add to that that most postgresql database servers are VERY robust,
>> with multi-lane RAID array controllers and / or sturdy SANs underneath
>> them, and their failure rates are very low, you run the risk of your
>> auto-failover causing much of an outage as the server failing, since
>> most failovers are going to cause some short interruption in service.
>> It's not a simple push a button take a banana, one size fits all
>> problem and solution.

> Failover is NOT about the RAID or SAN robusness mostly. It's about
> datacenters connectivity and network issues. If you lose one datacenter (it
> happens, and there is no aid for it), you should redirect all traffic to
> another DC ASAP and failover the master DB to it. When the disconnected DC
> is up again, it should recover from this situation.
>
> So +1 for the previous man, PostgreSQL ACID and MongoDB non-ACID have
> absolute no relevance to the failover problem.

If you'll bother reading what I wrote AGAIN, you'll notice my mention
on ACID etc was more of an afterthought here. There are real questions
about data loss and recovery that matter when you are failing over.
Are you running your cluster in synchronous mode across geographically
diverse data centers? If not how long do you wait for the master to
come back before you fail over? A millisecond? A second? A minute? The
answer will likely be different for me than for you.

While ACID isn't the main or only reason for things being different,
it IS a valid reason because different people use PostgreSQL for
different things. If I'm running it as a session server, I treat it
one way, as a key-value store another, as a transactional database
handling monetary funds yet another. You're refusal to accept that
this is a complex issue with complex answers isn't helping you find
the right answer to your problem.

--
To understand recursion, one must first understand recursion.

Re: Fully-automatic streaming replication failover when master dies?

От

John R Pierce

Дата:

28 января 2014 г., 06:40:56

On 1/26/2014 12:35 AM, Dmitry Koterov wrote:

PostgreSQL supports synchronous multi-master

no, PostgreSQL does not support multi-master, not without some sort of third party replication system, all of which have serious compromises.

-- 
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Fully-automatic streaming replication failover when master dies?

Вложения

Вложения

Вложения

Вложения