Обсуждение: Auotmated postgres failover

Поиск
Список
Период
Сортировка

Auotmated postgres failover

От
jwiencek3@comcast.net
Дата:
Hello


I'm looking for a tool to automate PostgreSQL cluster management failover in the event the master database were to become unavailable.   Currently are manually issuing a "pg_ctl promote"  once we become aware that the master database has crashed.


Is repmgr a via solution?  Please pass along experiences with repmgr.

Are there any other  tools available to automatically issue the "promote" in the event of a master database crash?



Regards

John Wiencek

Re: Auotmated postgres failover

От
John R Pierce
Дата:
On 1/21/2016 11:07 AM, jwiencek3@comcast.net wrote:


I'm looking for a tool to automate PostgreSQL cluster management failover in the event the master database were to become unavailable.   Currently are manually issuing a "pg_ctl promote"  once we become aware that the master database has crashed.


Is repmgr a via solution?  Please pass along experiences with repmgr.

Are there any other  tools available to automatically issue the "promote" in the event of a master database crash?


repmgr is a tool you could use in conjunction with a generic cluster management system like linuxha/heartbeat, vcs, etc.  

the most difficult part is reliably determining that A) the master has crashed, and B) fencing the failed old master so it doesn't wake up and think its still in charge.



-- 
john r pierce, recycling bits in santa cruz

Re: Auotmated postgres failover

От
Andrew Sullivan
Дата:
On Thu, Jan 21, 2016 at 11:34:18AM -0800, John R Pierce wrote:
> the most difficult part is reliably determining that A) the master has
> crashed, and B) fencing the failed old master so it doesn't wake up and
> think its still in charge.
>

And, depending on your workload, C) that you actually want to fail over.

I've seen an awful lot of people want automatic failover who also
can't afford for the already-committed transactions on the master to
be lost.  Unless you're running synchronous, be sure you have the
workload that can actually accept lost writes.

A

--
Andrew Sullivan
ajs@crankycanuck.ca


Re: Auotmated postgres failover

От
Jehan-Guillaume de Rorthais
Дата:
Le Thu, 21 Jan 2016 11:34:18 -0800,
John R Pierce <pierce@hogranch.com> a écrit :

> On 1/21/2016 11:07 AM, jwiencek3@comcast.net wrote:
> >
> >
> > I'm looking for a tool to automate PostgreSQL cluster management
> > failover in the event the master database were to become unavailable.
> >   Currently are manually issuing a "pg_ctl promote"  once we become
> > aware that the master database has crashed.
> >
> >
> > Is repmgr a via solution?  Please pass along experiences with repmgr.
> >
> > Are there any other  tools available to automatically issue the
> > "promote" in the event of a master database crash?

Yes, 3 different Pacemaker resource agents exist for PostgreSQL:

 * official one, in the package "resource-agents" on most linux distribs.
   This one is pretty complex and support multistate and stateless setup.
 * a simple, stupid, easy and stateless, agent:
   https://github.com/dalibo/pgsql-resource-agent/tree/master/stateless
   This one is fine for a 2 node cluster
 * a multistate-aware agent:
   https://github.com/dalibo/pgsql-resource-agent/tree/master/multistate
   This one is nice for multi-node cluster, searching for the best known slave
   to elect after a master lost.

Some important docs are available in the pgsql-resource-agent (PRA) repo:
 * https://github.com/dalibo/pgsql-resource-agent/blob/master/FENCING.md
 * the stateless:
   https://github.com/dalibo/pgsql-resource-agent/blob/master/stateless/README.md
 * the multistate:
   https://github.com/dalibo/pgsql-resource-agent/blob/master/multistate/README.md
   https://github.com/dalibo/pgsql-resource-agent/blob/master/multistate/INSTALL.md
   https://github.com/dalibo/pgsql-resource-agent/blob/master/multistate/docs/Quick_Start.md

> repmgr is a tool you could use in conjunction with a generic cluster
> management system like linuxha/heartbeat, vcs, etc.
>
> the most difficult part is reliably determining that A) the master has
> crashed, and B) fencing the failed old master so it doesn't wake up and
> think its still in charge.

+1
--
Jehan-Guillaume de Rorthais
Dalibo