Manual failover cluster

From: Hispaniola Sol
Subject: Manual failover cluster
Date:
Msg-id: SA1PR13MB5023BFAC9954144ECBE2181184C19@SA1PR13MB5023.namprd13.prod.outlook.com
Responses: Re: Manual failover cluster (Ninad Shah <nshah.postgres@gmail.com>)
Re: Manual failover cluster (Saul Perdomo <saul.perdomo@gmail.com>)
List: pgsql-general
Team,

I have a PG 10 cluster with a master and two hot-standby nodes. There is a requirement for a manual failover (nodes switching roles) at will. This is a vanilla three-node PG cluster built with WAL archiving (to a central location) and streaming replication to the two hot standbys. The failover is scripted in Ansible: it rewrites and moves around the archive/restore scripts, the conf files, and the trigger file, then calls `pg_ctlcluster` to stop/start the instances. This part _seems_ to be doing its job fine.

The issue I am struggling with is the apparent fragility of the process: all three nodes end up in a "good" state after the switch only about every other time. The rest of the time I have to re-seed a hot standby from the new master with pg_basebackup. The problems are mostly with the nodes that end up as standbys after the role switch: they get errors such as timeline mismatches, recovering from the same WAL segment over and over, "invalid resource manager ID in primary checkpoint record", etc.
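For context, the timeline-mismatch symptoms above are commonly seen when a standby is not configured to follow the new primary's timeline after a promotion. A minimal recovery.conf sketch for a PG 10 standby is below; the hostname, user, archive path, and trigger-file path are hypothetical placeholders, not taken from my setup:

    # recovery.conf on a PG 10 standby (hypothetical host/paths)
    standby_mode = 'on'
    primary_conninfo = 'host=new-primary.example.com port=5432 user=replicator'
    restore_command = 'cp /central/archive/%f %p'    # fetch from the central WAL archive
    recovery_target_timeline = 'latest'              # follow timeline switches after promotion
    trigger_file = '/var/lib/postgresql/10/main/failover.trigger'

Without `recovery_target_timeline = 'latest'`, a standby repointed at a freshly promoted primary may keep replaying the old timeline and loop on the same WAL.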

In this light, I am wondering: using only what PostgreSQL itself offers, i.e. streaming WAL replication with log shipping, can I expect this kind of failover to be 100% reliable on the PG side? Is anyone doing this reliably on PostgreSQL 10.1x?

Thanks !

Moishe
