Re: Synchronous Standalone Master Redoux

From: Josh Berkus
Subject: Re: Synchronous Standalone Master Redoux
Msg-id 500206AE.1090002@agliodbs.com
In reply to: Re: Synchronous Standalone Master Redoux  (Jose Ildefonso Camargo Tolosa <ildefonso.camargo@gmail.com>)
Responses: Re: Synchronous Standalone Master Redoux  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
So, here's the core issue with degraded mode.  I'm not mentioning this
to block any patch anyone has, but rather out of a desire to see someone
address this core problem with some clever idea I've not thought of.
The problem in a nutshell is: indeterminacy.

Assume someone implements degraded mode.  Then:

1. Master has one synchronous standby, Standby1, and two asynchronous,
Standby2 and Standby3.

2. Standby1 develops a NIC problem and is in and out of contact with
Master.  As a result, it's flipping in and out of synchronous / degraded
mode.

3. Master fails catastrophically due to a RAID card meltdown.  All data
lost.

At this point, the DBA is in kind of a pickle, because he doesn't know:

(a) Was Standby1 in synchronous or degraded mode when Master died?  The
only log for that was on Master, which is now gone.

(b) Is Standby1 actually the most caught up standby, and thus the
appropriate new master for Standby2 and Standby3, or is it behind?

With the current functionality of Synchronous Replication, you don't
have either piece of indeterminacy, because some external management
process (hopefully located on another server) needs to disable
synchronous replication when Standby1 develops its problem.  That is, if
the master is accepting synchronous transactions at all, you know that
Standby1 is up-to-date, and no data is lost.

While you can answer (b) by checking all servers, (a) is particularly
pernicious, because unless you have the application log all "operating
in degraded mode" messages, there is no way to ever determine the truth.
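
To make "checking all servers" for (b) concrete, here is a minimal sketch
in Python.  It assumes you have already collected each surviving standby's
replay position as an LSN string (for example from
pg_last_xlog_replay_location() on each one); the positions below are
invented for illustration, and the helper names are hypothetical.  Note
that this only tells you which standby is furthest ahead.  It says nothing
about (a), i.e. whether any of them has everything the master acknowledged.

```python
def parse_lsn(text):
    """Convert an LSN string such as '0/3000060' into one integer.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit position.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def most_caught_up(positions):
    """Return the (name, lsn_text) pair with the highest position.

    'positions' maps a standby name to its replay LSN string.
    Comparing the parsed integers avoids the mistakes that a plain
    string comparison would make (e.g. '0/FF000000' vs '1/1000000').
    """
    return max(positions.items(), key=lambda item: parse_lsn(item[1]))


# Invented sample positions, not taken from any real cluster.
positions = {
    "Standby1": "0/3000060",
    "Standby2": "0/30000A8",
    "Standby3": "0/2FFFFF0",
}

name, lsn = most_caught_up(positions)
print(name, lsn)
```

With these made-up numbers the asynchronous Standby2 turns out to be ahead
of Standby1, which is exactly the situation the DBA cannot rule out
without looking.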

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



