Re: Synchronization levels in SR

From: Simon Riggs
Subject: Re: Synchronization levels in SR
Msg-id: 1274894788.6203.3279.camel@ebony
In reply to: Re: Synchronization levels in SR  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Synchronization levels in SR  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Wed, 2010-05-26 at 11:31 -0400, Robert Haas wrote:
> > Your reply has again avoided the subject of how we would handle failure
> > modes with per-standby settings. That is important.
> 
> I don't think anyone is avoiding that, we just haven't discussed it.

You haven't discussed it, yet even before you do, you know it's better.
That's not a very compelling perspective...

> The thing is, I don't think quorum commit actually does anything to
> address that problem.  If I have a master and a standby configured for
> sync rep and the standby goes down, we have to decide what impact that
> has on the master.  If I have a master and two standbys configured for
> sync rep with quorum commit such that I only need an ack from one of
> them, and they both go down, we still have to decide what impact that
> has on the master.  

That's already been discussed, and AFAIK Masao and I already agreed on
how that would be handled in the quorum commit case.

What we haven't had explained is how you would handle all the sub cases
or failure modes for the per-standby situation.

The most common case for synch rep IMHO is this:

* 2 near standbys, 1 remote. We want to commit as soon as the first near
standby ACKs or, if both near standbys are down, as soon as the remote ACKs.

I've proposed a way of specifying that with 3 simple parameters, e.g.
synch_rep_acks = 1
synch_rep_timeout = 30
synch_rep_timeout_action = commit
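
To make the intended semantics concrete, here is a minimal sketch of how
those three parameters could decide the fate of one commit. It is only an
illustration, not proposed code: the function name, the tuple it returns
and the "abort" action value are assumptions of mine; only the three
parameter names and the "commit" value come from the settings above.

```python
from typing import Dict, Optional, Tuple


def commit_outcome(
    ack_delays: Dict[str, Optional[float]],
    synch_rep_acks: int = 1,
    synch_rep_timeout: float = 30.0,
    synch_rep_timeout_action: str = "commit",
) -> Tuple[bool, float, str]:
    """Return (committed, seconds_waited, reason) for a single transaction.

    ack_delays maps a standby name to the number of seconds until its ACK
    arrives, or None if that standby is down. This is a toy model, not
    how a real server would track standbys.
    """
    if synch_rep_acks <= 0:
        # No ACKs requested: behaves like asynchronous replication.
        return (True, 0.0, "no acks required")

    # Only ACKs that arrive within the timeout can count towards the quorum.
    in_time = sorted(
        delay
        for delay in ack_delays.values()
        if delay is not None and delay <= synch_rep_timeout
    )

    if len(in_time) >= synch_rep_acks:
        # The commit completes when the N-th ACK arrives, from any standby.
        return (True, in_time[synch_rep_acks - 1], "quorum")

    # Not enough ACKs in time: fall back to the configured action.
    # "commit" is from the proposal; anything else is treated here as a
    # hypothetical "abort".
    committed = synch_rep_timeout_action == "commit"
    return (committed, synch_rep_timeout, "timeout")


if __name__ == "__main__":
    # Both near standbys up: the first near ACK wins.
    print(commit_outcome({"near1": 0.2, "near2": 0.5, "remote": 3.0}))
    # Both near standbys down: the remote's ACK satisfies the quorum.
    print(commit_outcome({"near1": None, "near2": None, "remote": 3.0}))
    # Everything down: wait for the timeout, then apply the action.
    print(commit_outcome({"near1": None, "near2": None, "remote": None}))
```

Note that no standby is named anywhere in the settings: whichever standby
answers first counts, which is what lets the remote take over when both
near standbys are down.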

In Oracle this would require all of the following:

* all nodes given unique names
DB_UNIQUE_NAME=master
DB_UNIQUE_NAME=near1
DB_UNIQUE_NAME=near2
DB_UNIQUE_NAME=remote

* parameter settings
LOG_ARCHIVE_CONFIG='DG_CONFIG=(master,near1, near2, remote)'

LOG_ARCHIVE_DEST_2='SERVICE=near1 SYNC AFFIRM NET_TIMEOUT=30
DB_UNIQUE_NAME=near1'
LOG_ARCHIVE_DEST_STATE_2='ENABLE'

LOG_ARCHIVE_DEST_3='SERVICE=near2 SYNC AFFIRM NET_TIMEOUT=30
DB_UNIQUE_NAME=near2'
LOG_ARCHIVE_DEST_STATE_3='ENABLE'

LOG_ARCHIVE_DEST_4='SERVICE=remote ASYNC NOAFFIRM DB_UNIQUE_NAME=remote'
LOG_ARCHIVE_DEST_STATE_4='ENABLE'

* modes
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;


The Oracle way doesn't allow you to specify that if near1 and near2 are
down then we should continue to SYNC via remote, nor does it allow you
to specify things from the user's perspective or at the transaction level.

You don't need to do it that way, for sure. But we do need to say what
way you would pick, rather than just arguing against me before you've
even discussed it here or off-list.

> I agree we need to talk about, but I don't agree
> that putting in quorum commit will remove the need to design that
> case.

Yes, you need to design for that case. It's not a magic wand.

All I've said is that covering the common cases is easier and more
flexible if we choose a transaction-centric style of parameters, and it
also allows user-settable behaviour.

I want to do better than Oracle, if possible, using lessons learned. I
don't want to do the same thing because we're copying them or because
we're going down the same conceptual dead end they went down. We should
try to avoid doing something obvious and aim a little higher.

-- Simon Riggs           www.2ndQuadrant.com
