Re: Synchronization levels in SR

From Heikki Linnakangas
Subject Re: Synchronization levels in SR
Date
Msg-id 4BFD5282.7030501@enterprisedb.com
In response to Re: Synchronization levels in SR  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Synchronization levels in SR  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: Synchronization levels in SR  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Synchronization levels in SR  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
On 26/05/10 18:31, Robert Haas wrote:
> And frankly, I don't think it's possible for quorum commit to reduce
> the number of parameters.  Even if we have that feature available, not
> everyone will want to use it.  And the people who don't will
> presumably need whatever parameters they would have needed if quorum
> commit hadn't been available in the first place.

Agreed, quorum commit is not a panacea.

For example, suppose that you have two servers, master and a standby, 
and you want transactions to be synchronously committed to both, so that 
in the event of a meteor striking the master, you don't lose any 
transactions that have been replied to the client as committed.

Now you want to set up a temporary replica of the master at a 
development server, for testing purposes. If you set quorum to 2, your 
development server becomes critical infrastructure, which is not what 
you want. If you set quorum to 1, it also becomes critical 
infrastructure, because it's possible that a transaction has been 
replicated to the test server but not the real production standby, and a 
meteor strikes.

Per-standby settings would let you express that. OTOH, they can't express 
the quorum behavior where you require N out of M standbys to acknowledge 
the commit before returning to the client.
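
To make the difference concrete, here is a rough sketch of the two rules 
applied to the development-server scenario above. The function names and 
the standby names "standby" and "dev" are made up for illustration; 
nothing like this exists in PostgreSQL.

```python
def quorum_ok(acks, quorum):
    """Commit may return once at least `quorum` standbys have acknowledged."""
    return len(acks) >= quorum


def per_standby_ok(acks, required):
    """Commit may return once every standby marked as required has acknowledged."""
    return required <= acks


# Case 1: only the development replica has acknowledged so far.
only_dev = {"dev"}
quorum_1_dev = quorum_ok(only_dev, 1)                     # returns to the client too early
per_standby_dev = per_standby_ok(only_dev, {"standby"})   # keeps waiting for production

# Case 2: the development replica is down, the production standby has acknowledged.
only_standby = {"standby"}
quorum_2_standby = quorum_ok(only_standby, 2)                     # blocked on the dev server
per_standby_standby = per_standby_ok(only_standby, {"standby"})   # fine
```

With a bare count, either setting makes the development server matter; 
naming the required standby avoids that, at the cost of losing N-out-of-M.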

There's really no limit to how complex a setup can be. For example, 
imagine that you have two data centers, with two servers in each. You 
want to replicate the master to all four servers, but for commit to 
return to the client, it's enough that the transaction has been 
replicated to one server in each data center. How do you express that in 
the config file? And it would be nice to have per-transaction control 
too, like with synchronous_commit...
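
As a sketch of what such a rule would have to evaluate, assuming the four 
servers are grouped by data center (the group and server names here are 
hypothetical, and this is not a proposed config syntax):

```python
def commit_may_return(acks, groups, needed_per_group=1):
    """True once every group has at least `needed_per_group` acknowledgements."""
    return all(len(acks & members) >= needed_per_group
               for members in groups.values())


# Hypothetical layout: two data centers with two servers each.
groups = {
    "dc1": {"a1", "a2"},
    "dc2": {"b1", "b2"},
}

same_dc = commit_may_return({"a1", "a2"}, groups)     # both acks from dc1 only
one_in_each = commit_may_return({"a1", "b2"}, groups)

# A plain "2 out of 4" quorum cannot tell these two cases apart.
plain_quorum_same_dc = len({"a1", "a2"}) >= 2
```

The predicate itself is trivial; the hard part is a configuration format 
that can describe the groups without becoming unwieldy.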

So this is a tradeoff between
* flexibility: how complex a setup can you express?
* code complexity: how complicated is it to implement?
* user-friendliness: how easy is it to configure?

One way out of this is to implement something very simple in PostgreSQL, 
and build external WAL proxying tools in pgfoundry that allow you to 
cascade and disseminate the WAL in as complex scenarios as you want.

>> Your reply has again avoided the subject of how we would handle failure
>> modes with per-standby settings. That is important.
>
> I don't think anyone is avoiding that, we just haven't discussed it.
> The thing is, I don't think quorum commit actually does anything to
> address that problem.  If I have a master and a standby configured for
> sync rep and the standby goes down, we have to decide what impact that
> has on the master.  If I have a master and two standbys configured for
> sync rep with quorum commit such that I only need an ack from one of
> them, and they both go down, we still have to decide what impact that
> has on the master.  I agree we need to talk about, but I don't agree
> that putting in quorum commit will remove the need to design that
> case.

Right, failure modes need to be discussed, but how quorum commit or 
whatnot is configured is irrelevant to that.

No one has come up with a scheme for aborting a transaction if you 
don't get a reply from a synchronous standby (or all standbys or a 
quorum of standbys). Until someone does, a commit on the master will 
have to always succeed. The "synchronous" aspect will provide a 
guarantee that if a standby is connected, any transaction in the master 
will become visible (or fsync'd or just streamed to, depending on the 
level) on the standby too before it's acknowledged as committed to the 
client, nothing more, nothing less.
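
In other words, the master would wait only for standbys that are 
currently connected. A minimal sketch of that rule, with made-up names 
and ignoring the different synchronization levels:

```python
def standbys_to_wait_for(sync_standbys, connected):
    """Only synchronous standbys that are currently connected are waited for."""
    return sync_standbys & connected


def can_acknowledge(sync_standbys, connected, acks):
    """True once every connected synchronous standby has acknowledged the commit."""
    return standbys_to_wait_for(sync_standbys, connected) <= acks


sync_standbys = {"standby1", "standby2"}

# standby2 is disconnected, so only standby1's acknowledgement is needed.
partial = can_acknowledge(sync_standbys, {"standby1"}, {"standby1"})

# No standby is connected at all: the commit is acknowledged right away.
none_connected = can_acknowledge(sync_standbys, set(), set())

# standby1 is connected but has not acknowledged yet: keep waiting.
waiting = can_acknowledge(sync_standbys, {"standby1"}, set())
```

Note that the "none connected" case succeeds immediately, which is exactly 
the "commit on the master always succeeds" behavior described above.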

One way to do that would be to refrain from flushing the commit record 
to disk on the master until the standby has acknowledged it. The 
downside is that the master is in a very severe state at that point: 
until you flush the WAL, you can buffer only a small amount of WAL 
traffic before you run out of wal_buffers, which stalls all write 
activity in the master, with backends waiting. You can't even shut down the server 
cleanly. But if you value your transaction integrity much higher than 
availability, maybe that's what you want.
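
A toy model of that stall, assuming a fixed number of buffer slots and 
that nothing can be flushed until the standby acknowledges it. The class 
and its methods are invented for illustration and do not mirror the 
actual WAL buffer code:

```python
class ToyWalBuffer:
    """Fixed-size buffer whose contents can be flushed only after a standby ack."""

    def __init__(self, capacity):
        self.capacity = capacity   # number of WAL records that fit in the buffer
        self.unflushed = 0         # records written but not yet flushed

    def try_write(self, n=1):
        """Return False if the records don't fit, i.e. the backend would wait."""
        if self.unflushed + n > self.capacity:
            return False
        self.unflushed += n
        return True

    def standby_ack(self, n):
        """The standby acknowledged n records, so the master may flush them."""
        flushed = min(n, self.unflushed)
        self.unflushed -= flushed
        return flushed


buf = ToyWalBuffer(capacity=3)
first_three = [buf.try_write() for _ in range(3)]   # these fit in the buffer
stalled = buf.try_write()                           # buffer full, writer must wait
buf.standby_ack(2)                                  # standby catches up
resumed = buf.try_write()                           # room again
```

If the standby never acknowledges, every writer ends up stuck in the 
"stalled" state.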

PS. I whole-heartedly agree with Simon's concern upthread that if we 
allow a standby to specify in its config file that it wants to be a 
synchronous standby, that's a bit dangerous because connecting such a 
standby to the master will suddenly make all commits on the master a lot 
slower. Adding a synchronous standby should require some action in the 
master, since it affects the behavior on master.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

