Re: Issues with Quorum Commit

From Dimitri Fontaine
Subject Re: Issues with Quorum Commit
Date
Msg-id m21v82gunq.fsf@2ndQuadrant.fr
In reply to Re: Issues with Quorum Commit  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Issues with Quorum Commit  (Markus Wanner <markus@bluegap.ch>)
Re: Issues with Quorum Commit  (Aidan Van Dyk <aidan@highrise.ca>)
List pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Either that, or you configure your system for asynchronous replication
> first, and flip the switch to synchronous only after the standby has caught
> up. Setting up the first standby happens only once when you initially set up
> the system, or if you're recovering from a catastrophic loss of the
> standby.

Or if the standby is lagging and the master's wal_keep_segments is not
sized large enough. Is that a catastrophic loss of the standby too?

>> It's all about the standard case you're building, sync rep, and how to
>> manage errors. In most cases I want flexibility. Alert says standby is
>> down, you lost your durability requirements, so now I'm building a new
>> standby. Does it mean my applications are all off and the master
>> refusing to work?
>
> Yes. That's why you want to have at least two standbys if you care about
> availability. Or if durability isn't that important to you after all, use
> asynchronous replication.

Agreed, that's a nice simple use case.

Another one is to say that I want sync rep when the standby is
available, but I don't have the budget for more than one standby. So I
prefer a good alerting system and low-budget-no-guarantee when the
standby is down; that's my risk evaluation.

> Of course, if in the heat of the moment the admin is willing to forge ahead
> without the standby, he can temporarily change the configuration in the
> master. If you want the standby to be rebuilt automatically, you can even
> incorporate that configuration change in the scripts too. The important
> point is that you or your scripts are in control, and you know at all times
> whether you can trust the standby or not. If the master makes such decisions
> automatically, you don't know if the standby is trustworthy (ie. guaranteed
> up-to-date) or not.

My proposal is that the master has the information to make the decision,
and the behavior is something you set up. Default to security, so wait
forever and block the applications, but it could be set to ignore
standbys that have not yet reached at least a given state.

I don't see that you can make everybody happy without a knob here, and I
don't see how we can deliver one without a clear state diagram of the
standby's possible states and the transitions between them.
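
To make the knob and the state diagram a bit more concrete, here is a
rough sketch in Python. Everything in it is an assumption made for
illustration: the state names, the transitions and the min_state setting
are hypothetical and do not come from any patch or existing setting.

```python
from enum import IntEnum


class StandbyState(IntEnum):
    # Hypothetical states, ordered from least to most caught up.
    DOWN = 0
    BASE_BACKUP = 1
    CATCHING_UP = 2
    STREAMING_SYNC = 3


# Hypothetical transition table: which states can follow which.
TRANSITIONS = {
    StandbyState.DOWN: {StandbyState.BASE_BACKUP, StandbyState.CATCHING_UP},
    StandbyState.BASE_BACKUP: {StandbyState.CATCHING_UP, StandbyState.DOWN},
    StandbyState.CATCHING_UP: {StandbyState.STREAMING_SYNC, StandbyState.DOWN},
    StandbyState.STREAMING_SYNC: {StandbyState.CATCHING_UP, StandbyState.DOWN},
}


def can_transition(old, new):
    """True if the (assumed) state diagram allows moving from old to new."""
    return new in TRANSITIONS[old]


def standbys_to_wait_for(standbys, min_state=None):
    """Return the sorted names of the standbys a commit must wait for.

    min_state=None is the safe default: wait for every configured
    standby, whatever its state, so the applications block if one is
    down. With min_state set, standbys that have not reached at least
    that state are ignored.
    """
    if min_state is None:
        return sorted(standbys)
    return sorted(name for name, state in standbys.items() if state >= min_state)
```

With {"a": DOWN, "b": STREAMING_SYNC}, the default waits for both a and
b, while min_state=STREAMING_SYNC waits for b only.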

The other alternative is to just not care and accept the timeout as
being an option with the quorum, so that you just don't wait for the
quorum if you so want. It's much more dynamic and dangerous, but with a
good alerting system it'll be very popular, I guess.
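
As a sketch of that timeout alternative, again in Python and again with
made-up names (commit_may_proceed, timeout_s and the degraded flag are
assumptions, not an existing API):

```python
def commit_may_proceed(acks, quorum, waited_s, timeout_s=None):
    """Decide whether a waiting commit can be released.

    Returns a (proceed, degraded) pair. degraded=True means the commit
    is released without the quorum, which is the case the alerting
    system has to report.
    """
    if acks >= quorum:
        # Quorum reached: the normal, fully durable case.
        return True, False
    if timeout_s is not None and waited_s >= timeout_s:
        # Timeout expired before the quorum was reached: stop waiting.
        return True, True
    # No quorum and no expired timeout: keep waiting. With
    # timeout_s=None this means waiting forever.
    return False, False
```

The whole danger sits in the second branch: the application sees a
normal commit, and only the alert tells you the guarantee was dropped.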

> I don't see anything wrong with having tools for admins to deal with the
> unexpected. I'm not sure overriding individual transactions is very useful
> though, more likely you'll want to take the whole server offline, or you
> want to change the config to allow all transactions to continue without the
> synchronous standby.

The question then is, should the new configuration alter running
transactions? My implicit assumption was that it should not, and then I
need another facility, such as
  SELECT pg_cancel_quorum_wait(procpid)
    FROM pg_stat_activity
   WHERE waiting_quorum;

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

