Re: Issues with Quorum Commit

From Greg Smith
Subject Re: Issues with Quorum Commit
Date
Msg-id 4CAE5B5B.2090600@2ndquadrant.com
In response to Re: Issues with Quorum Commit  (Markus Wanner <markus@bluegap.ch>)
Responses Re: Issues with Quorum Commit  ("Joshua D. Drake" <jd@commandprompt.com>)
Re: Issues with Quorum Commit  (Fujii Masao <masao.fujii@gmail.com>)
Re: Issues with Quorum Commit  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: Issues with Quorum Commit  (Markus Wanner <markus@bluegap.ch>)
Re: Issues with Quorum Commit  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Markus Wanner wrote:
> So far I've been under the impression that Simon already has the code
> for quorum_commit k = 1.
>
> What I'm opposing to is the timeout "feature", which I consider to be
> additional code, unneeded complexity and foot-gun.
>   

Additional code?  Yes.  Foot-gun?  Yes.  Timeout should be disabled by 
default so that you wait forever unless you ask for something 
different?  Probably.  Unneeded?  This is where we don't agree anymore.  
The example that Josh Berkus just sent to the list is a typical example 
of what I expect people to do here.  They'll use Sync Rep to maximize 
the odds a system failure doesn't cause any transaction loss.  They'll 
use good quality hardware on the master so it's unlikely to fail.  But 
when the database finds the standby unreachable, and it's left with the 
choice between either degrading into async rep or coming to a complete 
halt, you must give people the option of choosing to degrade instead 
after a timeout.  Let them set off the red flashing lights, sound the 
alarms, and pray the master doesn't go down until you can fix the 
problem.  But the choice to allow uptime concerns to win over the normal 
sync rep preferences, that's a completely valid business decision people 
will absolutely want to make in a way opposite of your personal 
preference here.

I don't see this as needing any implementation any more complicated than 
the usual way such timeouts are handled.  Note how long you've been 
trying to reach the standby.  Default to -1 for forever.  And if you hit 
the timeout, mark the standby as degraded and force them to do a proper 
resync when they disconnect.  Once that's done, they can re-enter 
sync rep mode, via the same process a new node would use.
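
To make the steps above concrete, here is a minimal sketch of that 
bookkeeping in Python.  It is not PostgreSQL code and not anyone's actual 
patch; the names StandbyTracker, StandbyState, timeout_s and the method 
names are all made up for illustration.  It only assumes what is described 
above: note when the standby became unreachable, treat -1 as wait forever, 
mark the standby degraded once the timeout is hit, and require a resync 
before it counts as synchronous again.

```python
import time
from enum import Enum


class StandbyState(Enum):
    SYNC = "sync"          # commits wait for this standby
    DEGRADED = "degraded"  # timed out; commits no longer wait for it


class StandbyTracker:
    """Hypothetical per-standby bookkeeping; an illustration only."""

    def __init__(self, timeout_s=-1, clock=time.monotonic):
        self.timeout_s = timeout_s     # -1 means wait forever (the default)
        self.clock = clock             # injectable so the logic can be tested
        self.state = StandbyState.SYNC
        self.unreachable_since = None  # None while the standby is reachable

    def note_unreachable(self):
        # Remember only the first moment we failed to reach the standby.
        if self.unreachable_since is None:
            self.unreachable_since = self.clock()

    def note_reachable(self):
        # Resets the timer, but a degraded standby stays degraded
        # until resync_complete() is called.
        self.unreachable_since = None

    def must_wait_for_standby(self):
        """Return True if a commit should still wait on this standby."""
        if self.state is StandbyState.DEGRADED:
            return False
        if self.unreachable_since is None or self.timeout_s < 0:
            return True
        if self.clock() - self.unreachable_since >= self.timeout_s:
            # Timeout hit: degrade and stop waiting.
            self.state = StandbyState.DEGRADED
            return False
        return True

    def resync_complete(self):
        # The standby re-enters sync mode the way a new node would.
        self.state = StandbyState.SYNC
        self.unreachable_since = None
```

With the default timeout_s of -1, must_wait_for_standby() never flips to 
False by itself, which is the wait-forever behavior.  Setting, say, 
timeout_s=30 is the explicit opt-in to degrade after 30 seconds.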

-- 
Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, "PostgreSQL 9.0 High Performance"    Pre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book


