Re: Issues with Quorum Commit

From	Greg Smith
Subject	Re: Issues with Quorum Commit
Date	7 October 2010, 16:42:36
Msg-id	4CADF853.30701@2ndquadrant.com
In reply to	Re: Issues with Quorum Commit (Markus Wanner <markus@bluegap.ch>)
Replies	Re: Issues with Quorum Commit (Markus Wanner <markus@bluegap.ch>); Re: Issues with Quorum Commit (Josh Berkus <josh@agliodbs.com>)
List	pgsql-hackers

Markus Wanner wrote:
> I think that's a pretty special case, because the "good alerting system"
> is at least as expensive as another server that just persistently stores
> and ACKs incoming WAL.
>   

The cost of hardware capable of running a database server is a large 
multiple of what you can build an alerting machine for.  I have two 
systems at my house alone that are headed for the trash heap as far as 
my main work goes, but that are fully capable of running an alerting 
system.  Building a production quality database server requires a more 
significant investment:  high quality disks, ECC RAM, battery-backed 
RAID controller, etc.  Relative to what the hardware in a database 
server costs, what you need to build an alerting system is almost free.  
Oh:  and most businesses that are complicated enough to need a serious 
database server already have an alerting system, so it actually costs 
nothing beyond the software setup time to point it toward the databases, 
too.

> Why does one ever want the guarantee that sync replication gives to only
> hold true up to one failure, if a better guarantee doesn't cost anything
> extra? (Note that a "good alerting system" is impossible to achieve with
> only two servers. You need a third device anyway).
>   

I do not disagree with your theory or reasoning.  But as a practical 
matter, I'm afraid the true cost of the better guarantee you're 
suggesting here is additional code complexity that will likely cause 
this feature to miss 9.1 altogether.  As far as I'm concerned, this 
whole diversion into the topic of quorum commit is only consuming 
resources away from targeting something achievable in the time frame of 
a single release.

> Sync replication between really just two servers is asking for trouble
> and certainly not worth the savings in hardware cost. Better invest in a
> good UPS and redundant power supplies for a single server.
>   

I wish I could give you the long list of data recovery projects I've 
worked on over the last few years, so you could really appreciate how 
much what you're saying here is exactly the opposite of reality.  You 
cannot make a single server reliable enough to survive all of 
the things that Murphy's Law will inflict upon it, at any price.  For 
most of the businesses I work with who want sync rep, data is not 
considered safe until the second copy is on storage miles away from the 
original, because they know this too.

Personal anecdote I can share:  I used to have an important project 
related to stock trading where I kept my backup system about 50 miles 
away from me.  I was aiming for constant availability, while still being 
able to drive to the other server if needed for disaster recovery.  
Guess what?  Even those two turned out not to be nearly independent 
enough; see http://en.wikipedia.org/wiki/Northeast_Blackout_of_2003 for 
details of how I lost both of those at the same time for days.  Silly 
me, I'd only spread them across two adjacent states with different power 
providers!  Not nearly good enough to avoid a correlated failure.

-- 
Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
