Re: Synchronous Standalone Master Redoux

From Shaun Thomas
Subject Re: Synchronous Standalone Master Redoux
Date
Msg-id 4FFECF44.4010409@optionshouse.com
Whole thread Raw
In response to Re: Synchronous Standalone Master Redoux  (Daniel Farina <daniel@heroku.com>)
Responses Re: Synchronous Standalone Master Redoux  (Aidan Van Dyk <aidan@highrise.ca>)
Re: Synchronous Standalone Master Redoux  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On 07/12/2012 12:31 AM, Daniel Farina wrote:

> But RAID-1 as nominally seen is a fundamentally different problem,
> with much tinier differences in latency, bandwidth, and connectivity.
> Perhaps useful for study, but to suggest the problem is *that* similar
> I think is wrong.

Well, yes and no. One of the reasons I brought up DRBD was because it's 
basically RAID-1 over a network interface. It's not without overhead, 
but a few basic pgbench tests show it's still 10-15% faster than a 
synchronous PG setup for two servers in the same rack. Greg Smith's 
tests show that beyond a certain point, a synchronous PG setup 
effectively becomes untenable simply due to network latency in the 
protocol implementation. In reality, it probably wouldn't be usable 
beyond two servers in different datacenters in the same city.

RAID-1 was the model for DRBD, but I brought it up only because it's 
pretty much the definition of a synchronous commit that degrades 
gracefully. I'd even suggest it's more important in a network context 
than for RAID-1, because you're far more likely to get sync 
interruptions due to network issues than you are for a disk to fail.

> But, putting that aside, why not write a piece of middleware that
> does precisely this, or whatever you want? It can live on the same
> machine as Postgres and ack synchronous commit when nobody is home,
> and notify (e.g. page) you in the most precise way you want if nobody
> is home "for a while".

You're right that there are lots of ways to kinda get this ability, but 
none of them are mature or capable enough to really matter. Tailing the 
log to watch for secondary disconnect is too slow. Monit- or Nagios-style 
checks are too slow and unreliable. A custom-built middle layer (a 
master-slave plugin for Pacemaker, for example) is too slow. All of these 
rely on some kind of check interval. Set that too high, and at our volume 
we miss 10,000 × n transactions over an n-second outage. Set it too low, 
and we increase the likelihood of false positives and unnecessary 
detachments.
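As a back-of-envelope illustration of that tradeoff (the 10,000 
transactions-per-second figure is the one above; the function itself is 
just mine for illustration):

```python
# Rough cost of a polling-based detach: every commit issued between the
# standby dying and the monitor's next check hangs waiting for an ack.
def hung_commits(tps, check_interval_s):
    """Worst-case commits left waiting on a dead sync standby before a
    poll-based monitor can notice and react."""
    return tps * check_interval_s

# At 10,000 tps, a 5-second Nagios-style interval strands 50,000 commits.
# Shrinking the interval reduces that, but raises the odds that a
# transient network blip triggers an unnecessary detachment instead.
```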

If it's possible through a PG 9.x extension, that'd probably be the way 
to *safely* handle it as a bolt-on solution. If the original author of 
the patch can convert it to such a beast, we'd install it approximately 
five seconds after it finished compiling.

So far as transaction durability is concerned... we have a continuous 
background rsync over dark fiber for archived transaction logs, DRBD for 
block-level sync, filesystem snapshots for our backups, a redundant 
async DR cluster, an offsite backup location, and a tape archival 
service stretching back for seven years. And none of that will cause the 
master to stop processing transactions unless the master itself dies and 
triggers a failover.

Using PG sync in its current incarnation would introduce an extra 
failure scenario that wasn't there before. I'm pretty sure we're not the 
only ones avoiding it for exactly that reason. Our queue discards 
messages it can't fulfil within ten seconds and then throws an error for 
each one. We need to decouple the secondary as quickly as possible if it 
becomes unresponsive, and there's really no way to do that without 
something in the database, one way or another.

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com



______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email

