Re: Synchronous Standalone Master Redoux

From Jose Ildefonso Camargo Tolosa
Subject Re: Synchronous Standalone Master Redoux
Date
Msg-id CAETJ_S8FbJNRgcZMoomQvzyrYxEquhF91ghpkQQyoEZyhUp99w@mail.gmail.com
In response to Re: Synchronous Standalone Master Redoux  (Hampus Wessman <hampus@hampuswessman.se>)
List pgsql-hackers
Hi Hampus,

On Fri, Jul 13, 2012 at 2:42 AM, Hampus Wessman <hampus@hampuswessman.se> wrote:
> Hi all,
>
> Here are some (slightly too long) thoughts about this.

Nah, not that long.

>
> Shaun Thomas skrev 2012-07-12 22:40:
>
>> On 07/12/2012 12:02 PM, Bruce Momjian wrote:
>>
>>> Well, the problem also exists if we add it as an internal database
>>> feature --- how long do we wait to consider the standby dead, how do
>>> we inform administrators, etc.
>>
>>
>> True. Though if there is no secondary connected, either because it's not
>> there yet, or because it disconnected, that's an easy check. It's the
>> network lag/stall detection that's tricky.
>
>
> It is indeed tricky to detect this. If you don't get an (immediate) reply
> from the secondary (and you never do!), then all you can do is wait and
> *eventually* (after how long? 250ms? 10s?) assume that there is no
> connection between them. The conclusion may very well be wrong sometimes. A
> second problem is that we still don't know if this is caused by some kind of
> network problems or if it's caused by the secondary not running. It's
> perfectly possible that both servers are working, but just can't communicate
> at the moment.

How about using the same logic it currently uses to detect that the
"designated" synchronous standby is no longer there and to move on to
the next one in synchronous_standby_names?

The rule to *know* that a standby went away is already there.
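
A rough sketch of that selection rule in Python (illustrative only,
not PostgreSQL source code). It models the documented behavior where
the first name in synchronous_standby_names that is currently
connected becomes the synchronous standby. The function name, the
standby names, and the "None means standalone" convention are
assumptions made for this example; returning None stands for what is
being proposed in this thread, whereas today commits would simply wait.

```python
from typing import Iterable, Optional, Sequence


def pick_sync_standby(standby_names: Sequence[str],
                      connected: Iterable[str]) -> Optional[str]:
    """Return the first listed standby that is connected, else None.

    None models the proposal in this thread (carry on as a standalone
    master); it is not what PostgreSQL does today, where commits wait.
    """
    connected_set = set(connected)
    for name in standby_names:
        if name in connected_set:
            return name
    return None


names = ["standby_a", "standby_b"]  # hypothetical standby names
print(pick_sync_standby(names, {"standby_a", "standby_b"}))
print(pick_sync_standby(names, {"standby_b"}))
print(pick_sync_standby(names, set()))
```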

>
> The thing is that what we do next (at least if our data is important and why
> otherwise use synchronous replication of any kind...) depends on what *did*
> happen. Assume that we have two database servers. At any time we need at
> most one primary database to be running. Without that requirement our data
> can get messed up completely... If HA is important to us, we may choose to

Not necessarily, but true: that's why you usually kill the (failing?)
node when promoting the standby, just in case.

> do a failover to the secondary (and live without replication for the moment)
> if the primary fails. With synchronous replication, we can do this without
> losing any data. If the secondary also dies, then we do lose data (and we'll
> know it!), but it might be an acceptable risk. If the secondary isn't
> permanently damaged, then we might even be able to get the data back after
> some down time. Ok, so that's one way to reconfigure the database servers on
> a failure. If the secondary fails instead, then we can do similarly and
> remove it from the "cluster" (or in other words, disable synchronous
> replication to the secondary). Again, we don't lose any data by doing this.

Right, but you have to monitor the standby too! i.e., more work on the
Pacemaker side... and non-trivial work. For example, just blowing
away the standby won't do any good here. With the master, you can
just power it off, promote the standby, and be done with it! If the
standby fails, you have to modify the master's config and reload it
there... more code, more chances of failure.
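
To give an idea of the extra step the cluster manager has to carry
when the standby fails, here is a minimal Python sketch. It only
rewrites the text of a postgresql.conf so that
synchronous_standby_names is empty (which disables synchronous
replication). The function name and the sample config are assumptions
for illustration; a real resource agent would also have to write the
file back and trigger a reload, for example with "pg_ctl reload" or
"SELECT pg_reload_conf()".

```python
import re

# Matches an active (not commented-out) synchronous_standby_names line.
_SETTING = re.compile(r"^[ \t]*synchronous_standby_names[ \t]*=.*$",
                      re.MULTILINE)


def disable_sync_replication(conf_text: str) -> str:
    """Return conf_text with synchronous_standby_names set to ''."""
    new_line = "synchronous_standby_names = ''"
    if _SETTING.search(conf_text):
        return _SETTING.sub(new_line, conf_text)
    # Setting not present: append it, adding a newline first if needed.
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"


# Hypothetical config contents, for illustration only.
conf = ("wal_level = hot_standby\n"
        "synchronous_standby_names = 'standby_a'\n")
print(disable_sync_replication(conf))
```

Even this small piece has to be written, tested, and kept working on
the master side, which is the "more code" being referred to.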

> We're taking a certain risk, however. We can't safely do a failover to the
> secondary anymore... So if the primary fails now, then the only way not to
> lose data is to hope that we can get it back from the failed machine (the
> failure may be temporary).
>
> There's also the third possibility, of course, that the two servers are both
> up and running, but they can't communicate over the network at the moment
> (this is, by the way, a difference from RAID, I guess). What do we do then?

Kill the "failing" node, just in case. In this case, without the
"extra" work of monitoring the standby, you would just have the
standby kill the master before the standby is promoted.

> Well, we still need at most one primary database server. We'll have to
> (somehow, which doesn't matter as much) decide which database to keep and
> consider the other one "down". Then we can just do as above (with all the

This is arbitrary; we usually just assume the master is failing
when the standby is healthy (from the standby's point of view).

> same implications!). Is it always a good idea to keep the primary? No! What
> if you (as a stupid example) pull the network cable from the primary (or
> maybe turn off a switch so that it's isolated from most of the network)? In

That means that you failed to have redundant connectivity to the
standby (that is a must on clusters). Yes, redundant switches too: with
"smart switches" in the <US$100 range now, there is not much excuse for
not having 2 switches connecting your cluster (but, if you have just 2
nodes, you just need 2 network interfaces and 2 network cables).

> that case you probably want the secondary to take over instead. At least if
> you value service availability. At this point we can still do a safe
> failover too.
>
> My point here is that if HA is important to you, then you may very well want
> to disable synchronous replication on a failure to avoid down time, but this
> has to be integrated with your overall failover / cluster management
> solution. Just having the primary automatically disable synchronous

That's not a trivial matter: you have to monitor the standby and make
changes to the master's configuration.

> replication doesn't seem overly useful to me... If you're using synchronous
> replication to begin with, you probably want to *know* if you may have lost
> data or not. Otherwise, you will have to assume that you did and then you

Right, and you would know: when the standby node (or service) goes
down, the monitoring system can inform you... but it doesn't have to
change the master's config.

> could frankly have been running async replication all along. If you do

No, you can't, because during the 99.9% of the time when the standby
is healthy and connected, you would be at risk of losing transactions
if you ran async replication.

> integrate it with your failover solution, then you can keep track of when
> it's safe to do a failover and when it's not, however, and decide how to
> handle each case.

Of course you can, but it is more complex, and likely slower.  For
example, if the master detects that the standby disconnected (the TCP
connection was closed), it can just fall back to async until the
standby returns, then go through the "catch-up" process when it comes
back, and return to sync.  An external monitor will likely take, at
the very least, 1 second (up to 30 seconds, on most configurations) to
notice, make the change, and then reload the master's config.
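
The sequence described above can be sketched as a small state machine.
This is a Python illustration of the proposed behavior, not existing
PostgreSQL code; the mode names and event names are hypothetical
labels chosen for the example.

```python
from enum import Enum


class Mode(Enum):
    SYNC = "sync"        # commits wait for the standby
    ASYNC = "async"      # standalone: commits do not wait
    CATCHUP = "catchup"  # standby reconnected, not yet caught up


# (current mode, event) -> next mode; event names are hypothetical.
TRANSITIONS = {
    (Mode.SYNC, "standby_disconnected"): Mode.ASYNC,
    (Mode.ASYNC, "standby_connected"): Mode.CATCHUP,
    (Mode.CATCHUP, "standby_disconnected"): Mode.ASYNC,
    (Mode.CATCHUP, "standby_caught_up"): Mode.SYNC,
}


def next_mode(mode: Mode, event: str) -> Mode:
    """Apply one event; events that do not apply leave the mode as-is."""
    return TRANSITIONS.get((mode, event), mode)


mode = Mode.SYNC
for event in ["standby_disconnected", "standby_connected",
              "standby_caught_up"]:
    mode = next_mode(mode, event)
    print(event, "->", mode.value)
```

The point of the sketch is that every transition is driven by
something the master already observes on its own replication
connection, so no external monitor or config reload is involved.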

See, the main problem here is that, with current PostgreSQL behavior,
you have doubled the chances of service disruption: if the master
fails, there is the time the cluster takes to notice it and bring the
standby up (and, likely, kill the master), AND if the standby fails,
there is the time the cluster takes to notice it, change configs on
the master, and reload.

>
> How you decide what to do with the servers on failures isn't that important
> here, really. You can probably run e.g. Pacemaker on 3+ machines and have it
> check for quorums to accomplish this. That's a good approach at least. You
> can still have only 2 database servers (for cost reasons), if you want.
> PostgreSQL could have all this built-in, but I don't think it sounds overly
> useful to only be able to disable synchronous replication on the primary
> after a timeout. Then you can never safely do a failover to the secondary,
> because you can't be sure synchronous replication was active on the failed
> primary...

Or have a mixed cluster of application servers and DB servers, and
have them support each other for quorum.

And no, not after a timeout: immediately if the TCP socket is closed,
or otherwise with the same logic it uses to "switch" to another sync
standby.

--
Ildefonso Camargo
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579

