Josh Berkus <josh@agliodbs.com> writes:
> HOWEVER, we've already kind of set up an indeterminate situation with
> allowing sync rep groups and candidate sync rep servers. Consider this:
> 1. Master server A is configured with sync replica B and candidate sync
> replica C
> 2. A rolling power/network failure event occurs, which causes B and C to
> go down sometime before A, and all of them to go down before the
> application does.
> 3. On restore, only C is restorable; both A and B are a total loss.
> Again, we have no way to know whether or not C was in sync replication
> when it went down. If C went down before B, then we've lost data; if B
> went down before C, we haven't. But we can't find out. *This* is where
> it would be useful to have C log whenever it went into (or out of)
> synchronous mode.

Good point, but C can't solve this for you just by logging. If C was the
first to go down, it has no way to know whether A and B committed more
transactions before dying; and it's unlikely to have logged its own crash,
either.

More fundamentally, if you want to survive the failure of M out of N
nodes, you need a sync configuration that guarantees data is on at least
M+1 nodes before reporting commit. The above example doesn't meet that,
so it's not surprising that you're screwed.
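
The M+1 rule can be checked by brute force for the three-node example. The following is only an illustrative sketch: the function name survives_any_failure and the model (a commit is "safe" if at least one copy survives every possible set of M lost nodes) are assumptions made for the illustration, not anything PostgreSQL provides.

```python
from itertools import combinations

def survives_any_failure(nodes, copies, m):
    """Return True if a commit held on `copies` nodes survives the loss
    of any `m` nodes, whichever nodes happen to hold the copies.

    Hypothetical helper for illustration only.
    """
    for holders in combinations(nodes, copies):
        for lost in combinations(nodes, m):
            # The commit is gone if every node holding it was lost.
            if set(holders) <= set(lost):
                return False
    return True

nodes = ["A", "B", "C"]

# Commit reported once the master and one sync replica have the data
# (2 copies), then two nodes are lost, as in the scenario above.
print(survives_any_failure(nodes, copies=2, m=2))  # False

# Data on M+1 = 3 nodes before reporting commit: safe against 2 losses.
print(survives_any_failure(nodes, copies=3, m=2))  # True
```

With only two copies guaranteed, the lost pair can be exactly the pair that held the commit, which is the A-and-B case in the example.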

What we lack, and should work on, is a way for sync mode to have M larger
than one, that is, to wait for more than one up-to-date replica. AFAICS,
right now we'll report commit as soon as there's one up-to-date replica,
and some high-reliability cases are going to want more.
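
As a rough sketch of what "more than one" could mean at commit time: report commit only once a configurable number of standbys have confirmed the commit's WAL position. Everything here is hypothetical (the function can_report_commit, the required_acks parameter, and the use of plain integers as WAL positions); it is not an existing PostgreSQL interface or setting.

```python
def can_report_commit(commit_lsn, standby_flush_lsns, required_acks):
    """Return True once at least `required_acks` standbys have flushed
    WAL up to `commit_lsn`.

    Hypothetical sketch: WAL positions are modeled as plain integers and
    `standby_flush_lsns` maps a standby name to its last confirmed position.
    """
    acked = sum(1 for lsn in standby_flush_lsns.values() if lsn >= commit_lsn)
    return acked >= required_acks

# B is caught up to the commit, C is lagging behind it.
standbys = {"B": 1000, "C": 900}

# Behaviour described above: one up-to-date replica is enough.
print(can_report_commit(1000, standbys, required_acks=1))  # True

# Requiring two replicas would hold the commit until C catches up.
print(can_report_commit(1000, standbys, required_acks=2))  # False
```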

regards, tom lane