Re: Sync Rep Design

From Heikki Linnakangas
Subject Re: Sync Rep Design
Date
Msg-id 4D1DCF5A.7070808@enterprisedb.com
In response to Re: Sync Rep Design  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Sync Rep Design  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Sync Rep Design  (Hannu Krosing <hannu@2ndquadrant.com>)
List pgsql-hackers
On 31.12.2010 13:48, Simon Riggs wrote:
> On Fri, 2010-12-31 at 12:06 +0200, Heikki Linnakangas wrote:
>
>> Regarding the rest of the proposal, I would still prefer the UI
>> discussed here:
>>
>> http://archives.postgresql.org/message-id/4CAE030A.2060701@enterprisedb.com
>>
>> It ought to be the same amount of work to implement, and provides the
>> same feature set, but makes administration a bit easier by being able to
>> name the standbys. Also, I dislike the idea of having the standby
>> specify that it's a synchronous standby that the master has to wait for.
>> Behavior on the master should be configured on the master.
>
> Good point; I've added the people on the copy list from that post. This
> question is the key, so please respond after careful thought on my
> points below.
>
> There are ways to blend together the two approaches, discussed later,
> though first we need to look at the reasons behind my proposals.
>
> I see significant real-world issues with configuring replication using
> multiple named servers, as described in the link above:

All of these points only apply to specifying *multiple* named servers in 
the synchronous_standbys='...' list. That's certainly a more complicated 
scenario, and the configuration is more complicated as a result. With 
your proposal, it's not possible in the first place.

Multiple synchronous standbys probably aren't needed by most people, so
I'm fine with leaving that out for now and keeping the design otherwise
the same. I included it in the proposal because it falls out of the
design easily. So, if you're worried about the complexities of multiple
synchronous standbys, let's keep the UI exactly the same as what I
described in the link above, but only allow one name in the
synchronous_standbys setting, instead of a list.

> 3. Administrative complexity just jumped a huge amount.
>
> (a) If you add or remove servers to the config you need to respecify all
> the parameters, which need to be specific to the exact set of servers.

Hmm, this could be alleviated by allowing the master to have a name too. 
All the configs could then be identical, except for the unique name for 
each server. For example, for a configuration with three servers that 
are all synchronous with each other, each server would have 
"synchronous_standbys='server1, server2, server3'" in the config file. 
The master would simply ignore the entry for itself.
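
To illustrate, here is a minimal sketch in Python of how a server could
share an identical synchronous_standbys value with its peers and skip
its own entry. The function name standbys_to_wait_for and the local_name
argument are hypothetical, made up for this example; this is not
PostgreSQL code, only the idea described above.

```python
def standbys_to_wait_for(synchronous_standbys: str, local_name: str) -> list[str]:
    """Parse a comma-separated standby list and drop this server's own name.

    Both parameter names are hypothetical; they only illustrate the idea
    of an identical config on every server.
    """
    names = [n.strip() for n in synchronous_standbys.split(",")]
    # Ignore empty entries and the entry naming the local server itself.
    return [n for n in names if n and n != local_name]


# The same setting is used verbatim on all three servers.
shared_setting = "server1, server2, server3"
print(standbys_to_wait_for(shared_setting, "server1"))
```

Whichever server is currently the master waits for the other two, so
the list would not need editing when roles change.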

> (b) After failover, the list of synchronous_standbys needs to be
> re-specified, yet what is the correct list of servers? The only way to
> make that config work is with complex middleware that automatically
> generates new config files.

It depends on what you want. I think you're envisioning that the 
original server is taken out of the system and not waited for, meaning 
that you accept a lower level of persistence after failover. Yes, then 
you need to change the config. Or more likely you prepare the config 
file in the standby that way to begin with.

> I don't think that is "the same amount of
> work to implement", its an order of magnitude harder overall.

I meant it's the same amount of work to implement the feature in 
PostgreSQL. No doubt that maintaining such a setup in production is more 
complicated.

> 5. Requesting sync from more than one server performs poorly, since you
> must wait for additional servers. If there are sporadic or systemic
> network performance issues you will be badly hit by them. Monitoring
> that just got harder also. First-response-wins is more robust in the
> case of volatile resources since it implies responsiveness to changing
> conditions.
>
> 6. You just lost the ability to control performance on the master, with
> a userset. Performance is a huge issue with sync rep. If you can't
> control it, you'll simply turn it off. Having a feature that we daren't
> ever use because it performs poorly helps nobody. This is not a tick-box
> in our marketing checklist, I want it to be genuinely real-world usable.

You could make synchronous_standbys a user-settable GUC, just like your
proposed boolean switch. You could then control on a per-transaction
basis which servers' responses you want to wait for. Although perhaps it
would be more user-friendly to just have an additional boolean GUC,
similar to synchronous_commit=on/off. Or maybe synchronous_commit is
enough to control that.

> I suppose we might regard the feature set I am proposing as being the
> same as making synchronous_standbys a USERSET parameter, and allowing
> just two options:
> "none" - allowing the user to specify async if they wish it
> "*" - allowing people to specify that syncing to *any* standby is
> acceptable
>
> We can blend the two approaches together, if we wish, by having two
> parameters (plus server naming)
>    synchronous_replication = on | off (USERSET)
>    synchronous_standbys = '...'
> If synchronous_standbys is not set and synchronous_replication = on then
> we sync to any standby. If  synchronous_replication = off then we use
> async replication, whatever synchronous_standbys is set to.
> If synchronous_standbys is set, then we use sync rep to all listed
> servers.

Sounds good.

I still don't like the synchronous_standbys='' and
synchronous_replication=on combination, though. IMHO that still amounts
to letting the standby control the behavior on the master, and it makes
it impossible to temporarily add an asynchronous standby to the mix. I
could live with it, since you wouldn't be forced to use it that way
after all, but I would still prefer to throw an error on that
combination. Or at least document the pitfalls and recommend always
naming the standbys.
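
The blended rules quoted above, plus the stricter variant I'd prefer,
can be sketched roughly like this in Python. The function
resolve_sync_targets and its strict flag are hypothetical names invented
for the example; it only models the decision table under discussion, not
any actual implementation.

```python
def resolve_sync_targets(synchronous_replication: bool,
                         synchronous_standbys: str,
                         strict: bool = False):
    """Return (mode, names) for one transaction.

    mode is "async", "any" (first standby to respond wins) or "named".
    strict=True models throwing an error when synchronous_replication
    is on but no standby has been named. All names here are hypothetical.
    """
    names = [n.strip() for n in synchronous_standbys.split(",") if n.strip()]
    if not synchronous_replication:
        # off always means async, whatever synchronous_standbys is set to
        return ("async", [])
    if names:
        # wait for all listed servers
        return ("named", names)
    if strict:
        raise ValueError(
            "synchronous_replication = on requires synchronous_standbys to be set")
    # unset list: any connected standby counts as synchronous
    return ("any", [])


print(resolve_sync_targets(True, "server2"))
print(resolve_sync_targets(False, "server2"))
print(resolve_sync_targets(True, ""))
```

In the "any" case, every standby that connects becomes synchronous,
which is why a temporary asynchronous standby cannot be added.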

> My proposal amounts to "lets add synchronous_standbys as a parameter in
> 9.2". If you really think that we need that functionality in this
> release, lets get the basic stuff added now and then fold in those ideas
> on top afterwards. If we do that, I will help. However, my only
> insistence is that we explain the above points very clearly in the docs
> to specifically dissuade people from using those features for typical
> cases.

Huh, wait, if you leave out synchronous_standbys, that's a completely
different UI again. I think we've finally reached agreement on how this
should be configured, so let's stick to that, please.

(I would be fine with limiting synchronous_standbys to just one server 
in this release though.)

> If you wondered why I ignored your post previously, its because I
> understood that Fujii's post of 15 Oct, one week later, effectively
> accepted my approach, albeit with two additional parameters. That is the
> UI that I had been following.
> http://archives.postgresql.org/pgsql-hackers/2010-10/msg01009.php

That thread makes no mention of how to specify which standbys are 
synchronous and which are not. It's about specifying the timeout and 
whether to wait for a disconnected standby. Yeah, Fujii-san's proposal 
seems reasonable for configuring that.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

