Re: Additional options for Sync Replication
From | Fujii Masao |
---|---|
Subject | Re: Additional options for Sync Replication |
Date | |
Msg-id | AANLkTi=2XJsa3thNK4YngmY3+noTj0-O_Gn1n+RvbxmJ@mail.gmail.com |
In response to | Re: Additional options for Sync Replication (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Additional options for Sync Replication |
List | pgsql-hackers |
On Tue, Mar 29, 2011 at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 29, 2011 at 3:49 AM, Dimitri Fontaine
> <dimitri@2ndquadrant.fr> wrote:
>>> It would be better to just support it (recv|fsync|apply),
>>> or no syncrep at all. Syncrep is incomplete without it.
>>
>> Agreed.
>
> I have trouble viewing the idea that it would be better not to ship
> sync rep at all than to add more features to it as a serious proposal.
> Presumably, anyone who is sad that sync rep doesn't have all of the
> options they might want would be even sadder to hear that we went
> through a whole development cycle and ended up with nothing at all.
> Even if we did agree to take this patch, there will certainly be more
> features that someone might want and not have, such as the ability to
> sync with multiple standbys at once.
>
>> More than that, I think we should evaluate this patch on a cost/benefit
>> ratio, rather than trying to apply to it all those procedural fences
>> that we don't have, and that we don't have the size to benefit from.
>
> As a community, we've adopted a development plan that proceeds in
> cycles. For the last several releases, we have had four CommitFests
> in each release cycle, followed by a feature freeze and eventually by
> beta and final release. It's certainly a valid question to ask how
> well that procedure has served us. It does not seem likely to me that
> we can continue to produce quality releases if we don't at some point
> cut off the flow of new features into the source tree and work on
> stabilizing the code we've already got, and I believe the point for
> that was agreed by a large number of developers who sat in a room at
> PGCon last year to be on or about February 15th. We ended up
> extending that by a couple of weeks, to make sure that we had a
> process that was FAIR: we didn't want patches that had been in the
> pipeline for a very long time to get postponed to 9.2 because no
> committer had had a chance to work on them yet. However, we also
> bumped MANY patches to 9.2 because they weren't in sufficiently good
> shape soon enough. If we accept this patch now because a bunch of
> people say they really, really want it, isn't that unfair to the
> people to whom we've already said "sorry, the deadline has passed"?
>
> Of course, there is always going to be some gray area. I argued for
> committing the replication_timeout patch because I believe the fact
> that we haven't got that feature is almost a bug - it interferes
> significantly with the usability of replication in general, and it
> will be an even more serious problem with sync rep, where a hung
> standby connection will not only mean that nothing is replicating but
> also that no write transactions can be processed at all. However, you
> could make the opposite argument - that it's really a new feature -
> and therefore we ought not to commit it. So far no one has taken that
> position, but it's certainly a reasonable argument. Likewise, there
> is ongoing discussion on the collations thread about which of those
> changes are necessary for this release, and which ones are things that
> ought to be postponed to a future release. I haven't gotten too
> involved in those discussions because I don't really understand the
> underlying issues, but I think that's an important discussion.

I'm very excited about the new options, especially recv. But I agree with Robert and Heikki, because what the patch provides looks like a new feature rather than a bug fix. And I think the design still needs some discussion: how far must transactions wait for sync rep in recv mode? In the patch, they wait for WAL to be written on the standby, but I think they should instead wait only until walreceiver has received the WAL. That would improve the performance of sync rep. Anyway, I don't think now is the time to discuss design questions like this, as opposed to bug fixes. I like those additional options, but I believe that the sync rep we worked out is still useful without them.

Replication timeout looks like a bug fix rather than a new feature. Without it, a walsender might unexpectedly linger for a while when the standby crashes or a network outage happens. As Robert said, sync rep can get stuck for a long while because of such a lingering walsender. What's worse, when hot_standby_feedback is enabled, such a lingering walsender would prevent the oldest xmin from advancing and interfere with vacuuming on the master.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
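
The recv-mode question raised above (wait until WAL is written on the standby, or only until walreceiver has received it) can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: `StandbyProgress`, `commit_may_return`, and the level names are hypothetical, and WAL positions are represented as plain integers.

```python
from dataclasses import dataclass


# Hypothetical model, not PostgreSQL code: each field is an integer
# standing in for a WAL position reported by a single standby.
@dataclass
class StandbyProgress:
    received: int  # WAL that walreceiver has received
    written: int   # WAL written out on the standby
    flushed: int   # WAL fsynced on the standby
    applied: int   # WAL replayed on the standby


def reached(progress: StandbyProgress, level: str) -> int:
    """Return the WAL position the standby has reached for a wait level."""
    positions = {
        "received": progress.received,
        "written": progress.written,
        "fsync": progress.flushed,
        "apply": progress.applied,
    }
    return positions[level]


def commit_may_return(progress: StandbyProgress, level: str, commit_lsn: int) -> bool:
    """A waiting transaction is released once the standby has reached
    commit_lsn at the chosen wait level."""
    return reached(progress, level) >= commit_lsn


# The standby has received WAL up to 120 but written only up to 100.
p = StandbyProgress(received=120, written=100, flushed=90, applied=80)
print(commit_may_return(p, "received", 110))  # True
print(commit_may_return(p, "written", 110))   # False
```

In this model a commit at position 110 is released immediately when waiting on "received" but keeps waiting on "written", which is the latency difference behind the suggestion that recv mode should wait only for receipt.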