Re: max_standby_delay considered harmful
| From | Simon Riggs | 
|---|---|
| Subject | Re: max_standby_delay considered harmful | 
| Date | |
| Msg-id | 1273431179.3936.1077.camel@ebony | 
| In response to | Re: max_standby_delay considered harmful (Tom Lane <tgl@sss.pgh.pa.us>) | 
| Responses | Re: max_standby_delay considered harmful | 
| List | pgsql-hackers | 
On Sat, 2010-05-08 at 20:57 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On Sunday 09 May 2010 01:34:18 Bruce Momjian wrote:
> >> I think everyone agrees the current code is unusable, per Heikki's
> >> comment about a WAL file arriving after a period of no WAL activity, and
> >> look how long it took our group to even understand why that fails so
> >> badly.
>
> > To be honest its not *that* hard to simply make sure generating wal regularly
> > to combat that. While it surely aint a nice workaround its not much of a
> > problem either.
>
> Well, that's dumping a kluge onto users; but really that isn't the
> point. What we have here is a badly designed and badly implemented
> feature, and we need to not ship it like this so as to not
> institutionalize a bad design.

No, you have it backwards. HS was designed to work with SR. SR unfortunately did not deliver any form of monitoring, and in doing so the keepalive that it was known HS needed was left out, although it had been on the todo list for some time. Luckily Greg and I argued to have some monitoring added and my code was used to provide barest minimum monitoring for SR, yet not enough to help HS.

Of course, if one team doesn't deliver for whatever reason then others must take up the slack, if they can: no complaints. Since I personally didn't know this was going to be the case until after freeze, it is very late to resolve this situation sensibly and time has been against us. It's much harder for me to reach into the depths of another person's work and see how to add necessary mechanisms, especially when I'm working elsewhere. Even if I had done, it's likely that I would have been blocked with the "great idea, next release" response as already used on this thread.

Without doubt the current mechanism suffers from the issues you mention, though the current state is not the result of bad design, merely inaction and lack of integration.
We could resolve the current state in many ways, if we chose. Bruce has used the word crippleware for the current state. Raising a problem and then blocking solutions is the best way I know to cripple a release.

It should be clear that I've done my best to avoid this situation and have been active on both SR and HS. Had I not acted as I have done to date, SR would at this point slurp CPU like a bandit and be unmonitorable, both fatal flaws in production.

I point this out not to argue, but to set the record straight. IMHO your assignment of blame is misplaced and your comments about poor design do not reflect how we arrived at the current state.

--
Simon Riggs www.2ndQuadrant.com