Re: Sync Rep: First Thoughts on Code
From | Simon Riggs |
---|---|
Subject | Re: Sync Rep: First Thoughts on Code |
Date | |
Msg-id | 1228245678.20796.410.camel@hp_dx2400_1 |
In response to | Sync Rep: First Thoughts on Code (Simon Riggs <simon@2ndQuadrant.com>) |
List | pgsql-hackers |
On Tue, 2008-12-02 at 11:08 -0800, Jeff Davis wrote:
> On Tue, 2008-12-02 at 13:09 +0000, Simon Riggs wrote:
> > > Is it dangerous to abort the transaction with replication continued when
> > > the timeout occurs? I think that the WAL consistency between two servers
> > > might be broken. Because the WAL writing and sending are done concurrently,
> > > and the backend might already write the WAL to disk on the primary when
> > > waiting for walsender.
> >
> > The issue I see is that we might want to keep wal_sender_delay small so
> > that transaction times are not increased. But we also want
> > wal_sender_delay high so that replication never breaks. It seems better
> > to have the action on wal_sender_delay configurable if we have an
> > unsteady network (like the internet). Marcus made some comments on line
> > dropping that seem relevant here; we should listen to his experience.
> >
> > Hmmm, dangerous? Well assuming we're linking commits with replication
> > sends then it sounds it. We might end up committing to disk and then
> > deciding to abort instead. But remember we don't remove the xid from
> > procarray or mark the result in clog until the flush is over, so it is
> > possible. But I think we should discuss this in more detail when the
> > main patch is committed.
> >
>
> What is the "it" in "it is possible"? It seems like there's still a
> problem window in there.

Marking a transaction aborted after we have written a commit record, but
before we have removed it from proc array and marked in clog. We'd need
a special kind of WAL record to do that.

> Even if that could be made safe, in the event of a real network failure,
> you'd just wait the full timeout every transaction, because it still
> thinks it's replicating.

True, but I did suggest having two timeouts. There is considerable
reason to reduce the timeout as well as reason to increase it - at the
same time.

Anyway, let's wait for some user experience following commit.
--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
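
Editorial sketch, not part of the original message: the exchange about waiting "the full timeout every transaction" versus "having two timeouts" can be illustrated with a toy model. This is only one possible reading of the two-timeout idea, not the behaviour of the patch under discussion. Every name here (`ReplicationLink`, `commit_wait_timeout`, `link_dead_timeout`, `simulate_dead_standby`) is hypothetical and does not correspond to any PostgreSQL setting or function. The model deliberately does not decide what a timed-out transaction should do (abort or commit locally), since that is the open question in the thread.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ReplicationLink:
    """Toy model of the primary's view of a standby (all names hypothetical)."""

    commit_wait_timeout: float  # short: how long one commit waits for an ack
    link_dead_timeout: float    # long: silence after which the standby is declared gone
    last_ack_time: float = 0.0
    connected: bool = True

    def wait_for_ack(self, now: float, ack_delay: Optional[float]) -> Tuple[str, float]:
        """Return (outcome, seconds waited) for one commit that starts at `now`.

        `ack_delay` is None when the standby never answers.
        """
        if not self.connected:
            # The link was already declared dead, so nobody waits any more.
            return ("local_only", 0.0)
        if ack_delay is not None and ack_delay <= self.commit_wait_timeout:
            self.last_ack_time = now + ack_delay
            return ("replicated", ack_delay)
        # No ack in time: this commit pays the short timeout in full.
        waited = self.commit_wait_timeout
        if (now + waited) - self.last_ack_time >= self.link_dead_timeout:
            # The standby has been silent for too long: stop treating it as live.
            self.connected = False
        return ("timed_out", waited)


def simulate_dead_standby(link: ReplicationLink, n_commits: int) -> Tuple[List[str], float]:
    """Run back-to-back commits against a standby that never acknowledges."""
    now = 0.0
    outcomes = []
    for _ in range(n_commits):
        outcome, waited = link.wait_for_ack(now, ack_delay=None)
        outcomes.append(outcome)
        now += waited
    return outcomes, now
```

With `link_dead_timeout` set to infinity the model reproduces the objection raised in the quoted text: every commit waits the full short timeout indefinitely. With a finite second timeout, only the commits issued during the detection window pay that cost.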