Обсуждение: Re: [PATCHES] Reviewers Guide to Deferred Transactions/Transaction Guarantee
"Simon Riggs" <simon@2ndquadrant.com> writes: > transaction_guarantee.v11.patch I can't help feeling that this is enormously overcomplicated. The "DFC" in particular seems to not be worth its overhead. Why wouldn't we simply track the newest commit record at all times, and then whenever the wal writer wakes up, it would write/fsync that far (or write/fsync all completed WAL pages, if there's no new commit record to worry about)? I see the concern about not letting clog pages go to disk before the corresponding WAL data is flushed, but that could be handled much more simply: just force a flush through the newest commit record before any write of a clog page. Those writes are infrequent enough (every 32K transactions or one checkpoint) that this seems not a serious problem. The other interesting issue is not letting hint-bit updates get to disk in advance of the WAL flush, but I don't see a need to track those at a per-transaction level: just advance page LSN to latest commit record any time a hint bit is updated. The commit will likely be flushed before we'd be interested in writing the buffer out anyway. Moreover, the way you are doing it creates a conflict in that the DFC has to guarantee to remember every unflushed transaction, whereas it really needs to be just an approximate cache for its performance to be good. AFAIK there is no need to associate any forced flush with multixacts; there is no state saved across crashes for those anyway. I don't see a point in allowing the WAL writer to be disabled --- I believe it will be a performance win just like the bgwriter, independently of whether transaction_guarantee is used or not, by helping to keep down the number of dirty WAL buffers. That in turn allows some other simplifications, like not needing an assign hook for transaction_guarantee. I disagree with your desire to remove the fsync parameter. It may have less use than before with this feature, but that doesn't mean it has none. > 3. Should the WALWriter also do the wal_buffers half-full write at the > start of XLogInsert() ? That should go away entirely; to me the main point of the separate wal-writer process is to take over responsibility for not letting too many dirty wal buffers accumulate. regards, tom lane
Re: [PATCHES] Reviewers Guide to Deferred Transactions/Transaction Guarantee
От
"Zeugswetter Andreas ADI SD"
Дата:
I agree with Tom's reasoning about the suggested simplifications, sorry. > > 3. Should the WALWriter also do the wal_buffers half-full write at the > > start of XLogInsert() ? > > That should go away entirely; to me the main point of the > separate wal-writer process is to take over responsibility > for not letting too many dirty wal buffers accumulate. That also sounds a lot simpler, but I think Bruce wanted to be able to give some time guarantee to the not waiting for fsync txns. When a commit only half-filled the page and no more WAL comes in for a long time, there is only WALWriter to do the IO. The WALWriter would need to only flush a half-full page after timeout iff it contains a commit record. One more question on autocommit: Do we wait for a flush for an autocommitted DML ? Seems we generally should not. Andreas
On Thu, 2007-04-12 at 15:56 -0400, Tom Lane wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: > > transaction_guarantee.v11.patch Thanks for the review. > I can't help feeling that this is enormously overcomplicated. I agree with all but one of your comments, see below. > The "DFC" in particular seems to not be worth its overhead. Why wouldn't > we simply track the newest commit record at all times, and then whenever > the wal writer wakes up, it would write/fsync that far (or write/fsync > all completed WAL pages, if there's no new commit record to worry > about)? > The other interesting issue is not letting hint-bit updates get to disk > in advance of the WAL flush, but I don't see a need to track those at > a per-transaction level: just advance page LSN to latest commit record > any time a hint bit is updated. The commit will likely be flushed > before we'd be interested in writing the buffer out anyway. Moreover, > the way you are doing it creates a conflict in that the DFC has to > guarantee to remember every unflushed transaction, whereas it really > needs to be just an approximate cache for its performance to be good. I've spent a few hours thinking on this and I'm happy with it now. The lure of removing that much code is too strong to resist; its certainly easier to remove code after freeze than it is to add it. Advancing the LSN too far was a worry of mine, but we have the code now to cope if that shows to be a problem in testing. So lets strip that out. > I see the concern about not letting clog pages go to disk before the > corresponding WAL data is flushed, but that could be handled much more > simply: just force a flush through the newest commit record before any > write of a clog page. Those writes are infrequent enough (every 32K > transactions or one checkpoint) that this seems not a serious problem. This bit I'm not that happy with. You're right its fairly infrequent, but the clog pages are typically written when we extend the clog. That happens while holding XidGenLock and ProcArrayLock, so holding those across an additional (and real) I/O is going to make that blockage worse. We've been to great pains in other places to remove logjams and we know that the follow-on effects of logjams are not swift to clear when the system is running at full load on multiple CPU systems. The code to implement this is pretty clean: a few extra lines in clog/slru and bubbled-up API changes. I was actually thinking of adding something to the bgwriter to clean the LRU block of the clog, if it was dirty, once per cycle, to further reduce the possibility of I/O at that point. > AFAIK there is no need to associate any forced flush with multixacts; > there is no state saved across crashes for those anyway. Agreed. > I don't see a point in allowing the WAL writer to be disabled --- > I believe it will be a performance win just like the bgwriter, > independently of whether transaction_guarantee is used or not, > by helping to keep down the number of dirty WAL buffers. That in > turn allows some other simplifications, like not needing an assign hook > for transaction_guarantee. That would be pleasant. The other changes make hint bit setting need a LWlock request, so I wanted to include a way of saying "I never ever want to use transaction_guarantee = off". I see the beauty of your suggestion and agree. So keep the parameter, but let it default to 100ms? Range 10-1000ms? > I disagree with your desire to remove the fsync parameter. It may have > less use than before with this feature, but that doesn't mean it has > none. OK > > 3. Should the WALWriter also do the wal_buffers half-full write at the > > start of XLogInsert() ? > > That should go away entirely; to me the main point of the separate > wal-writer process is to take over responsibility for not letting too > many dirty wal buffers accumulate. Yes I'll make the agreed changes by next Wed/Thurs. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote: > > That should go away entirely; to me the main point of the separate > > wal-writer process is to take over responsibility for not letting too > > many dirty wal buffers accumulate. > > Yes > > > I'll make the agreed changes by next Wed/Thurs. I have seen no patch yet with the agreed changes. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Thu, 2007-04-26 at 21:14 -0400, Bruce Momjian wrote: > Simon Riggs wrote: > > > That should go away entirely; to me the main point of the separate > > > wal-writer process is to take over responsibility for not letting too > > > many dirty wal buffers accumulate. > > > > Yes > > > > > > I'll make the agreed changes by next Wed/Thurs. > > I have seen no patch yet with the agreed changes. True, will be at least a few more days yet. I've got a few questions I'll post later today. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote: > On Thu, 2007-04-26 at 21:14 -0400, Bruce Momjian wrote: > > Simon Riggs wrote: > > > > That should go away entirely; to me the main point of the separate > > > > wal-writer process is to take over responsibility for not letting too > > > > many dirty wal buffers accumulate. > > > > > > Yes > > > > > > > > > I'll make the agreed changes by next Wed/Thurs. > > > > I have seen no patch yet with the agreed changes. > > True, will be at least a few more days yet. > > I've got a few questions I'll post later today. Again, I have seen no follup patch for this. I would keep it for 8.4 but I am unsure what the issue even is, so I am deleting this message. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Simon Riggs wrote: > > > 3. Should the WALWriter also do the wal_buffers half-full write at the > > > start of XLogInsert() ? > > > > That should go away entirely; to me the main point of the separate > > wal-writer process is to take over responsibility for not letting too > > many dirty wal buffers accumulate. > > Yes > > > I'll make the agreed changes by next Wed/Thurs. Ah, here is the item Simon was talking about. Simon, when are we getting the updated patch? If not soon, the entire patch will be kept for 8.4. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Fri, 2007-04-13 at 16:09 +0100, Simon Riggs wrote: > I'll make the agreed changes by next Wed/Thurs. I am actively working on this now, after some delays because of other calls on my time. The suggested changes have needed more rework than I estimated, touching most lines of the patch, but I don't see any problems in changing it as agreed. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com