Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags
От | Bruce Momjian |
---|---|
Тема | Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags |
Дата | |
Msg-id | 200510311756.j9VHuj510045@candle.pha.pa.us обсуждение исходный текст |
Ответ на | Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",) (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)
|
Список | pgsql-patches |
Good analysis. I guess the question is what patch would we put into a subrelease? If you go for a new state code, rather than a separate boolean, does it reduce the size of the patch? --------------------------------------------------------------------------- Tom Lane wrote: > I wrote: > > OK, I think I see it. The problem is that the code in slru.c is careful > > about not modifying state when it doesn't hold the proper lock, but not > > so careful about not *inspecting* state without the proper lock. > > ... > > I'm still thinking about how to make a real fix without introducing > > another lock/unlock cycle --- we can do that if we have to, but I think > > maybe it's possible to fix it without. > > Attached is a proposed patch to fix up slru.c so that it's not playing > any games by either reading or writing shared state without holding > the ControlLock for the SLRU set. > > The main problem with the existing code, as I now see it, was a poor > choice of representation of page state: we had states CLEAN, DIRTY, and > WRITE_IN_PROGRESS, which meant that re-dirtying a page while a write was > in progress required setting the state back to DIRTY, which hid the fact > that indeed a write was in progress. So the I/O code was designed to > cope with not knowing whether another write was already in progress. We > can make it a whole lot cleaner by changing the state representation so > that we can tell the difference --- then, we can know before releasing > the ControlLock whether we need to write the page or just wait for > someone else to finish writing it. And that means we can do all the > state-manipulation before releasing the lock. > > I could have just added an additional state WRITE_IN_PROGRESS_REDIRTIED, > or some such, but it seemed notationally cleaner to create a separate > "page_dirty" boolean, and reduce the number of states to four (empty, > read-in-progress, valid, write-in-progress). This lets outside code > such as clog.c just set "page_dirty = true" rather than doing a complex > bit of logic to change the state value properly. > > The patch is a bit bulky because of the representation change, but the > key changes are localized in SimpleLruReadPage and SimpleLruWritePage. > > I think this code is a whole lot cleaner than it was before, but it's > a bit of a large change to be making so late in the 8.1 cycle (not to > mention that we really ought to back-patch similar changes all the way > back, because the bug exists as far back as 7.2). I am going to take > another look to see if there is a less invasive change that will fix > the problem at some performance cost; if so, we might want to do it that > way in 8.1 and back branches. > > Any comments on the patch, or thoughts on how to proceed? > > regards, tom lane > > Content-Description: slru-race-1.patch [ Type application/octet-stream treated as attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
В списке pgsql-patches по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)
Следующее
От: Tom LaneДата:
Сообщение: Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)