Обсуждение: infinite loop in _bt_getstackbuf
A colleague at EnterpriseDB today ran into a situation on PostgreSQL 9.3.5 where the server went into an infinite loop while attempting a VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't be killed with ^C. I think we should add a check for interrupts into that loop somewhere; and possibly make some attempt to notice if we've been iterating for longer than, say, the lifetime of the universe until now. The fundamental structure of that function is an infinite loop. We break out of that loop when BTEntrySame(item, &stack->bts_btentry) or P_RIGHTMOST(opaque) and I'm sure that it's correct to think that, in theory, one of those things will eventually happen. But the index could be corrupted, most obviously by having a page where opaque->btpo_next points pack to the current block number. If that happens, you need an immediate shutdown (or some clever gdb hackery) to terminate the VACUUM. That's unfortunate and unnecessary. It also looks likes something we can fix, at a minimum by adding a CHECK_FOR_INTERRUPTS() at the top of that loop, or in some function that it calls, like _bt_getbuf(), so that if it goes into an infinite loop, it can at least be killed. We could also onsider adding a check at the bottom of the loop, just before setting blkno = opaque->btpo_next, that those values are unequal. If they are, elog(). Clearly it's possible to have a cycle of length >1, and such a check wouldn't catch that, but it might still be worth checking for the trivial case. Or, we could try to put an upper bound on the number of iterations that are reasonable and error out if we exceed that value. That might be tricky, though; it's not obvious to me that there's any comfortably small upper bound. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > A colleague at EnterpriseDB today ran into a situation on PostgreSQL > 9.3.5 where the server went into an infinite loop while attempting a > VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't > be killed with ^C. I think we should add a check for interrupts into > that loop somewhere; Our design principle in this area is that all loops should have CHECK_FOR_INTERRUPTS() calls somewhere, so that even if data is horribly corrupted you can get out of it. (Trivial loops where the exit condition cannot possibly fail don't need to apply --- surely we don't need to cope with hardware that makes i+1 go back to i-1 or whatever.) Therefore I don't think you need to argue very hard in order to add more interrupt checks if you see a loop that's missing them. For example, Andres was saying to me just this morning that GetMultiXactIdMembers is lacking one such check, and I was considering pushing a patch to add it without any discussion. > and possibly make some attempt to notice if we've > been iterating for longer than, say, the lifetime of the universe > until now. This I'm not so sure about. Adding extra logic in all nontrivial loops to detect whether they have been running for "too long" is likely to cause too much overhead. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Robert Haas wrote: >> A colleague at EnterpriseDB today ran into a situation on PostgreSQL >> 9.3.5 where the server went into an infinite loop while attempting a >> VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't >> be killed with ^C. I think we should add a check for interrupts into >> that loop somewhere; > Our design principle in this area is that all loops should have > CHECK_FOR_INTERRUPTS() calls somewhere, so that even if data is horribly > corrupted you can get out of it. FWIW, I concur with Alvaro that adding a CHECK_FOR_INTERRUPTS() needn't require much discussion. Given the lack of prior complaints about this loop, I'm not sure I see the need to work harder than that; corruption of this sort must be quite rare. regards, tom lane
On Thu, Oct 30, 2014 at 03:52:01PM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Robert Haas wrote: > >> A colleague at EnterpriseDB today ran into a situation on PostgreSQL > >> 9.3.5 where the server went into an infinite loop while attempting a > >> VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't > >> be killed with ^C. I think we should add a check for interrupts into > >> that loop somewhere; > > > Our design principle in this area is that all loops should have > > CHECK_FOR_INTERRUPTS() calls somewhere, so that even if data is horribly > > corrupted you can get out of it. > > FWIW, I concur with Alvaro that adding a CHECK_FOR_INTERRUPTS() needn't > require much discussion. +1 > Given the lack of prior complaints about this > loop, I'm not sure I see the need to work harder than that; corruption > of this sort must be quite rare. Looks like _bt_getstackbuf() is always called with some buffer lock held, so CHECK_FOR_INTERRUPTS() alone would not help: http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us
Noah Misch <noah@leadboat.com> writes: > Looks like _bt_getstackbuf() is always called with some buffer lock held, so > CHECK_FOR_INTERRUPTS() alone would not help: > http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us Oooh, good point. I never followed up on that idea, but we would have to in order to fix the case Robert is on about. regards, tom lane
On Thu, Oct 30, 2014 at 11:45 PM, Noah Misch <noah@leadboat.com> wrote: >> Given the lack of prior complaints about this >> loop, I'm not sure I see the need to work harder than that; corruption >> of this sort must be quite rare. > > Looks like _bt_getstackbuf() is always called with some buffer lock held, so > CHECK_FOR_INTERRUPTS() alone would not help: > > http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us That's the insert path. What about the vacuum path? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 31, 2014 at 10:29:53AM -0400, Robert Haas wrote: > On Thu, Oct 30, 2014 at 11:45 PM, Noah Misch <noah@leadboat.com> wrote: > >> Given the lack of prior complaints about this > >> loop, I'm not sure I see the need to work harder than that; corruption > >> of this sort must be quite rare. > > > > Looks like _bt_getstackbuf() is always called with some buffer lock held, so > > CHECK_FOR_INTERRUPTS() alone would not help: > > > > http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us > > That's the insert path. What about the vacuum path? I am not aware of an occasion where the vacuum path will call _bt_getstackbuf() without already holding some buffer lock.
On Thu, Oct 30, 2014 at 10:46 AM, Robert Haas <robertmhaas@gmail.com> wrote: > (9.3.5 problem report) I think I saw a similar issue, by a 9.3.5 instance that was affected by the "in pg_upgrade, remove pg_multixact files left behind by initdb" issue (I ran the remediation recommended in the 9.3.5 release notes). Multiple anti-wraparound vacuums were stuck following a PITR. I resolved this (as far as I can tell) by killing the autovacuum workers, and manually running VACUUM FREEZE. I have yet to do any root cause analysis, but I think I could reproduce the problem. > The fundamental structure of that function is an infinite loop. We > break out of that loop when BTEntrySame(item, &stack->bts_btentry) or > P_RIGHTMOST(opaque) and I'm sure that it's correct to think that, in > theory, one of those things will eventually happen. Not in theory - only in practice. L&Y specifically state: "We wish to point out here that our algorithms do not prevent the possibility of livelock (where one process rrms indefinitely). This can happen if a process never terminates because it keeps having to follow link pointers created by other processes. This might happen in the case of a process being run on a (relatively) very slow processor in a multiprocessor system". > But the index > could be corrupted, most obviously by having a page where > opaque->btpo_next points pack to the current block number. If that > happens, you need an immediate shutdown (or some clever gdb hackery) > to terminate the VACUUM. That's unfortunate and unnecessary. Merlin reported a bug that looked exactly like this. Hardware failure may now explain the problem. > It also looks likes something we can fix, at a minimum by adding a > CHECK_FOR_INTERRUPTS() at the top of that loop, or in some function > that it calls, like _bt_getbuf(), so that if it goes into an infinite > loop, it can at least be killed. I think that it might be a good idea to have circular _bt_moveright() moves (the direct offender in Merlin's case, which has very similar logic to your _bt_getstackbuf() problem case) detected. I'm pretty sure that it's exceptional for there to be more than 2 or 3 retries in _bt_moveright(). It would probably be fine to consider the possibility that we'll never finish once we get past 5 retries or something like that. We'd then start keeping track of blocks visited, and raise an error when a page was visited a second time. -- Peter Geoghegan
On Thu, Jan 15, 2015 at 5:46 PM, Peter Geoghegan <pg@heroku.com> wrote: > I think that it might be a good idea to have circular _bt_moveright() > moves (the direct offender in Merlin's case, which has very similar > logic to your _bt_getstackbuf() problem case) detected. I'm pretty > sure that it's exceptional for there to be more than 2 or 3 retries in > _bt_moveright(). It would probably be fine to consider the possibility > that we'll never finish once we get past 5 retries or something like > that. We'd then start keeping track of blocks visited, and raise an > error when a page was visited a second time. Yeah, I could go for that. Possibly somebody might object that it's a lot of code that will never get tested in normal operation, but as this problem doesn't seem to be strictly theoretical I'm not sure I subscribe to that objection. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company