Discussion: Re: [BUGS] Out of memory error causes Abort, Abort tries to allocate memory
I wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Jeff Davis wrote:
>>> * smgrGetPendingDeletes() calls palloc()
>>> * palloc() fails, resulting in ERROR, causing infinite recursion
>> Hmm, maybe we could have AbortTransaction switch to ErrorContext, which
>> has some preallocated space, before calling RecordTransactionAbort (or
>> maybe have RecordTransactionAbort itself do it).
> Seems like it'd be smarter to try to free some memory before we push
> forward with transaction abort. ErrorContext has only a limited amount
> of space ...

I've been thinking more about this problem. There are two basic
strategies we could follow to ensure that AbortTransaction has some
room to work in:

    A: Try to free space before we start the actual abort.
    B: Keep some reserved space for AbortTransaction to use.

(It seems untenable to try not to ever alloc any memory at all during
AbortTransaction.) I'm not sure that either of these can be a 100%
bulletproof solution. As long as there is state data we daren't throw
away until after AbortTransaction, plan A doesn't help if we've filled
memory with that type of data. And plan B doesn't help if
AbortTransaction needs more memory than we reserved; which seems
possible for any acceptable level of reserved space (eg consider cases
with many thousands of subtransactions or deleted files ... we have to
build a very large XLOG Abort record then).

I think our best answer is probably to do some of each. For (A) it
seems that we should try to flush the pending-trigger-event list and
any executor state data that may be hanging around; those are the things
that seem both easy to delete and likely to be pretty large. For (B)
there's basically a choice of whether to try to re-use ErrorContext,
or create a separate context used only for the purposes of running
AbortTransaction. The separate context would avoid any possibility of
entanglement between what are really different subsystems, but OTOH it
seems a bit wasteful.

Comments?

regards, tom lane
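
[Editor's illustration] The interplay of plans A and B described above can be sketched with a toy model. This is not PostgreSQL code: `Heap`, `Context`, `abort_transaction`, and all byte counts are hypothetical names and numbers chosen for illustration, and Python is used only to keep the sketch short. It assumes a single fixed byte budget shared by all contexts, with the abort context claiming its reserve when it is created.

```python
class OutOfMemory(Exception):
    """Raised when the toy heap cannot satisfy a request."""


class Heap:
    """Process-wide byte budget shared by all contexts (toy model)."""

    def __init__(self, capacity):
        self.free = capacity

    def take(self, nbytes):
        if nbytes > self.free:
            raise OutOfMemory(f"need {nbytes} bytes, only {self.free} free")
        self.free -= nbytes

    def give(self, nbytes):
        self.free += nbytes


class Context:
    """A memory context that may pre-claim `reserved` bytes when created."""

    def __init__(self, heap, reserved=0):
        self.heap = heap
        self.reserved = reserved
        self.used = 0
        heap.take(reserved)  # plan B: claim the reserve up front

    def alloc(self, nbytes):
        spare = max(self.reserved - self.used, 0)
        extra = max(nbytes - spare, 0)  # part that must come from the heap
        self.heap.take(extra)           # may raise OutOfMemory
        self.used += nbytes

    def reset(self):
        # Return everything beyond the reserve to the heap.
        self.heap.give(max(self.used - self.reserved, 0))
        self.used = 0


def abort_transaction(abort_ctx, discardable, record_size):
    # Plan A: release state that is safe to drop before the abort proper.
    for ctx in discardable:
        ctx.reset()
    # Plan B: build the abort record, drawing on the reserve first.
    abort_ctx.alloc(record_size)
    abort_ctx.reset()


heap = Heap(100)
abort_ctx = Context(heap, reserved=16)
executor, triggers, must_keep = Context(heap), Context(heap), Context(heap)
executor.alloc(40)
triggers.alloc(30)
must_keep.alloc(14)  # the heap is now completely full
abort_transaction(abort_ctx, [executor, triggers], record_size=50)
print(heap.free)     # 70
```

In this model the 50-byte record exceeds the 16-byte reserve, so the abort succeeds only because plan A freed the executor and trigger state first. If `must_keep` had filled the whole heap instead, the same call would raise `OutOfMemory`, which matches the point that neither plan is bulletproof on its own.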
On Wed, 2006-11-22 at 13:06 -0500, Tom Lane wrote:
> > Seems like it'd be smarter to try to free some memory before we push
> > forward with transaction abort. ErrorContext has only a limited amount
> > of space ...
>
> I've been thinking more about this problem. There are two basic
> strategies we could follow to ensure that AbortTransaction has some
> room to work in:
>     A: Try to free space before we start the actual abort.
>     B: Keep some reserved space for AbortTransaction to use.
> (It seems untenable to try not to ever alloc any memory at all during
> AbortTransaction.) I'm not sure that either of these can be a 100%
> bulletproof solution. As long as there is state data we daren't throw

When I was trying to devise a "bulletproof" solution, it seemed the only
way would be to reserve space, but to increase the reserved space
whenever the state changed such that an AbortTransaction would need
extra memory. This didn't seem worth the accounting effort. I made an
attempt, but gave up (I spent all my time in gdb keeping track of which
memory context I was in).

If there are a limited number of areas that increase potential
AbortTransaction memory usage, it's a possibility I suppose. Otherwise,
I don't know how we'd expect code authors to know whether their code
increases memory requirements for AbortTransaction.

> away until after AbortTransaction, plan A doesn't help if we've filled
> memory with that type of data. And plan B doesn't help if
> AbortTransaction needs more memory than we reserved; which seems
> possible for any acceptable level of reserved space (eg consider cases
> with many thousands of subtransactions or deleted files ... we have to
> build a very large XLOG Abort record then).
>
> I think our best answer is probably to do some of each. For (A) it
> seems that we should try to flush the pending-trigger-event list and
> any executor state data that may be hanging around; those are the things

That seems like a relatively easy way to eliminate most of the problem.
It might not be 100% bulletproof, but it will drastically reduce the
chances of causing problems in "normal" situations. I'd certainly be
happy with this fix.

> that seem both easy to delete and likely to be pretty large. For (B)
> there's basically a choice of whether to try to re-use ErrorContext,
> or create a separate context used only for the purposes of running
> AbortTransaction. The separate context would avoid any possibility of
> entanglement between what are really different subsystems, but OTOH it
> seems a bit wasteful.
>

Wasteful how? Do you mean that it would clutter the code, or that it
would cause unnecessary overhead?

Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
> When I was trying to devise a "bulletproof" solution, it seemed the only
> way would be to reserve space, but to increase the reserved space
> whenever the state changed such that an AbortTransaction would need
> extra memory. This didn't seem worth the accounting effort.

Yeah. The problem I saw with it is that it's only "bulletproof" to the
extent that we get that accounting exactly right, and keep it so over
time. I think that assumption is sufficiently fragile that the extra
safety would be illusory. There's a lot of stuff that happens during
AbortTransaction :-(

>> there's basically a choice of whether to try to re-use ErrorContext,
>> or create a separate context used only for the purposes of running
>> AbortTransaction. The separate context would avoid any possibility of
>> entanglement between what are really different subsystems, but OTOH it
>> seems a bit wasteful.

> Wasteful how? Do you mean that it would clutter the code, or that it
> would cause unnecessary overhead?

Well, it'd be an extra however-many-KB of memory for each backend that
would mostly go unused. The rest of a backend's working memory pretty
much pulls its weight, but a TransactionAbortContext wouldn't. OTOH,
maybe these days a few dozen KB isn't worth worrying about.

regards, tom lane
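
[Editor's illustration] The fragility of the grow-the-reserve accounting idea can be shown with a small toy model. Everything here is hypothetical: the class `AbortAccounting`, its methods, and the per-item byte sizes are assumptions made for illustration and do not come from the thread or from PostgreSQL. The sketch assumes every state change that enlarges the eventual abort record must also enlarge the reserve by the same amount.

```python
# Hypothetical per-item sizes in the abort record; the thread gives no real figures.
SUBXACT_BYTES = 4
DELETED_FILE_BYTES = 12


class AbortAccounting:
    """Tracks bytes reserved for abort against bytes abort would really need."""

    def __init__(self, base_bytes):
        self.reserved = base_bytes
        self.required = base_bytes

    def start_subtransaction(self):
        self.required += SUBXACT_BYTES
        self.reserved += SUBXACT_BYTES  # this call site accounts correctly

    def schedule_file_delete(self, accounted=True):
        self.required += DELETED_FILE_BYTES
        if accounted:
            self.reserved += DELETED_FILE_BYTES
        # accounted=False models a code path whose author forgot the bookkeeping

    def abort_is_safe(self):
        return self.reserved >= self.required


acct = AbortAccounting(base_bytes=64)
for _ in range(1000):
    acct.start_subtransaction()
acct.schedule_file_delete()
print(acct.abort_is_safe())  # True: every change was accounted for

acct.schedule_file_delete(accounted=False)
print(acct.abort_is_safe())  # False: one missed call site breaks the guarantee
```

A single unaccounted state change is enough to leave the reserve short, which is why the guarantee holds only as long as every code path keeps the bookkeeping exactly right.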
I wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> Wasteful how? Do you mean that it would clutter the code, or that it
>> would cause unnecessary overhead?

> Well, it'd be an extra however-many-KB of memory for each backend that
> would mostly go unused. The rest of a backend's working memory pretty
> much pulls its weight, but a TransactionAbortContext wouldn't. OTOH,
> maybe these days a few dozen KB isn't worth worrying about.

After further thought I concluded that overloading ErrorContext for this
purpose is way too risky, so I've committed changes that create a
separate context for AbortTransaction to use:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00199.php

I think the patch would apply cleanly to 8.1 but have not got time to
check it now. Less sure about older branches. Are we excited about
trying to back-patch this? I think it'd really need rather more testing
before I'd want to stick it into the stable branches ...

regards, tom lane
On Wed, 2006-11-22 at 21:41 -0500, Tom Lane wrote:
> I wrote:
> > Jeff Davis <pgsql@j-davis.com> writes:
> >> Wasteful how? Do you mean that it would clutter the code, or that it
> >> would cause unnecessary overhead?
> >
> > Well, it'd be an extra however-many-KB of memory for each backend that
> > would mostly go unused. The rest of a backend's working memory pretty
> > much pulls its weight, but a TransactionAbortContext wouldn't. OTOH,
> > maybe these days a few dozen KB isn't worth worrying about.
>
> After further thought I concluded that overloading ErrorContext for this
> purpose is way too risky, so I've committed changes that create a
> separate context for AbortTransaction to use:
> http://archives.postgresql.org/pgsql-committers/2006-11/msg00199.php
>
> I think the patch would apply cleanly to 8.1 but have not got time to
> check it now. Less sure about older branches. Are we excited about
> trying to back-patch this? I think it'd really need rather more testing
> before I'd want to stick it into the stable branches ...
>

Everything in the patch makes sense to me, and it certainly solved my
test case.

I'm just fine with the fix only in 8.2. I can work around it until I can
upgrade.

Thanks!
Jeff Davis