Thread: Re: [BUGS] Out of memory error causes Abort, Abort tries to allocate memory

Re: [BUGS] Out of memory error causes Abort, Abort tries to allocate memory

From: Tom Lane
Date:
I wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Jeff Davis wrote:
> * smgrGetPendingDeletes() calls palloc()
> * palloc() fails, resulting in ERROR, causing infinite recursion

>> Hmm, maybe we could have AbortTransaction switch to ErrorContext, which
>> has some preallocated space, before calling RecordTransactionAbort (or
>> maybe have RecordTransactionAbort itself do it).

> Seems like it'd be smarter to try to free some memory before we push
> forward with transaction abort.  ErrorContext has only a limited amount
> of space ...

I've been thinking more about this problem.  There are two basic
strategies we could follow to ensure that AbortTransaction has some
room to work in:
A: Try to free space before we start the actual abort.
B: Keep some reserved space for AbortTransaction to use.
(It seems untenable to try not to ever alloc any memory at all during
AbortTransaction.)  I'm not sure that either of these can be a 100%
bulletproof solution.  As long as there is state data we daren't throw
away until after AbortTransaction, plan A doesn't help if we've filled
memory with that type of data.  And plan B doesn't help if
AbortTransaction needs more memory than we reserved; which seems
possible for any acceptable level of reserved space (eg consider cases
with many thousands of subtransactions or deleted files ... we have to
build a very large XLOG Abort record then).

I think our best answer is probably to do some of each.  For (A) it
seems that we should try to flush the pending-trigger-event list and
any executor state data that may be hanging around; those are the things
that seem both easy to delete and likely to be pretty large.  For (B)
there's basically a choice of whether to try to re-use ErrorContext,
or create a separate context used only for the purposes of running
AbortTransaction.  The separate context would avoid any possibility of
entanglement between what are really different subsystems, but OTOH it
seems a bit wasteful.

Comments?
        regards, tom lane
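
[Editor's sketch, not part of the original message.] Plan B above can be pictured as a fixed block of memory set aside while memory is still plentiful, then used only by abort processing. The C below is a minimal illustration under stated assumptions: the names AbortReserve, reserve_init, reserve_alloc, reserve_reset and reserve_release are hypothetical and do not come from the PostgreSQL source, and a plain bump allocator stands in for a real memory context. It also shows the weakness noted in the message: once the abort needs more than was reserved, allocation fails again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size reserve, set aside before it is needed. */
typedef struct AbortReserve
{
    char   *base;   /* start of the preallocated block */
    size_t  size;   /* total bytes reserved */
    size_t  used;   /* bytes handed out so far */
} AbortReserve;

/* Grab the reserve up front; returns 1 on success, 0 on failure. */
static int
reserve_init(AbortReserve *r, size_t size)
{
    assert(r != NULL);
    r->base = malloc(size);
    r->size = (r->base != NULL) ? size : 0;
    r->used = 0;
    return r->base != NULL;
}

/* Bump-allocate from the reserve; NULL once the reserve is exhausted. */
static void *
reserve_alloc(AbortReserve *r, size_t n)
{
    size_t  aligned;
    void   *p;

    assert(r != NULL);
    aligned = (n + 7) & ~(size_t) 7;    /* round up to 8 bytes */
    if (aligned < n || aligned > r->size - r->used)
        return NULL;                    /* overflow, or reserve too small */
    p = r->base + r->used;
    r->used += aligned;
    return p;
}

/* Make the whole reserve available again after an abort completes. */
static void
reserve_reset(AbortReserve *r)
{
    assert(r != NULL);
    r->used = 0;
}

/* Give the block back to the system. */
static void
reserve_release(AbortReserve *r)
{
    assert(r != NULL);
    free(r->base);
    r->base = NULL;
    r->size = 0;
    r->used = 0;
}
```

In this sketch the reserve would be created once at startup and reset after every abort, so the space is there even when ordinary allocation has already failed.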


Re: [BUGS] Out of memory error causes Abort, Abort tries

From: Jeff Davis
Date:
On Wed, 2006-11-22 at 13:06 -0500, Tom Lane wrote:
> > Seems like it'd be smarter to try to free some memory before we push
> > forward with transaction abort.  ErrorContext has only a limited amount
> > of space ...
> 
> I've been thinking more about this problem.  There are two basic
> strategies we could follow to ensure that AbortTransaction has some
> room to work in:
> A: Try to free space before we start the actual abort.
> B: Keep some reserved space for AbortTransaction to use.
> (It seems untenable to try not to ever alloc any memory at all during
> AbortTransaction.)  I'm not sure that either of these can be a 100%
> bulletproof solution.  As long as there is state data we daren't throw

When I was trying to devise a "bulletproof" solution, it seemed the only
way would be to reserve space, but to increase the reserved space
whenever the state changed such that an AbortTransaction would need
extra memory. This didn't seem worth the accounting effort. I made an
attempt, but gave up (I spent all my time in gdb keeping track of which
memory context I was in). If there are a limited number of areas that
increase potential AbortTransaction memory usage, it's a possibility I
suppose. Otherwise, I don't know how we'd expect code authors to know
whether their code increases memory requirements for AbortTransaction.

> away until after AbortTransaction, plan A doesn't help if we've filled
> memory with that type of data.  And plan B doesn't help if
> AbortTransaction needs more memory than we reserved; which seems
> possible for any acceptable level of reserved space (eg consider cases
> with many thousands of subtransactions or deleted files ... we have to
> build a very large XLOG Abort record then).
> 
> I think our best answer is probably to do some of each.  For (A) it
> seems that we should try to flush the pending-trigger-event list and
> any executor state data that may be hanging around; those are the things

That seems like a relatively easy way to eliminate most of the problem.
It might not be 100% bulletproof, but it will drastically reduce the
chances of causing problems in "normal" situations. I'd certainly be
happy with this fix.

> that seem both easy to delete and likely to be pretty large.  For (B)
> there's basically a choice of whether to try to re-use ErrorContext,
> or create a separate context used only for the purposes of running
> AbortTransaction.  The separate context would avoid any possibility of
> entanglement between what are really different subsystems, but OTOH it
> seems a bit wasteful.
> 

Wasteful how? Do you mean that it would clutter the code, or that it
would cause unnecessary overhead? 

Regards,
	Jeff Davis
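
[Editor's sketch, not part of the original message.] The accounting idea described in this message (grow the reserve whenever the state changes so that an abort would need more memory) might look roughly like the C below. All names (AbortBudget, budget_init, budget_charge, budget_release) are hypothetical, and the byte counts a caller passes in are assumed to be known exactly, which is the part the message describes as not worth the effort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical running estimate of what an abort would need right now. */
typedef struct AbortBudget
{
    void   *reserve;    /* memory held back for abort processing */
    size_t  reserved;   /* bytes currently held in reserve */
    size_t  needed;     /* bytes an abort would need in the current state */
} AbortBudget;

static void
budget_init(AbortBudget *b)
{
    assert(b != NULL);
    b->reserve = NULL;
    b->reserved = 0;
    b->needed = 0;
}

/*
 * Call before any state change that makes abort more expensive (for
 * example, one more subtransaction or one more file to delete).
 * Returns 0 if the reserve cannot grow, so the caller can refuse the
 * state change while an abort is still affordable.
 */
static int
budget_charge(AbortBudget *b, size_t extra)
{
    size_t  want;
    void   *p;

    assert(b != NULL);
    if (extra > SIZE_MAX - b->needed)
        return 0;                       /* total would overflow size_t */
    want = b->needed + extra;
    if (want > b->reserved)
    {
        p = realloc(b->reserve, want);
        if (p == NULL)
            return 0;                   /* old reserve is left intact */
        b->reserve = p;
        b->reserved = want;
    }
    b->needed = want;
    return 1;
}

static void
budget_release(AbortBudget *b)
{
    assert(b != NULL);
    free(b->reserve);
    budget_init(b);
}
```

The sketch only works if every code path that adds abort-time work remembers to call budget_charge with the right amount, which is the burden on code authors that the message raises.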





Re: [BUGS] Out of memory error causes Abort, Abort tries to allocate memory

From: Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> When I was trying to devise a "bulletproof" solution, it seemed the only
> way would be to reserve space, but to increase the reserved space
> whenever the state changed such that an AbortTransaction would need
> extra memory. This didn't seem worth the accounting effort.

Yeah.  The problem I saw with it is that it's only "bulletproof" to the
extent that we get that accounting exactly right, and keep it so over
time.  I think that assumption is sufficiently fragile that the extra
safety would be illusory.  There's a lot of stuff that happens during
AbortTransaction :-(

>> there's basically a choice of whether to try to re-use ErrorContext,
>> or create a separate context used only for the purposes of running
>> AbortTransaction.  The separate context would avoid any possibility of
>> entanglement between what are really different subsystems, but OTOH it
>> seems a bit wasteful.

> Wasteful how? Do you mean that it would clutter the code, or that it
> would cause unnecessary overhead? 

Well, it'd be an extra however-many-KB of memory for each backend that
would mostly go unused.  The rest of a backend's working memory pretty
much pulls its weight, but a TransactionAbortContext wouldn't.  OTOH,
maybe these days a few dozen KB isn't worth worrying about.
        regards, tom lane


Re: [BUGS] Out of memory error causes Abort, Abort tries to allocate memory

From: Tom Lane
Date:
I wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> Wasteful how? Do you mean that it would clutter the code, or that it
>> would cause unnecessary overhead? 

> Well, it'd be an extra however-many-KB of memory for each backend that
> would mostly go unused.  The rest of a backend's working memory pretty
> much pulls its weight, but a TransactionAbortContext wouldn't.  OTOH,
> maybe these days a few dozen KB isn't worth worrying about.

After further thought I concluded that overloading ErrorContext for this
purpose is way too risky, so I've committed changes that create a
separate context for AbortTransaction to use:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00199.php

I think the patch would apply cleanly to 8.1 but have not got time to
check it now.  Less sure about older branches.  Are we excited about
trying to back-patch this?  I think it'd really need rather more testing
before I'd want to stick it into the stable branches ...
        regards, tom lane
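
[Editor's sketch, not part of the original message; see the linked commit for the actual change.] The general shape of "a separate context for AbortTransaction to use" can be illustrated with a toy model: a current-context pointer, a switch function that returns the previous context, and an abort routine that does its work in a context created ahead of time. ToyContext, toy_switch_to, toy_alloc and toy_abort are hypothetical names, the contexts only count bytes rather than manage real memory, and the 32 KB size used in the usage note is an arbitrary illustrative value.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory context: it only tracks a byte budget. */
typedef struct ToyContext
{
    const char *name;
    size_t      limit;  /* bytes this context may hand out */
    size_t      used;   /* bytes handed out so far */
} ToyContext;

static ToyContext *current_context = NULL;

/* Make cxt current and return the previously current context. */
static ToyContext *
toy_switch_to(ToyContext *cxt)
{
    ToyContext *old = current_context;

    current_context = cxt;
    return old;
}

/* Charge n bytes to the current context; 0 if it has no room left. */
static int
toy_alloc(size_t n)
{
    assert(current_context != NULL);
    if (n > current_context->limit - current_context->used)
        return 0;
    current_context->used += n;
    return 1;
}

/*
 * Abort path: do the abort-time allocation in the dedicated context,
 * then empty that context and restore the caller's context.
 */
static int
toy_abort(ToyContext *abort_cxt, size_t record_size)
{
    ToyContext *old = toy_switch_to(abort_cxt);
    int         ok = toy_alloc(record_size);

    abort_cxt->used = 0;    /* reserve is whole again for the next abort */
    toy_switch_to(old);
    return ok;
}
```

With a full query context and a 32 KB abort context, toy_abort succeeds for a 4 KB record even though toy_alloc in the query context fails, and it still fails for a 64 KB record, matching the caveat earlier in the thread that no fixed reserve covers every case.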


Re: [BUGS] Out of memory error causes Abort, Abort tries

From: Jeff Davis
Date:
On Wed, 2006-11-22 at 21:41 -0500, Tom Lane wrote:
> I wrote:
> > Jeff Davis <pgsql@j-davis.com> writes:
> >> Wasteful how? Do you mean that it would clutter the code, or that it
> >> would cause unnecessary overhead? 
> 
> > Well, it'd be an extra however-many-KB of memory for each backend that
> > would mostly go unused.  The rest of a backend's working memory pretty
> > much pulls its weight, but a TransactionAbortContext wouldn't.  OTOH,
> > maybe these days a few dozen KB isn't worth worrying about.
> 
> After further thought I concluded that overloading ErrorContext for this
> purpose is way too risky, so I've committed changes that create a
> separate context for AbortTransaction to use:
> http://archives.postgresql.org/pgsql-committers/2006-11/msg00199.php
> 
> I think the patch would apply cleanly to 8.1 but have not got time to
> check it now.  Less sure about older branches.  Are we excited about
> trying to back-patch this?  I think it'd really need rather more testing
> before I'd want to stick it into the stable branches ...
> 

Everything in the patch makes sense to me, and it certainly solved my
test case.

I'm just fine with the fix only in 8.2. I can work around it until I can
upgrade.

Thanks!
	Jeff Davis