Re: [BUGS] Out of memory error causes Abort, Abort tries to allocate memory

From: Tom Lane
Subject: Re: [BUGS] Out of memory error causes Abort, Abort tries to allocate memory
Msg-id: 2511.1164218816@sss.pgh.pa.us
Responses: Re: [BUGS] Out of memory error causes Abort, Abort tries  (Jeff Davis <pgsql@j-davis.com>)
List: pgsql-hackers

I wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Jeff Davis wrote:
> * smgrGetPendingDeletes() calls palloc()
> * palloc() fails, resulting in ERROR, causing infinite recursion

>> Hmm, maybe we could have AbortTransaction switch to ErrorContext, which
>> has some preallocated space, before calling RecordTransactionAbort (or
>> maybe have RecordTransactionAbort itself do it).

> Seems like it'd be smarter to try to free some memory before we push
> forward with transaction abort.  ErrorContext has only a limited amount
> of space ...

I've been thinking more about this problem.  There are two basic
strategies we could follow to ensure that AbortTransaction has some
room to work in:
	A: Try to free space before we start the actual abort.
	B: Keep some reserved space for AbortTransaction to use.
(It seems untenable to try not to ever alloc any memory at all during
AbortTransaction.)  I'm not sure that either of these can be a 100%
bulletproof solution.  As long as there is state data we daren't throw
away until after AbortTransaction, plan A doesn't help if we've filled
memory with that type of data.  And plan B doesn't help if
AbortTransaction needs more memory than we reserved, which seems
possible for any acceptable level of reserved space (e.g., consider cases
with many thousands of subtransactions or deleted files ... we have to
build a very large XLOG Abort record then).
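
To make the plan B limitation concrete, here is a rough sketch in plain C.
It is a toy byte-budget model, not PostgreSQL code: MemBudget, normal_alloc,
abort_alloc and abort_record_size are hypothetical names, and the per-entry
sizes are made-up numbers chosen only to show the record growing with the
number of subtransactions and deleted files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: memory is a byte budget, nothing is really allocated. */
typedef struct {
    size_t general_free;  /* bytes left for ordinary work */
    size_t reserve_free;  /* bytes set aside for abort processing only */
} MemBudget;

/* Ordinary allocation: may draw only on the general budget. */
bool normal_alloc(MemBudget *m, size_t n)
{
    assert(m != NULL);
    if (n > m->general_free)
        return false;
    m->general_free -= n;
    return true;
}

/* Abort-time allocation: use general space first, then the reserve. */
bool abort_alloc(MemBudget *m, size_t n)
{
    assert(m != NULL);
    if (n <= m->general_free) {
        m->general_free -= n;
        return true;
    }
    size_t from_reserve = n - m->general_free;
    if (from_reserve > m->reserve_free)
        return false;           /* reserve too small; nothing is consumed */
    m->general_free = 0;
    m->reserve_free -= from_reserve;
    return true;
}

/* ASSUMED sizes: 32-byte header, 4 bytes per subxact, 12 bytes per file. */
size_t abort_record_size(size_t nsubxacts, size_t nfiles)
{
    return 32 + 4 * nsubxacts + 12 * nfiles;
}
```

With an 8192-byte reserve and no general space left, a record for 10
subtransactions and 5 files fits easily, while one for 5000 subtransactions
does not, whatever fixed reserve size was picked in advance.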

I think our best answer is probably to do some of each.  For (A) it
seems that we should try to flush the pending-trigger-event list and
any executor state data that may be hanging around; those are the things
that seem both easy to delete and likely to be pretty large.  For (B)
there's basically a choice of whether to try to re-use ErrorContext,
or create a separate context used only for the purposes of running
AbortTransaction.  The separate context would avoid any possibility of
entanglement between what are really different subsystems, but OTOH it
seems a bit wasteful.
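
As a rough illustration of doing some of each, the sketch below frees the
discardable state first (A) and then lets abort use space that ordinary
allocations may not touch (B).  Again this is a toy byte-count model in plain
C, not PostgreSQL code; Backend, release_discardable, normal_has_room and
abort_has_room are hypothetical names, and whether the reserve lives in
ErrorContext or in a separate context is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one backend's memory, tracked as byte counts only. */
typedef struct {
    size_t capacity;        /* total bytes available */
    size_t reserve;         /* part of capacity usable only during abort */
    size_t trigger_events;  /* pending-trigger-event list: discardable */
    size_t executor_state;  /* leftover executor state: discardable */
    size_t must_keep;       /* state needed until abort has finished */
} Backend;

size_t bytes_in_use(const Backend *b)
{
    return b->trigger_events + b->executor_state + b->must_keep;
}

/* Ordinary work must leave the reserve untouched. */
bool normal_has_room(const Backend *b, size_t need)
{
    size_t limit = b->capacity - b->reserve;
    size_t used = bytes_in_use(b);
    return used <= limit && need <= limit - used;
}

/* Plan A: drop what abort does not need; returns the bytes released. */
size_t release_discardable(Backend *b)
{
    size_t freed = b->trigger_events + b->executor_state;
    b->trigger_events = 0;
    b->executor_state = 0;
    return freed;
}

/* Plans A and B together: free first, then count the reserve as usable. */
bool abort_has_room(Backend *b, size_t need)
{
    assert(b != NULL);
    release_discardable(b);
    size_t used = bytes_in_use(b);
    return used <= b->capacity && need <= b->capacity - used;
}
```

If memory is full of trigger events and executor state, freeing them gives
abort plenty of room; if it is full of must_keep data, only the reserve is
left, so a request larger than the reserve still fails.  That is the "neither
is bulletproof" case in miniature.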

Comments?
        regards, tom lane

