Re: Add the ability to limit the amount of memory that can be allocated to backends.

From: Andrei Lepikhov
Subject: Re: Add the ability to limit the amount of memory that can be allocated to backends.
Date:
Msg-id: c6520d47-e584-4287-833c-82779cc166e0@postgrespro.ru
In reply to: Re: Add the ability to limit the amount of memory that can be allocated to backends.  (Stephen Frost <sfrost@snowman.net>)
Replies: Re: Add the ability to limit the amount of memory that can be allocated to backends.  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers
On 20/10/2023 05:06, Stephen Frost wrote:
> Greetings,
> 
> * Andrei Lepikhov (a.lepikhov@postgrespro.ru) wrote:
>> On 19/10/2023 02:00, Stephen Frost wrote:
>>> * Andrei Lepikhov (a.lepikhov@postgrespro.ru) wrote:
>>>> On 29/9/2023 09:52, Andrei Lepikhov wrote:
>>>>> On 22/5/2023 22:59, reid.thompson@crunchydata.com wrote:
>>>>>> Attach patches updated to master.
>>>>>> Pulled from patch 2 back to patch 1 a change that was also pertinent
>>>>>> to patch 1.
>>>>> +1 to the idea, but I have doubts about the implementation.
>>>>>
>>>>> I have a question. I see the feature triggers an ERROR when the memory
>>>>> limit is exceeded. The enclosing PG_CATCH() section will handle the
>>>>> error. As I see it, many such sections allocate memory. What if some
>>>>> routine, like CopyErrorData(), exceeds the limit too? In that case, we
>>>>> could keep repeating the error all the way up to the top PG_CATCH(). Is
>>>>> this correct behaviour? Maybe check for recursion in
>>>>> exceeds_max_total_bkend_mem() and allow error handlers to slightly
>>>>> exceed this hard limit?
>>>
>>>> With the attached patch I try to show the sort of problem I'm worried
>>>> about. In some PG_CATCH() sections we call CopyErrorData() (which
>>>> allocates memory) before aborting the transaction. So an allocation
>>>> error can throw us out of that section before the abort. We expect a
>>>> soft ERROR message but face much harder consequences.
>>>
>>> While it's an interesting idea to consider making exceptions to the
>>> limit, and perhaps we'll do that (or have some kind of 'reserve' for
>>> such cases), this isn't really any different than today, is it?  We
>>> might have a malloc() failure in the main path, end up in PG_CATCH() and
>>> then try to do a CopyErrorData() and have another malloc() failure.
>>>
>>> If we can rearrange the code to make this less likely to happen, by
>>> doing a bit more work to free() resources used in the main path before
>>> trying to do new allocations, then, sure, let's go ahead and do that,
>>> but that's independent from this effort.
>>
>> I agree that the rearranging can be done independently. The code in the
>> letter above was shown just as a demo of the case I'm worried about.
>> IMO, what should be implemented here is a recursion level for the
>> memory limit: if, while processing the error, we hit this limit again,
>> we should ignore it.
>> I imagine custom extensions that use PG_CATCH() and allocate some data
>> there. At the very least, we could raise the error level to FATAL.
> 
> Ignoring such would defeat much of the point of this effort - which is to
> get to a position where we can say with some confidence that we're not
> going to go over some limit that the user has set and therefore not
> allow ourselves to end up getting OOM killed.  These are all the same
> issues that already exist today on systems which don't allow overcommit
> too, there isn't anything new here in regards to these risks, so I'm not
> really keen to complicate this to deal with issues that are already
> there.
> 
> Perhaps once we've got the basics in place then we could consider
> reserving some space for handling such cases..  but I don't think it'll
> actually be very clean and what if we have an allocation that goes
> beyond what that reserved space is anyway?  Then we're in the same spot
> again where we have the choice of either failing the allocation in a
> less elegant way than we might like to handle that error, or risk
> getting outright kill'd by the kernel.  Of those choices, sure seems
> like failing the allocation is the better way to go.

I've got your point.
The only issue I worry about is the uncertainty and clutter this feature 
can create. In the worst case, when we have a complex error stack 
(including an extension's PG_CATCH() sections, exceptions in stored 
procedures, etc.), the backend will throw the memory-limit error 
repeatedly; a sketch of the pattern I mean is below. Of course, one 
failed backend looks better than an unexpectedly killed postmaster, but 
the mix of different error reports and details is awful and hard to 
debug when trouble comes. So, could we throw a FATAL error if we reach 
this limit while handling an exception?
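
To make the worry concrete, here is a simplified, hypothetical sketch of 
the PG_CATCH() pattern discussed above. It is not code from the patch or 
from any particular extension, and do_something_that_allocates() stands 
in for whatever work may trip the backend memory limit:

#include "postgres.h"

extern void do_something_that_allocates(void);	/* hypothetical workload */

static void
risky_operation_with_catch(void)
{
	MemoryContext oldcxt = CurrentMemoryContext;

	PG_TRY();
	{
		do_something_that_allocates();	/* may hit the backend memory limit */
	}
	PG_CATCH();
	{
		ErrorData  *edata;

		/* Switch to our own context; CopyErrorData() must not run in ErrorContext. */
		MemoryContextSwitchTo(oldcxt);

		/*
		 * CopyErrorData() palloc()s a copy of the error.  If the backend is
		 * already at the limit, this allocation can trip the limit again and
		 * throw a new ERROR before we ever reach AbortCurrentTransaction(),
		 * so we leave this PG_CATCH() block with the transaction still not
		 * aborted.
		 */
		edata = CopyErrorData();
		FlushErrorState();

		/* ... inspect or log edata, then re-throw or recover ... */
		FreeErrorData(edata);
	}
	PG_END_TRY();
}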
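
Here is a minimal sketch of what I mean by escalating to FATAL. It 
assumes a check like the exceeds_max_total_bkend_mem() mentioned 
upthread (the exact signature is a guess); enforce_backend_mem_limit() 
and the static flag are purely illustrative:

#include "postgres.h"

/* From the patch; the exact signature is assumed here. */
extern bool exceeds_max_total_bkend_mem(Size requested);

/* Would be reset during error cleanup (not shown). */
static bool reporting_mem_limit = false;

static void
enforce_backend_mem_limit(Size requested)
{
	if (!exceeds_max_total_bkend_mem(requested))
		return;

	if (reporting_mem_limit)
	{
		/*
		 * We hit the limit again while already reporting a limit error,
		 * i.e. an allocation made during error recovery (CopyErrorData()
		 * and friends) pushed us over the edge as well.  Emit one FATAL
		 * instead of piling up nested ERROR reports.
		 */
		ereport(FATAL,
				(errcode(ERRCODE_OUT_OF_MEMORY),
				 errmsg("backend memory limit exceeded while handling a previous error")));
	}

	reporting_mem_limit = true;
	ereport(ERROR,
			(errcode(ERRCODE_OUT_OF_MEMORY),
			 errmsg("backend memory limit exceeded")));
}

A real implementation would have to decide where exactly to reset the 
flag, probably together with the existing error-recursion handling in 
elog.c.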

-- 
regards,
Andrey Lepikhov
Postgres Professional



