Re: Add the ability to limit the amount of memory that can be allocated to backends.
From | Andrei Lepikhov
Subject | Re: Add the ability to limit the amount of memory that can be allocated to backends.
Date |
Msg-id | c6520d47-e584-4287-833c-82779cc166e0@postgrespro.ru
In reply to | Re: Add the ability to limit the amount of memory that can be allocated to backends. (Stephen Frost <sfrost@snowman.net>)
Responses | Re: Add the ability to limit the amount of memory that can be allocated to backends. (Stephen Frost <sfrost@snowman.net>)
List | pgsql-hackers
On 20/10/2023 05:06, Stephen Frost wrote:
> Greetings,
>
> * Andrei Lepikhov (a.lepikhov@postgrespro.ru) wrote:
>> On 19/10/2023 02:00, Stephen Frost wrote:
>>> * Andrei Lepikhov (a.lepikhov@postgrespro.ru) wrote:
>>>> On 29/9/2023 09:52, Andrei Lepikhov wrote:
>>>>> On 22/5/2023 22:59, reid.thompson@crunchydata.com wrote:
>>>>>> Attach patches updated to master.
>>>>>> Pulled from patch 2 back to patch 1 a change that was also pertinent
>>>>>> to patch 1.
>>>>> +1 to the idea, have doubts on the implementation.
>>>>>
>>>>> I have a question. I see the feature triggers ERROR on the exceeding of
>>>>> the memory limit. The superior PG_CATCH() section will handle the error.
>>>>> As I see, many such sections use memory allocations. What if some
>>>>> routine, like the CopyErrorData(), exceeds the limit, too? In this case,
>>>>> we could repeat the error until the top PG_CATCH(). Is this correct
>>>>> behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for
>>>>> recursion and allow error handlers to slightly exceed this hard limit?
>>>
>>>> By the patch in attachment I try to show which sort of problems I'm worrying
>>>> about. In some PG_CATCH() sections we do CopyErrorData (allocate some
>>>> memory) before aborting the transaction. So, the allocation error can move
>>>> us out of this section before aborting. We await for soft ERROR message but
>>>> will face more hard consequences.
>>>
>>> While it's an interesting idea to consider making exceptions to the
>>> limit, and perhaps we'll do that (or have some kind of 'reserve' for
>>> such cases), this isn't really any different than today, is it? We
>>> might have a malloc() failure in the main path, end up in PG_CATCH() and
>>> then try to do a CopyErrorData() and have another malloc() failure.
>>>
>>> If we can rearrange the code to make this less likely to happen, by
>>> doing a bit more work to free() resources used in the main path before
>>> trying to do new allocations, then, sure, let's go ahead and do that,
>>> but that's independent from this effort.
>>
>> I agree that rearranging efforts can be made independently. The code in the
>> letter above was shown just as a demo of the case I'm worried about.
>> IMO, the thing that should be implemented here is a recursion level for the
>> memory limit. If processing the error, we fall into recursion with this
>> limit - we should ignore it.
>> I imagine custom extensions that use PG_CATCH() and allocate some data
>> there. At least we can raise the level of error to FATAL.
>
> Ignoring such would defeat much of the point of this effort- which is to
> get to a position where we can say with some confidence that we're not
> going to go over some limit that the user has set and therefore not
> allow ourselves to end up getting OOM killed. These are all the same
> issues that already exist today on systems which don't allow overcommit
> too, there isn't anything new here in regards to these risks, so I'm not
> really keen to complicate this to deal with issues that are already
> there.
>
> Perhaps once we've got the basics in place then we could consider
> reserving some space for handling such cases.. but I don't think it'll
> actually be very clean and what if we have an allocation that goes
> beyond what that reserved space is anyway? Then we're in the same spot
> again where we have the choice of either failing the allocation in a
> less elegant way than we might like to handle that error, or risk
> getting outright kill'd by the kernel. Of those choices, sure seems
> like failing the allocation is the better way to go.

I've got your point. The only issue I worry about is the uncertainty and
clutter this feature can create. In the worst case, when we have a complex
error stack (including an extension's PG_CATCH() sections, exceptions in
stored procedures, etc.), the backend will throw the memory-limit error
repeatedly. Of course, one failed backend looks better than a surprisingly
killed postmaster, but the mix of different error reports and details is
messy and hard to debug when trouble strikes. So, may we throw a FATAL error
if we reach this limit while handling an exception?
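To make the failure mode concrete, here is a minimal sketch of the handler
pattern in question; it is illustrative only, not code from the patch.
CopyErrorData(), FlushErrorState() and FreeErrorData() are the usual core
routines, while run_with_handler() and do_risky_work() are hypothetical
names, and exceeds_max_total_bkend_mem() from the patch appears only in a
comment:

#include "postgres.h"

/* Hypothetical worker: anything that palloc()s and may ERROR out. */
static void
do_risky_work(void)
{
    (void) palloc(1024 * 1024);
}

static void
run_with_handler(void)
{
    MemoryContext oldcxt = CurrentMemoryContext;

    PG_TRY();
    {
        do_risky_work();    /* may hit the backend memory limit and ERROR */
    }
    PG_CATCH();
    {
        ErrorData  *edata;

        /*
         * CopyErrorData() palloc()s a copy of the error data, so with the
         * patch applied this very call can trip exceeds_max_total_bkend_mem()
         * again and raise a second ERROR, which unwinds past this handler
         * before the transaction is aborted and the cleanup below runs.
         */
        MemoryContextSwitchTo(oldcxt);
        edata = CopyErrorData();
        FlushErrorState();

        /* ... inspect or log edata, clean up ... */
        FreeErrorData(edata);
    }
    PG_END_TRY();
}

It is that second, nested ERROR that produces the repeated, mixed error
reports described above, which is why escalating to FATAL inside error
recursion looks attractive to me.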
--
regards, Andrey Lepikhov
Postgres Professional