Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault
Дата
Msg-id 3399097.1709501969@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault  (Alexander Lakhin <exclusion@gmail.com>)
Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault  (Alexander Korotkov <aekorotkov@gmail.com>)
Список pgsql-bugs
I wrote:
> I find this in [1]:
> 
>   The C language stack growth does an implicit mremap. If you want absolute
>   guarantees and run close to the edge you MUST mmap your stack for the 
>   largest size you think you will need. For typical stack usage this does
>   not matter much but it's a corner case if you really really care
> 
> Seems like we need to do some more work at startup to enforce that
> we have the amount of stack we think we do, if we're on Linux.

After thinking about that some more, I'm really quite unenthused about
trying to remap the stack for ourselves.  It'd be both platform- and
architecture-dependent, and I'm afraid it'd introduce as many failure
modes as it removes.  (Notably, I'm not sure we could guarantee
there's a guard page below the stack.)  Since we've not seen reports
of this failure from the wild, I doubt it's worth the trouble.

I do think it's probably worth reducing MemoryContextDelete's stack
usage to O(1), just to ensure we can't get into stack trouble during
transaction abort.  That's not hard at all, as attached.

I tried to make MemoryContextResetChildren work similarly, but that
doesn't work because if we're not removing child contexts then we
need extra state to tell which ones we've done already.  For the
same reason my idea for bounding the stack space needed by
MemoryContextStats doesn't seem to work.  We could possibly make it
work if we were willing to add a temporary-use pointer field to all
MemoryContext headers, but I'm unconvinced that'd be a good tradeoff.

            regards, tom lane

diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 41f2390fb8..7cc220bca0 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -447,14 +447,37 @@ MemoryContextDelete(MemoryContext context)
 void
 MemoryContextDeleteChildren(MemoryContext context)
 {
+    MemoryContext cur,
+                next;
+
     Assert(MemoryContextIsValid(context));
 
     /*
-     * MemoryContextDelete will delink the child from me, so just iterate as
-     * long as there is a child.
+     * We iterate rather than recursing, so that cleaning up a deep nest of
+     * contexts doesn't require unbounded stack space.  (This avoids possible
+     * failure during transaction cleanup, which would be bad.)  This works
+     * because by the time we traverse back up to a parent context, it will
+     * have no remaining children and will be seen as deletable.
      */
-    while (context->firstchild != NULL)
-        MemoryContextDelete(context->firstchild);
+    cur = context->firstchild;
+    while (cur != NULL)
+    {
+        /* Descend until we find a childless context */
+        if (cur->firstchild != NULL)
+        {
+            cur = cur->firstchild;
+            continue;
+        }
+        /* We can delete cur, but first remember the next deletion target */
+        if (cur->nextchild != NULL)
+            next = cur->nextchild;
+        else if (cur->parent != context)
+            next = cur->parent;
+        else
+            next = NULL;
+        MemoryContextDelete(cur);
+        cur = next;
+    }
 }
 
 /*

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker