Re: ERROR: too many dynamic shared memory segments

From: Thomas Munro
Subject: Re: ERROR: too many dynamic shared memory segments
Date:
Msg-id: CAEepm=0kADK5inNf_KuemjX=HQ=PuTP0DykM--fO5jS5ePVFEA@mail.gmail.com
In response to: Re: ERROR: too many dynamic shared memory segments  (Jakub Glapa <jakub.glapa@gmail.com>)
Responses: Re: ERROR: too many dynamic shared memory segments  (Dilip Kumar <dilipbalaut@gmail.com>)
           Re: ERROR: too many dynamic shared memory segments  (Dilip Kumar <dilipbalaut@gmail.com>)
           Re: ERROR: too many dynamic shared memory segments  (Jakub Glapa <jakub.glapa@gmail.com>)
List: pgsql-general
On Tue, Nov 28, 2017 at 10:05 AM, Jakub Glapa <jakub.glapa@gmail.com> wrote:
> As for the crash. I dug up the initial log and it looks like a segmentation
> fault...
>
> 2017-11-23 07:26:53 CET:192.168.10.83(35238):user@db:[30003]: ERROR:  too
> many dynamic shared memory segments

Hmm.  Well this error can only occur in dsm_create() called without
DSM_CREATE_NULL_IF_MAXSEGMENTS.  parallel.c calls it with that flag
and dsa.c doesn't (perhaps it should, not sure, but that'd just change
the error message), so that means the error arose from dsa.c
trying to get more segments.  That would be when Parallel Bitmap Heap
Scan tried to allocate memory.
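
For anyone following along in the source, the difference between the two
call sites looks roughly like this.  This is a hypothetical helper written
only to illustrate the two patterns, not code from the tree; dsm_create()
and DSM_CREATE_NULL_IF_MAXSEGMENTS are the real API:

    #include "postgres.h"
    #include "storage/dsm.h"

    /*
     * Hypothetical helper, for illustration only.  parallel.c passes
     * DSM_CREATE_NULL_IF_MAXSEGMENTS, so when every slot is taken it gets
     * back NULL and can fall back to backend-private memory; dsa.c calls
     * dsm_create() with no flags, so the same shortage raises
     * "ERROR: too many dynamic shared memory segments".
     */
    static dsm_segment *
    try_create_segment(Size size, bool null_if_full)
    {
        if (null_if_full)
            return dsm_create(size, DSM_CREATE_NULL_IF_MAXSEGMENTS);   /* may be NULL */

        return dsm_create(size, 0);     /* ereport(ERROR) if no slot is free */
    }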

I hacked my copy of PostgreSQL so that it allows only 5 DSM slots (a
sketch of that hack follows the stack trace below) and managed to
reproduce a segv crash by running concurrent Parallel Bitmap Heap Scans.
The stack looks like this:
  * frame #0: 0x00000001083ace29 postgres`alloc_object(area=0x0000000000000000, size_class=10) + 25 at dsa.c:1433
    frame #1: 0x00000001083acd14 postgres`dsa_allocate_extended(area=0x0000000000000000, size=72, flags=4) + 1076 at dsa.c:785
    frame #2: 0x0000000108059c33 postgres`tbm_prepare_shared_iterate(tbm=0x00007f9743027660) + 67 at tidbitmap.c:780
    frame #3: 0x0000000108000d57 postgres`BitmapHeapNext(node=0x00007f9743019c88) + 503 at nodeBitmapHeapscan.c:156
    frame #4: 0x0000000107fefc5b postgres`ExecScanFetch(node=0x00007f9743019c88, accessMtd=(postgres`BitmapHeapNext at nodeBitmapHeapscan.c:77), recheckMtd=(postgres`BitmapHeapRecheck at nodeBitmapHeapscan.c:710)) + 459 at execScan.c:95
    frame #5: 0x0000000107fef983 postgres`ExecScan(node=0x00007f9743019c88, accessMtd=(postgres`BitmapHeapNext at nodeBitmapHeapscan.c:77), recheckMtd=(postgres`BitmapHeapRecheck at nodeBitmapHeapscan.c:710)) + 147 at execScan.c:162
    frame #6: 0x00000001080008d1 postgres`ExecBitmapHeapScan(pstate=0x00007f9743019c88) + 49 at nodeBitmapHeapscan.c:735
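
To be clear about the "only 5 DSM slots" hack mentioned above: I believe
the slot budget is sized at postmaster startup in
src/backend/storage/ipc/dsm.c, so a local clamp there is enough to starve
concurrent parallel queries of segments.  A sketch, assuming the
PostgreSQL 10 layout of dsm_postmaster_startup(); this is a
reproduction-only hack, not a patch:

    /* slot budget as computed in dsm_postmaster_startup() */
    maxitems = PG_DYNSHMEM_FIXED_SLOTS
        + PG_DYNSHMEM_SLOTS_PER_BACKEND * MaxBackends;

    /* reproduction-only clamp so concurrent parallel queries run dry */
    maxitems = Min(maxitems, 5);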

(lldb) f 3
frame #3: 0x0000000108000d57 postgres`BitmapHeapNext(node=0x00007f9743019c88) + 503 at nodeBitmapHeapscan.c:156
   153             * dsa_pointer of the iterator state which will be used by
   154             * multiple processes to iterate jointly.
   155             */
-> 156             pstate->tbmiterator = tbm_prepare_shared_iterate(tbm);
   157     #ifdef USE_PREFETCH
   158             if (node->prefetch_maximum > 0)
   159
 
(lldb) print tbm->dsa
(dsa_area *) $3 = 0x0000000000000000
(lldb) print node->ss.ps.state->es_query_dsa
(dsa_area *) $5 = 0x0000000000000000
(lldb) f 17
frame #17: 0x000000010800363b postgres`ExecGather(pstate=0x00007f9743019320) + 635 at nodeGather.c:220
   217      * Get next tuple, either from one of our workers, or by running the plan
   218      * ourselves.
   219      */
-> 220     slot = gather_getnext(node);
   221     if (TupIsNull(slot))
   222         return NULL;
   223
(lldb) print *node->pei
(ParallelExecutorInfo) $8 = {
  planstate = 0x00007f9743019640
  pcxt = 0x00007f97450001b8
  buffer_usage = 0x0000000108b7e218
  instrumentation = 0x0000000108b7da38
  area = 0x0000000000000000
  param_exec = 0
  finished = '\0'
  tqueue = 0x0000000000000000
  reader = 0x0000000000000000
}
(lldb) print *node->pei->pcxt
warning: could not load any Objective-C class information. This will
significantly reduce the quality of type information available.
(ParallelContext) $9 = {
  node = {
    prev = 0x000000010855fb60
    next = 0x000000010855fb60
  }
  subid = 1
  nworkers = 0
  nworkers_launched = 0
  library_name = 0x00007f9745000248 "postgres"
  function_name = 0x00007f9745000268 "ParallelQueryMain"
  error_context_stack = 0x0000000000000000
  estimator = (space_for_chunks = 180352, number_of_keys = 19)
  seg = 0x0000000000000000
  private_memory = 0x0000000108b53038
  toc = 0x0000000108b53038
  worker = 0x0000000000000000
}

I think there are two failure modes here: one of your sessions hit the
"too many ..." error (that's good: it ran out of slots and said so, and
our error machinery worked as it should), while another crashed with a
segfault because it tried to use a NULL "area" pointer (bad).  I think
this is a degenerate case where we completely failed to launch the
parallel query, but we ran the parallel plan anyway, and this code
assumes the DSA area is available.  Oops.
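
To make the hole concrete, here is a purely illustrative guard, not a
proposed fix: BitmapHeapNext() could in principle notice that no
per-query DSA area was set up and fall back to a backend-private
iterator instead of dereferencing the NULL area:

    /*
     * Illustrative guard only, not a committed fix: if parallel setup
     * failed to create a DSA area, es_query_dsa is NULL and a shared
     * iterator cannot be built, so fall back to a private iterator.
     */
    if (node->ss.ps.state->es_query_dsa == NULL)
        node->tbmiterator = tbm_begin_iterate(tbm);     /* private fallback */
    else
        pstate->tbmiterator = tbm_prepare_shared_iterate(tbm);

Whether the executor should ever reach this code without a DSA area at
all is the more fundamental question, of course.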

-- 
Thomas Munro
http://www.enterprisedb.com

