Re: BUG #18334: Segfault when running a query with parallel workers
From | Marcin Barczyński |
---|---|
Subject | Re: BUG #18334: Segfault when running a query with parallel workers |
Date | |
Msg-id | CAP3o3Pcv+Mo0Vmo_A8Ev7mOU1qwdSFvxSBMhL3-axESsruhTNw@mail.gmail.com |
In reply to | Re: BUG #18334: Segfault when running a query with parallel workers (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: BUG #18334: Segfault when running a query with parallel workers |
List | pgsql-bugs |
Hi Thomas,

On Sun, Feb 11, 2024 at 10:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> Could you please show EXPLAIN ANALYZE for the query? In gdb from that
> core, can you please show "info proc mappings", and in frame 0 "print
> *area", and in frame 1, "print *tuple" and "print *hashtable"?

I'm sorry for my late reply. It happened again, and I'm pasting the info you requested from the core. PostgreSQL 13.15.

Stack trace:

    #0 0x000056134d5bb011 in dsa_free (area=0x56134e07d718, dp=<optimized out>) at utils/mmgr/./build/../src/backend/utils/mmgr/dsa.c:840
    840 utils/mmgr/./build/../src/backend/utils/mmgr/dsa.c: No such file or directory.
    (gdb) bt
    #0 0x000056134d5bb011 in dsa_free (area=0x56134e07d718, dp=<optimized out>) at utils/mmgr/./build/../src/backend/utils/mmgr/dsa.c:840
    #1 0x000056134d2d6a0c in ExecHashTableDetachBatch (hashtable=hashtable@entry=0x56134e154540) at executor/./build/../src/backend/executor/nodeHash.c:3181
    #2 0x000056134d2d821a in ExecParallelHashJoinNewBatch (hjstate=0x56134e087b48) at executor/./build/../src/backend/executor/nodeHashjoin.c:1131
    #3 ExecHashJoinImpl (parallel=<optimized out>, pstate=<optimized out>) at executor/./build/../src/backend/executor/nodeHashjoin.c:590
    #4 ExecParallelHashJoin (pstate=<optimized out>) at executor/./build/../src/backend/executor/nodeHashjoin.c:637
    #5 0x000056134d2bbffd in ExecProcNodeInstr (node=0x56134e087b48) at executor/./build/../src/backend/executor/execProcnode.c:467
    #6 0x000056134d2b1bbd in ExecProcNode (node=0x56134e087b48) at executor/./build/../src/include/executor/executor.h:248
    #7 ExecutePlan (execute_once=<optimized out>, dest=0x56134dfe1fe8, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x56134e087b48, estate=0x56134e087858) at executor/./build/../src/backend/executor/execMain.c:1632
    #8 standard_ExecutorRun (queryDesc=0x56134e0783e0, direction=<optimized out>, count=0, execute_once=<optimized out>) at executor/./build/../src/backend/executor/execMain.c:350
    #9 0x00007f3a734c9f25 in pgss_ExecutorRun (queryDesc=0x56134e0783e0, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:1045
    #10 0x00007f3a771296d2 in explain_ExecutorRun (queryDesc=0x56134e0783e0, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at ./build/../contrib/auto_explain/auto_explain.c:334
    #11 0x000056134d2b8729 in ExecutorRun (execute_once=true, count=<optimized out>, direction=ForwardScanDirection, queryDesc=0x56134e0783e0) at executor/./build/../src/backend/executor/execMain.c:292
    #12 ParallelQueryMain (seg=seg@entry=0x56134df98db8, toc=toc@entry=0x7f321dfa4000) at executor/./build/../src/backend/executor/execParallel.c:1448
    #13 0x000056134d1767ce in ParallelWorkerMain (main_arg=<optimized out>) at access/transam/./build/../src/backend/access/transam/parallel.c:1494
    #14 0x000056134d3b981a in StartBackgroundWorker () at postmaster/./build/../src/backend/postmaster/bgworker.c:890
    #15 0x000056134d3c963e in do_start_bgworker (rw=<optimized out>) at postmaster/./build/../src/backend/postmaster/postmaster.c:5896
    #16 maybe_start_bgworkers () at postmaster/./build/../src/backend/postmaster/postmaster.c:6121
    #17 0x000056134d3c988d in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster/./build/../src/backend/postmaster/postmaster.c:5281
    #18 <signal handler called>
    #19 0x00007f3a761ac59d in __GI___select (nfds=nfds@entry=8, readfds=readfds@entry=0x7fff97c44720, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff97c44680) at ../sysdeps/unix/sysv/linux/select.c:69
    #20 0x000056134d3caa16 in ServerLoop () at postmaster/./build/../src/backend/postmaster/postmaster.c:1706
    #21 0x000056134d3cc725 in PostmasterMain (argc=5, argv=<optimized out>) at postmaster/./build/../src/backend/postmaster/postmaster.c:1415
    #22 0x000056134d0e0377 in main (argc=5, argv=0x56134de8d300) at main/./build/../src/backend/main/main.c:210

    (gdb) info proc mappings
    Mapped address spaces:

    Start Addr      End Addr        Size        Offset    objfile
    0x56134cfab000  0x56134d068000  0xbd000     0x0       /usr/lib/postgresql/13/bin/postgres
    0x56134d068000  0x56134d60b000  0x5a3000    0xbd000   /usr/lib/postgresql/13/bin/postgres
    0x56134d60b000  0x56134d827000  0x21c000    0x660000  /usr/lib/postgresql/13/bin/postgres
    0x56134d827000  0x56134d845000  0x1e000     0x87b000  /usr/lib/postgresql/13/bin/postgres
    0x56134d845000  0x56134d854000  0xf000      0x899000  /usr/lib/postgresql/13/bin/postgres
    0x7f2e9599e000  0x7f2f1599e000  0x80000000  0x0       /dev/shm/PostgreSQL.940706000

    (gdb) print *area
    $1 = {control = 0x7f321dfa4500, mapping_pinned = false, segment_maps = {
      {segment = 0x0, mapped_address = 0x7f321dfa4500 "", header = 0x7f321dfa4500, fpm = 0x7f321dfa5d20, pagemap = 0x7f321dfa6168},
      {segment = 0x56134dfa1ec8, mapped_address = 0x7f3216cd8000 "", header = 0x7f3216cd8000, fpm = 0x7f3216cd8038, pagemap = 0x7f3216cd8480},
      {segment = 0x56134dfa1f18, mapped_address = 0x7f31f6bd7000 "", header = 0x7f31f6bd7000, fpm = 0x7f31f6bd7038, pagemap = 0x7f31f6bd7480},
      {segment = 0x56134dfa2078, mapped_address = 0x7f30d60a6000 "", header = 0x7f30d60a6000, fpm = 0x7f30d60a6038, pagemap = 0x7f30d60a6480},
      {segment = 0x56134dfa2118, mapped_address = 0x7f30d58a6000 "", header = 0x7f30d58a6000, fpm = 0x7f30d58a6038, pagemap = 0x7f30d58a6480},
      {segment = 0x56134dfa20c8, mapped_address = 0x7f30d5ca6000 "", header = 0x7f30d5ca6000, fpm = 0x7f30d5ca6038, pagemap = 0x7f30d5ca6480},
      {segment = 0x56134dfa2168, mapped_address = 0x7f30d50a6000 "", header = 0x7f30d50a6000, fpm = 0x7f30d50a6038, pagemap = 0x7f30d50a6480},
      {segment = 0x56134dfa21b8, mapped_address = 0x7f30d449e000 "", header = 0x7f30d449e000, fpm = 0x7f30d449e038, pagemap = 0x7f30d449e480},
      {segment = 0x56134dfa2208, mapped_address = 0x7f30d2c90000 "", header = 0x7f30d2c90000, fpm = 0x7f30d2c90038, pagemap = 0x7f30d2c90480},
      {segment = 0x56134dfa2258, mapped_address = 0x7f30cfc76000 "", header = 0x7f30cfc76000, fpm = 0x7f30cfc76038, pagemap = 0x7f30cfc76480},
      {segment = 0x56134ee12048, mapped_address = 0x7f307599e000 "", header = 0x7f307599e000, fpm = 0x7f307599e038, pagemap = 0x7f307599e480},
      {segment = 0x56134ee11ff8, mapped_address = 0x7f307b9d0000 "", header = 0x7f307b9d0000, fpm = 0x7f307b9d0038, pagemap = 0x7f307b9d0480},
      {segment = 0x56134ee11fa8, mapped_address = 0x7f3087a32000 "", header = 0x7f3087a32000, fpm = 0x7f3087a32038, pagemap = 0x7f3087a32480},
      {segment = 0x56134dfa2dd8, mapped_address = 0x7f309faf4000 "", header = 0x7f309faf4000, fpm = 0x7f309faf4038, pagemap = 0x7f309faf4480},
      {segment = 0x56134dfa1fb8, mapped_address = 0x7f30d62d3000 "", header = 0x7f30d62d3000, fpm = 0x7f30d62d3038, pagemap = 0x7f30d62d3480},
      {segment = 0x56134dfa1f68, mapped_address = 0x7f31365d5000 "", header = 0x7f31365d5000, fpm = 0x7f31365d5038, pagemap = 0x7f31365d5480},
      {segment = 0x56134ee12098, mapped_address = 0x7f306599e000 "", header = 0x7f306599e000, fpm = 0x7f306599e038, pagemap = 0x7f306599e480},
      {segment = 0x56134ee120e8, mapped_address = 0x7f305599e000 "", header = 0x7f305599e000, fpm = 0x7f305599e038, pagemap = 0x7f305599e480},
      {segment = 0x56134ee12138, mapped_address = 0x7f303599e000 "", header = 0x7f303599e000, fpm = 0x7f303599e038, pagemap = 0x7f303599e480},
      {segment = 0x56134ee12188, mapped_address = 0x7f301599e000 "", header = 0x7f301599e000, fpm = 0x7f301599e038, pagemap = 0x7f301599e480},
      {segment = 0x56134ee121d8, mapped_address = 0x7f2fd599e000 "", header = 0x7f2fd599e000, fpm = 0x7f2fd599e038, pagemap = 0x7f2fd599e480},
      {segment = 0x56134ee12228, mapped_address = 0x7f2f9599e000 "", header = 0x7f2f9599e000, fpm = 0x7f2f9599e038, pagemap = 0x7f2f9599e480},
      {segment = 0x56134ee12278, mapped_address = 0x7f2f1599e000 "", header = 0x7f2f1599e000, fpm = 0x7f2f1599e038, pagemap = 0x7f2f1599e480},
      {segment = 0x56134ee122c8, mapped_address = 0x7f2e9599e000 "", header = 0x7f2e9599e000, fpm = 0x7f2e9599e038, pagemap = 0x7f2e9599e480},
      {segment = 0x0, mapped_address = 0x0, header = 0x0, fpm = 0x0, pagemap = 0x0} <repeats 1000 times>},
      high_segment_index = 23, freed_segment_counter = 0}

    (gdb) frame 1
    (gdb) print *hashtable
    $2 = {nbuckets = 67108864, log2_nbuckets = 26, nbuckets_original = 67108864, nbuckets_optimal = 67108864, log2_nbuckets_optimal = 26,
      buckets = {unshared = 0x7f31f6cd8000, shared = 0x7f31f6cd8000}, keepNulls = false, skewEnabled = false, skewBucket = 0x0,
      skewBucketLen = 0, nSkewBuckets = 0, skewBucketNums = 0x0, nbatch = 1, curbatch = 0, nbatch_original = 1, nbatch_outstart = 1,
      growEnabled = true, totalTuples = 65785362, partialTuples = 5057580, skewTuples = 0, innerBatchFile = 0x0, outerBatchFile = 0x0,
      outer_hashfunctions = 0x56134e1e04b8, inner_hashfunctions = 0x56134e1e0508, hashStrict = 0x56134e1e0558, collations = 0x56134e1e0570,
      spaceUsed = 0, spaceAllowed = 13958643712, spacePeak = 0, spaceUsedSkew = 0, spaceAllowedSkew = 279172874,
      hashCxt = 0x56134e1e03a0, batchCxt = 0x56134e1e23b0, chunks = 0x0, current_chunk = 0x0, area = 0x56134e07d718,
      parallel_state = 0x7f321dfa4400, batches = 0x56134e1e07f8, current_chunk_shared = 0}

This is the code where the crash happened:
https://github.com/postgres/postgres/blob/8e5faba4b918ba6142339c8f55eaa4eb99776a03/src/backend/utils/mmgr/dsa.c#L835-L840

    /* Locate the object, span and pool. */
    segment_map = get_segment_by_index(area, DSA_EXTRACT_SEGMENT_NUMBER(dp));
    pageno = DSA_EXTRACT_OFFSET(dp) / FPM_PAGE_SIZE;
    span_pointer = segment_map->pagemap[pageno];
    span = dsa_get_address(area, span_pointer);
    superblock = dsa_get_address(area, span->start);

    (gdb) print *segment_map
    $4 = {segment = 0x56134dfa2dd8, mapped_address = 0x7f309faf4000 "", header = 0x7f309faf4000, fpm = 0x7f309faf4038, pagemap = 0x7f309faf4480}
    (gdb) print pageno
    $5 = 196979
    (gdb) print span_pointer
    $6 = 0

It looks like if `span_pointer` is 0, `span` is NULL, and dereferencing `span->start` causes the segfault.
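To make the arithmetic in the quoted `dsa_free` lines easier to follow, here is a minimal, self-contained sketch of how a `dsa_pointer` splits into a segment index and a page number, plus a guard that reports an empty pagemap slot instead of dereferencing it. It is an illustration under assumptions, not the actual PostgreSQL code: the 40-bit offset width and the 4096-byte page size are assumed to match the 64-bit build of `dsa.c`, and every `SKETCH_`/`sketch_` name is hypothetical. Since `dp` was optimized out in the core, its real value is unknown; segment index 13 is only inferred from the fact that the printed `segment_map` matches entry 13 of `area->segment_maps`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed to mirror the 64-bit dsa_pointer layout: the low 40 bits hold the
 * byte offset within a segment, the bits above hold the segment index.
 * All names here are hypothetical and exist only for this sketch.
 */
#define SKETCH_OFFSET_WIDTH   40
#define SKETCH_OFFSET_BITMASK ((((uint64_t) 1) << SKETCH_OFFSET_WIDTH) - 1)
#define SKETCH_PAGE_SIZE      4096	/* assumed value of FPM_PAGE_SIZE */

typedef uint64_t sketch_dsa_pointer;

/* Build a pointer from a segment index and a byte offset. */
static inline sketch_dsa_pointer
sketch_make_pointer(uint64_t segment, uint64_t offset)
{
	assert(offset <= SKETCH_OFFSET_BITMASK);
	return (segment << SKETCH_OFFSET_WIDTH) | offset;
}

/* Counterpart of DSA_EXTRACT_SEGMENT_NUMBER(dp). */
static inline uint64_t
sketch_segment_number(sketch_dsa_pointer dp)
{
	return dp >> SKETCH_OFFSET_WIDTH;
}

/* Counterpart of DSA_EXTRACT_OFFSET(dp) / FPM_PAGE_SIZE. */
static inline uint64_t
sketch_page_number(sketch_dsa_pointer dp)
{
	return (dp & SKETCH_OFFSET_BITMASK) / SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical defensive lookup: returns 1 if the pagemap slot for dp holds
 * a non-zero span pointer, 0 if the page is out of range or the slot is 0
 * (the situation seen in the core, where span_pointer == 0).
 */
static inline int
sketch_span_is_valid(const sketch_dsa_pointer *pagemap, size_t npages,
					 sketch_dsa_pointer dp)
{
	uint64_t	pageno = sketch_page_number(dp);

	if (pageno >= npages)
		return 0;
	return pagemap[pageno] != 0;
}
```

With these assumptions, any offset between 196979 * 4096 and 196980 * 4096 - 1 in segment 13 decodes to `pageno` 196979, which is the slot that reads back as 0 in the core.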
`span_pointer` is 0 because every `segment_map->pagemap` entry I checked is 0:

    (gdb) print segment_map->pagemap[0]
    $10 = 0
    (gdb) print segment_map->pagemap[1]
    $11 = 0
    (gdb) print segment_map->pagemap[2]
    $12 = 0
    (gdb) print segment_map->pagemap[265]
    $14 = 0
    (gdb) print segment_map->pagemap[187387]
    $15 = 0
    (gdb) print segment_map->pagemap[196979]
    $16 = 0

Regards,
Marcin Barczyński