Re: pg13.2: invalid memory alloc request size NNNN

From: Tomas Vondra
Subject: Re: pg13.2: invalid memory alloc request size NNNN
Msg-id: c27089c6-3a23-1769-6ec9-9012fef5d3b1@enterprisedb.com
In reply to: pg13.2: invalid memory alloc request size NNNN  (Justin Pryzby <pryzby@telsasoft.com>)
Responses: Re: pg13.2: invalid memory alloc request size NNNN  (Justin Pryzby <pryzby@telsasoft.com>)
Re: pg13.2: invalid memory alloc request size NNNN  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers

On 2/12/21 2:48 AM, Justin Pryzby wrote:
> ts=# \errverbose
> ERROR:  XX000: invalid memory alloc request size 18446744073709551613
> 
> #0  pg_re_throw () at elog.c:1716
> #1  0x0000000000a33b12 in errfinish (filename=0xbff20e "mcxt.c", lineno=959, funcname=0xbff2db <__func__.6684> "palloc") at elog.c:502
> #2  0x0000000000a6760d in palloc (size=18446744073709551613) at mcxt.c:959
> #3  0x00000000009fb149 in text_to_cstring (t=0x2aaae8023010) at varlena.c:212
> #4  0x00000000009fbf05 in textout (fcinfo=0x2094538) at varlena.c:557
> #5  0x00000000006bdd50 in ExecInterpExpr (state=0x2093990, econtext=0x20933d8, isnull=0x7fff5bf04a87) at execExprInterp.c:1112
> #6  0x00000000006d4f18 in ExecEvalExprSwitchContext (state=0x2093990, econtext=0x20933d8, isNull=0x7fff5bf04a87) at ../../../src/include/executor/executor.h:316
> #7  0x00000000006d4f81 in ExecProject (projInfo=0x2093988) at ../../../src/include/executor/executor.h:350
> #8  0x00000000006d5371 in ExecScan (node=0x20932c8, accessMtd=0x7082e0 <SeqNext>, recheckMtd=0x708385 <SeqRecheck>) at execScan.c:238
> #9  0x00000000007083c2 in ExecSeqScan (pstate=0x20932c8) at nodeSeqscan.c:112
> #10 0x00000000006d1b00 in ExecProcNodeInstr (node=0x20932c8) at execProcnode.c:466
> #11 0x00000000006e742c in ExecProcNode (node=0x20932c8) at ../../../src/include/executor/executor.h:248
> #12 0x00000000006e77de in ExecAppend (pstate=0x2089208) at nodeAppend.c:267
> #13 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2089208) at execProcnode.c:466
> #14 0x000000000070964f in ExecProcNode (node=0x2089208) at ../../../src/include/executor/executor.h:248
> #15 0x0000000000709795 in ExecSort (pstate=0x2088ff8) at nodeSort.c:108
> #16 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2088ff8) at execProcnode.c:466
> #17 0x00000000006d1ad1 in ExecProcNodeFirst (node=0x2088ff8) at execProcnode.c:450
> #18 0x00000000006dec36 in ExecProcNode (node=0x2088ff8) at ../../../src/include/executor/executor.h:248
> #19 0x00000000006df079 in fetch_input_tuple (aggstate=0x2088a20) at nodeAgg.c:589
> #20 0x00000000006e1fad in agg_retrieve_direct (aggstate=0x2088a20) at nodeAgg.c:2368
> #21 0x00000000006e1bfd in ExecAgg (pstate=0x2088a20) at nodeAgg.c:2183
> #22 0x00000000006d1b00 in ExecProcNodeInstr (node=0x2088a20) at execProcnode.c:466
> #23 0x00000000006d1ad1 in ExecProcNodeFirst (node=0x2088a20) at execProcnode.c:450
> #24 0x00000000006c6ffa in ExecProcNode (node=0x2088a20) at ../../../src/include/executor/executor.h:248
> #25 0x00000000006c966b in ExecutePlan (estate=0x2032f48, planstate=0x2088a20, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0xbb3400 <donothingDR>, execute_once=true) at execMain.c:1632
> 
> #3  0x00000000009fb149 in text_to_cstring (t=0x2aaae8023010) at varlena.c:212
> 212             result = (char *) palloc(len + 1);
> 
> (gdb) l
> 207             /* must cast away the const, unfortunately */
> 208             text       *tunpacked = pg_detoast_datum_packed(unconstify(text *, t));
> 209             int                     len = VARSIZE_ANY_EXHDR(tunpacked);
> 210             char       *result;
> 211
> 212             result = (char *) palloc(len + 1);
> 
> (gdb) p len
> $1 = -4
> 
> This VM had some issue early today and I killed the VM, causing PG to execute
> recovery.  I'm tentatively blaming that on zfs, so this could conceivably be a
> data error (although recovery supposedly would have resolved it).  I just
> checked and data_checksums=off.
> 

This looks very much like a corrupted varlena header: the length (-4) is 
clearly bogus, and it's what triggers the problem, because palloc(len + 1) 
then asks for -3 bytes, which wraps around to 18446744073709551613 
(0xFFFFFFFFFFFFFFFD) once it is converted to an unsigned 64-bit size.
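
To make the arithmetic concrete, here is a minimal sketch in C. It is not the PostgreSQL code: the sketch_* names are made up for illustration, and the header decoding is a simplified model of the plain 4-byte varlena header as laid out on a little-endian build (length in the upper 30 bits, 4-byte header subtracted to get the payload length). Short 1-byte headers and external TOAST pointers are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the plain 4-byte varlena header (hypothetical name). */
#define SKETCH_VARHDRSZ 4

/*
 * Simplified model of the payload length of a 4-byte-header varlena on a
 * little-endian build: the total size sits in the upper 30 bits of the
 * header word, and the header size is subtracted to get the payload length.
 */
static int
sketch_payload_len(uint32_t header_word)
{
    uint32_t total = (header_word >> 2) & 0x3FFFFFFF;

    return (int) total - SKETCH_VARHDRSZ;
}

/*
 * Mirrors the shape of "palloc(len + 1)": the int result of len + 1 is
 * converted to an unsigned size, so a negative value wraps around.
 */
static size_t
sketch_alloc_request(int len)
{
    return (size_t) (len + 1);
}

/* Small self-check tying the two steps together. */
static void
sketch_self_check(void)
{
    /* A sane header: total size 12 bytes, so 8 payload bytes, request 9. */
    assert(sketch_payload_len(12u << 2) == 8);
    assert(sketch_alloc_request(8) == 9);

    /* A bogus length of -4 turns into a request of SIZE_MAX - 2. */
    assert(sketch_alloc_request(-4) == SIZE_MAX - 2);
}
```

With a 64-bit size_t, SIZE_MAX - 2 is the 18446744073709551613 from the error message. Under this simplified model, one header word that decodes to -4 is an all-zero word (sketch_payload_len(0) returns -4). That is only an illustration of how such a length can arise, not a diagnosis of what happened to this table.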

This has to be a value stored in a table, not some intermediate value 
created during execution. So I don't think the exact query matters. Can 
you try doing something like pg_dump, which has to detoast everything?

The question is whether this is due to the VM getting killed in some 
strange way (what VM system is this, how is the storage mounted?) or 
whether the recovery is borked and failed to do the right thing.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


