crash 11.5~

From: Justin Pryzby
Date:
A daily report crashed repeatedly this morning running pg11.4.
I compiled 11.5 and reproduced it there too, so I'm including a backtrace
built with -O0.

I'm trying to dig further into it.  It seems to crash under load, but not
when I narrow it down to a single report; the reports run to completion
when run individually.

postmaster[6750]: segfault at 7f4a527545bc ip 00000000004834ae sp 00007ffd547e2760 error 4 in postgres[400000+6e6000]
postmaster[29786]: segfault at 7f49a968d000 ip 00007f49b24abb0f sp 00007ffd547e2268 error 4 in libc-2.12.so (deleted)[7f49b2422000+18b000]

Core was generated by `postgres: telsasoft ts 192.168.122.11(35454) SELECT                    '.
Program terminated with signal 11, Segmentation fault.

#0  0x00000000004877df in slot_deform_tuple (slot=0xe204638, natts=3) at heaptuple.c:1465
#1  0x0000000000487f5b in slot_getsomeattrs (slot=0xe204638, attnum=3) at heaptuple.c:1673
#2  0x00000000006d5aae in ExecInterpExpr (state=0xe257948, econtext=0xe204578, isnull=0x7ffff5f38077) at
execExprInterp.c:443
#3  0x00000000006ed221 in ExecEvalExprSwitchContext (state=0xe257948, econtext=0xe204578, isNull=0x7ffff5f38077) at
../../../src/include/executor/executor.h:313
#4  0x00000000006ed30e in ExecQual (state=0xe257948, econtext=0xe204578) at
../../../src/include/executor/executor.h:382
#5  0x00000000006ed5f4 in ExecScan (node=0x252edc8, accessMtd=0x71c54c <SeqNext>, recheckMtd=0x71c623 <SeqRecheck>) at
execScan.c:190
#6  0x000000000071c670 in ExecSeqScan (pstate=0x252edc8) at nodeSeqscan.c:129
#7  0x00000000006eb855 in ExecProcNodeInstr (node=0x252edc8) at execProcnode.c:461
#8  0x00000000006f6555 in ExecProcNode (node=0x252edc8) at ../../../src/include/executor/executor.h:247
#9  0x00000000006f6a05 in ExecAppend (pstate=0xe20e7b0) at nodeAppend.c:294
#10 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e7b0) at execProcnode.c:461
#11 0x0000000000708a96 in ExecProcNode (node=0xe20e7b0) at ../../../src/include/executor/executor.h:247
#12 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20e7b0, hjstate=0xe20e4d8, hashvalue=0x7ffff5f3827c)
    at nodeHashjoin.c:821
#13 0x0000000000708f8d in ExecHashJoinImpl (pstate=0xe20e4d8, parallel=false) at nodeHashjoin.c:355
#14 0x00000000007094ca in ExecHashJoin (pstate=0xe20e4d8) at nodeHashjoin.c:565
#15 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e4d8) at execProcnode.c:461
#16 0x000000000071d9da in ExecProcNode (node=0xe20e4d8) at ../../../src/include/executor/executor.h:247
#17 0x000000000071db26 in ExecSort (pstate=0xe20e3c0) at nodeSort.c:107
#18 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e3c0) at execProcnode.c:461
#19 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20e3c0) at execProcnode.c:445
#20 0x00000000006f74c5 in ExecProcNode (node=0xe20e3c0) at ../../../src/include/executor/executor.h:247
#21 0x00000000006f79bb in fetch_input_tuple (aggstate=0xe20df70) at nodeAgg.c:406
#22 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0xe20df70) at nodeAgg.c:1755
#23 0x00000000006f98fa in ExecAgg (pstate=0xe20df70) at nodeAgg.c:1570
#24 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20df70) at execProcnode.c:461
#25 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20df70) at execProcnode.c:445
#26 0x0000000000708a96 in ExecProcNode (node=0xe20df70) at ../../../src/include/executor/executor.h:247
#27 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20df70, hjstate=0x2778580, hashvalue=0x7ffff5f3865c)
    at nodeHashjoin.c:821
#28 0x0000000000708f8d in ExecHashJoinImpl (pstate=0x2778580, parallel=false) at nodeHashjoin.c:355
#29 0x00000000007094ca in ExecHashJoin (pstate=0x2778580) at nodeHashjoin.c:565
#30 0x00000000006eb855 in ExecProcNodeInstr (node=0x2778580) at execProcnode.c:461
#31 0x00000000006eb826 in ExecProcNodeFirst (node=0x2778580) at execProcnode.c:445
#32 0x0000000000702a9b in ExecProcNode (node=0x2778580) at ../../../src/include/executor/executor.h:247
#33 0x0000000000702d12 in MultiExecPrivateHash (node=0x27783a8) at nodeHash.c:164
#34 0x0000000000702c8c in MultiExecHash (node=0x27783a8) at nodeHash.c:114
#35 0x00000000006eb8f7 in MultiExecProcNode (node=0x27783a8) at execProcnode.c:501
#36 0x0000000000708e2a in ExecHashJoinImpl (pstate=0x2777210, parallel=false) at nodeHashjoin.c:290
#37 0x00000000007094ca in ExecHashJoin (pstate=0x2777210) at nodeHashjoin.c:565
#38 0x00000000006eb855 in ExecProcNodeInstr (node=0x2777210) at execProcnode.c:461
#39 0x00000000006eb826 in ExecProcNodeFirst (node=0x2777210) at execProcnode.c:445
#40 0x000000000071d9da in ExecProcNode (node=0x2777210) at ../../../src/include/executor/executor.h:247
#41 0x000000000071db26 in ExecSort (pstate=0x2776ce0) at nodeSort.c:107
#42 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ce0) at execProcnode.c:461
#43 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ce0) at execProcnode.c:445
#44 0x00000000006f74c5 in ExecProcNode (node=0x2776ce0) at ../../../src/include/executor/executor.h:247
#45 0x00000000006f79bb in fetch_input_tuple (aggstate=0x2776ec8) at nodeAgg.c:406
#46 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0x2776ec8) at nodeAgg.c:1755
#47 0x00000000006f98fa in ExecAgg (pstate=0x2776ec8) at nodeAgg.c:1570
#48 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ec8) at execProcnode.c:461
#49 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ec8) at execProcnode.c:445
#50 0x00000000006e0129 in ExecProcNode (node=0x2776ec8) at ../../../src/include/executor/executor.h:247
#51 0x00000000006e2abd in ExecutePlan (estate=0x27769d8, planstate=0x2776ec8, use_parallel_mode=false,
    operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0xa66bef0,
    execute_once=true) at execMain.c:1723
#52 0x00000000006e0710 in standard_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0,
    execute_once=true) at execMain.c:364
#53 0x00007fd37e757633 in pgss_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0,
    execute_once=true) at pg_stat_statements.c:892
#54 0x00007fd37e2d86ef in explain_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0,
    execute_once=true) at auto_explain.c:281
#55 0x00000000006e0511 in ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true)
    at execMain.c:305
#56 0x00000000008cfebe in PortalRunSelect (portal=0x21d4bf8, forward=true, count=0, dest=0xa66bef0) at pquery.c:932
#57 0x00000000008cfb4c in PortalRun (portal=0x21d4bf8, count=9223372036854775807, isTopLevel=true, run_once=true,
    dest=0xa66bef0, altdest=0xa66bef0, completionTag=0x7ffff5f39100 "") at pquery.c:773
#58 0x00000000008c9b22 in exec_simple_query (query_string=0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS
    site_gran,\n\tsite_location||'.'||sect_num AS bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS
    sitename,\n\tperiod AS start_time,\n\t(maybe_div(sum(rrc_su"...) at postgres.c:1145
#59 0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218fef8, dbname=0x215da48 "ts", username=0x218fed0 "telsasoft")
    at postgres.c:4182
#60 0x000000000082a098 in BackendRun (port=0x21867d0) at postmaster.c:4358
#61 0x0000000000829806 in BackendStartup (port=0x21867d0) at postmaster.c:4030
#62 0x0000000000825cab in ServerLoop () at postmaster.c:1707
#63 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0) at postmaster.c:1380
#64 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228
(gdb) 

#0  0x00000000004877df in slot_deform_tuple (slot=0xe204638, natts=3) at heaptuple.c:1465
        thisatt = 0x5265db0
        tuple = 0xe838d40
        tupleDesc = 0x5265d20
        values = 0xe204698
        isnull = 0xe2057e8
        tup = 0x7fd3eb1b8058
        hasnulls = false
        attnum = 1
        tp = 0x7fd3eb1b80bd <Address 0x7fd3eb1b80bd out of bounds>
        off = 583369983
        bp = 0x7fd3eb1b806f <Address 0x7fd3eb1b806f out of bounds>
        slow = true
#1  0x0000000000487f5b in slot_getsomeattrs (slot=0xe204638, attnum=3) at heaptuple.c:1673
        tuple = 0xe838d40
        attno = 3
        __func__ = "slot_getsomeattrs"
#2  0x00000000006d5aae in ExecInterpExpr (state=0xe257948, econtext=0xe204578, isnull=0x7ffff5f38077) at
execExprInterp.c:443
        op = 0xe257530
        resultslot = 0x0
        innerslot = 0x0
        outerslot = 0x0
        scanslot = 0xe204638
        dispatch_table = {0x6d5a0d, 0x6d5a32, 0x6d5a61, 0x6d5a93, 0x6d5ac5, 0x6d5b74, 0x6d5c23, 0x6d5cd2, 0x6d5da3,
          0x6d5e74, 0x6d5f45, 0x6d5f7c, 0x6d6056, 0x6d6130, 0x6d620a, 0x6d627e, 0x6d6347, 0x6d638f, 0x6d640e, 0x6d64ec,
          0x6d6523, 0x6d655a, 0x6d6568, 0x6d65e9, 0x6d6657, 0x6d6665, 0x6d66e6, 0x6d6754, 0x6d6790, 0x6d6821, 0x6d6856,
          0x6d68b4, 0x6d6915, 0x6d6986, 0x6d69ca, 0x6d6a14, 0x6d6a4b, 0x6d6a82, 0x6d6acb, 0x6d6b3b, 0x6d6bab, 0x6d6bf4,
          0x6d6c2b, 0x6d6c62, 0x6d6ca1, 0x6d6dc3, 0x6d6e2e, 0x6d704c, 0x6d7161, 0x6d726f, 0x6d7376, 0x6d73a6, 0x6d73d6,
          0x6d7406, 0x6d7436, 0x6d746d, 0x6d749d, 0x6d7625, 0x6d7724, 0x6d7754, 0x6d778b, 0x6d77c2, 0x6d77f9, 0x6d7862,
          0x6d7892, 0x6d78c2, 0x6d6d32, 0x6d7959, 0x6d7989, 0x6d78f2, 0x6d7929, 0x6d79b9, 0x6d79e9, 0x6d7a98, 0x6d7ac8,
          0x6d7b77, 0x6d7bae, 0x6d7be5, 0x6d7c40, 0x6d7cf5, 0x6d7d9a, 0x6d7ead, 0x6d7f52, 0x6d80b4, 0x6d8261, 0x6d8298,
          0x6d82cf}
#3  0x00000000006ed221 in ExecEvalExprSwitchContext (state=0xe257948, econtext=0xe204578, isNull=0x7ffff5f38077) at
../../../src/include/executor/executor.h:313
        retDatum = 140737319764160
        oldContext = 0x27768c0
#4  0x00000000006ed30e in ExecQual (state=0xe257948, econtext=0xe204578) at
../../../src/include/executor/executor.h:382
        ret = 3018996923
        isnull = false
#5  0x00000000006ed5f4 in ExecScan (node=0x252edc8, accessMtd=0x71c54c <SeqNext>, recheckMtd=0x71c623 <SeqRecheck>) at
execScan.c:190
        slot = 0xe204638
        econtext = 0xe204578
        qual = 0xe257948
        projInfo = 0xe257498
#6  0x000000000071c670 in ExecSeqScan (pstate=0x252edc8) at nodeSeqscan.c:129
        node = 0x252edc8
#7  0x00000000006eb855 in ExecProcNodeInstr (node=0x252edc8) at execProcnode.c:461
        result = 0x702ace
#8  0x00000000006f6555 in ExecProcNode (node=0x252edc8) at ../../../src/include/executor/executor.h:247
No locals.
#9  0x00000000006f6a05 in ExecAppend (pstate=0xe20e7b0) at nodeAppend.c:294
        subnode = 0x252edc8
        result = 0xe258508
        node = 0xe20e7b0
#10 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e7b0) at execProcnode.c:461
        result = 0x20e257058
#11 0x0000000000708a96 in ExecProcNode (node=0xe20e7b0) at ../../../src/include/executor/executor.h:247
No locals.
#12 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20e7b0, hjstate=0xe20e4d8, hashvalue=0x7ffff5f3827c)
    at nodeHashjoin.c:821
        hashtable = 0xe8387c8
        curbatch = 0
        slot = 0x0
#13 0x0000000000708f8d in ExecHashJoinImpl (pstate=0xe20e4d8, parallel=false) at nodeHashjoin.c:355
        node = 0xe20e4d8
        outerNode = 0xe20e7b0
        hashNode = 0xe258720
        joinqual = 0x0
        otherqual = 0x0
        econtext = 0xe20e6f0
        hashtable = 0xe8387c8
        outerTupleSlot = 0xe257058
        hashvalue = 3018996923
        batchno = 0
        parallel_state = 0x0
        __func__ = "ExecHashJoinImpl"
#14 0x00000000007094ca in ExecHashJoin (pstate=0xe20e4d8) at nodeHashjoin.c:565
No locals.
#15 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e4d8) at execProcnode.c:461
        result = 0x20c4
#16 0x000000000071d9da in ExecProcNode (node=0xe20e4d8) at ../../../src/include/executor/executor.h:247
No locals.
#17 0x000000000071db26 in ExecSort (pstate=0xe20e3c0) at nodeSort.c:107
        plannode = 0xa6309c0
        outerNode = 0xe20e4d8
        tupDesc = 0xe2598a0
        node = 0xe20e3c0
        estate = 0x27769d8
        dir = ForwardScanDirection
        tuplesortstate = 0x2766998
        slot = 0xe25b8b8
#18 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20e3c0) at execProcnode.c:461
        result = 0x4785d0
#19 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20e3c0) at execProcnode.c:445
No locals.
#20 0x00000000006f74c5 in ExecProcNode (node=0xe20e3c0) at ../../../src/include/executor/executor.h:247
No locals.
#21 0x00000000006f79bb in fetch_input_tuple (aggstate=0xe20df70) at nodeAgg.c:406
        slot = 0xe20e240
#22 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0xe20df70) at nodeAgg.c:1755
        node = 0xa6315d0
        econtext = 0xe20e300
        tmpcontext = 0xe20e1a8
        peragg = 0x252e198
        pergroups = 0xe7cdd40
        outerslot = 0x7ffff5f38520
        firstSlot = 0xe263e70
        result = 0x0
        hasGroupingSets = false
        numGroupingSets = 1
        currentSet = 222632768
        nextSetSize = 0
        numReset = 1
        i = 1
#23 0x00000000006f98fa in ExecAgg (pstate=0xe20df70) at nodeAgg.c:1570
        node = 0xe20df70
        result = 0x0
#24 0x00000000006eb855 in ExecProcNodeInstr (node=0xe20df70) at execProcnode.c:461
        result = 0x4785d0
#25 0x00000000006eb826 in ExecProcNodeFirst (node=0xe20df70) at execProcnode.c:445
No locals.
#26 0x0000000000708a96 in ExecProcNode (node=0xe20df70) at ../../../src/include/executor/executor.h:247
No locals.
#27 0x0000000000709b0d in ExecHashJoinOuterGetTuple (outerNode=0xe20df70, hjstate=0x2778580, hashvalue=0x7ffff5f3865c)
    at nodeHashjoin.c:821
#28 0x0000000000708f8d in ExecHashJoinImpl (pstate=0x2778580, parallel=false) at nodeHashjoin.c:355
        node = 0x2778580
        outerNode = 0xe20df70
        hashNode = 0xe7e2560
        joinqual = 0x0
        otherqual = 0x0
        econtext = 0x2778798
        hashtable = 0xe837df0
        outerTupleSlot = 0x4040
        hashvalue = 0
        batchno = 10926240
        parallel_state = 0x0
        __func__ = "ExecHashJoinImpl"
#29 0x00000000007094ca in ExecHashJoin (pstate=0x2778580) at nodeHashjoin.c:565
No locals.
#30 0x00000000006eb855 in ExecProcNodeInstr (node=0x2778580) at execProcnode.c:461
        result = 0x4785d0
#31 0x00000000006eb826 in ExecProcNodeFirst (node=0x2778580) at execProcnode.c:445
No locals.
#32 0x0000000000702a9b in ExecProcNode (node=0x2778580) at ../../../src/include/executor/executor.h:247
No locals.
#33 0x0000000000702d12 in MultiExecPrivateHash (node=0x27783a8) at nodeHash.c:164
        outerNode = 0x2778580
        hashkeys = 0xe7ebf78
        hashtable = 0xe837cd8
        slot = 0x80000000001
        econtext = 0x27784c0
        hashvalue = 0
#34 0x0000000000702c8c in MultiExecHash (node=0x27783a8) at nodeHash.c:114
No locals.
#35 0x00000000006eb8f7 in MultiExecProcNode (node=0x27783a8) at execProcnode.c:501
        result = 0x4
        __func__ = "MultiExecProcNode"
#36 0x0000000000708e2a in ExecHashJoinImpl (pstate=0x2777210, parallel=false) at nodeHashjoin.c:290
        node = 0x2777210
        outerNode = 0x27774e8
        hashNode = 0x27783a8
        joinqual = 0x0
        otherqual = 0x0
        econtext = 0x2777428
        hashtable = 0xe837cd8
        outerTupleSlot = 0x7ffff5f388ac
        hashvalue = 0
        batchno = 0
        parallel_state = 0x0
        __func__ = "ExecHashJoinImpl"
#37 0x00000000007094ca in ExecHashJoin (pstate=0x2777210) at nodeHashjoin.c:565
No locals.
#38 0x00000000006eb855 in ExecProcNodeInstr (node=0x2777210) at execProcnode.c:461
        result = 0x4785d0
#39 0x00000000006eb826 in ExecProcNodeFirst (node=0x2777210) at execProcnode.c:445
No locals.
#40 0x000000000071d9da in ExecProcNode (node=0x2777210) at ../../../src/include/executor/executor.h:247
No locals.
#41 0x000000000071db26 in ExecSort (pstate=0x2776ce0) at nodeSort.c:107
        plannode = 0xa632c18
        outerNode = 0x2777210
        tupDesc = 0xe7e6980
        node = 0x2776ce0
        estate = 0x27769d8
        dir = ForwardScanDirection
        tuplesortstate = 0x24feec8
        slot = 0x219bfa0
#42 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ce0) at execProcnode.c:461
        result = 0x4785d0
#43 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ce0) at execProcnode.c:445
No locals.
#44 0x00000000006f74c5 in ExecProcNode (node=0x2776ce0) at ../../../src/include/executor/executor.h:247
No locals.
#45 0x00000000006f79bb in fetch_input_tuple (aggstate=0x2776ec8) at nodeAgg.c:406
        slot = 0x2776df8
#46 0x00000000006f9d11 in agg_retrieve_direct (aggstate=0x2776ec8) at nodeAgg.c:1755
        node = 0xa633eb0
        econtext = 0x2777150
        tmpcontext = 0x2776bf0
        peragg = 0xe824520
        pergroups = 0xe832e88
        outerslot = 0x7ffff5f38b40
        firstSlot = 0xe81f6b8
        result = 0x7ffff5f38b10
        hasGroupingSets = false
        numGroupingSets = 1
        currentSet = 243398904
        nextSetSize = 0
        numReset = 1
        i = 1
#47 0x00000000006f98fa in ExecAgg (pstate=0x2776ec8) at nodeAgg.c:1570
        node = 0x2776ec8
        result = 0x0
#48 0x00000000006eb855 in ExecProcNodeInstr (node=0x2776ec8) at execProcnode.c:461
        result = 0x4785d0
#49 0x00000000006eb826 in ExecProcNodeFirst (node=0x2776ec8) at execProcnode.c:445
No locals.
#50 0x00000000006e0129 in ExecProcNode (node=0x2776ec8) at ../../../src/include/executor/executor.h:247
No locals.
#51 0x00000000006e2abd in ExecutePlan (estate=0x27769d8, planstate=0x2776ec8, use_parallel_mode=false,
    operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0xa66bef0,
    execute_once=true) at execMain.c:1723
        slot = 0x150
        current_tuple_count = 0
#52 0x00000000006e0710 in standard_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0,
    execute_once=true) at execMain.c:364
        estate = 0x27769d8
        operation = CMD_SELECT
        dest = 0xa66bef0
        sendTuples = true
        oldcontext = 0xdeb5e20
        __func__ = "standard_ExecutorRun"
#53 0x00007fd37e757633 in pgss_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0,
    execute_once=true) at pg_stat_statements.c:892
        save_exception_stack = 0x7ffff5f38dd0
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {233529144, 9176610699338068304, 4687312, 140737319770032, 0, 0,
          9176610699377914192, 9151536278965596496}, __mask_was_saved = 0, __saved_mask = {__val = {4294967302, 243497128,
          4295032832, 140737319767408, 7255973, 140737319767408, 68726688595, 41380312, 174276272, 174278840, 41381576, 0,
          41381576, 16, 174278840, 140737319767696}}}}
#54 0x00007fd37e2d86ef in explain_ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0,
    execute_once=true) at auto_explain.c:281
        save_exception_stack = 0x7ffff5f38fa0
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {233529144, 9176610699451314512, 4687312, 140737319770032, 0, 0,
          9176610699340165456, 9151535941362266448}, __mask_was_saved = 0, __saved_mask = {__val = {10926707, 24, 41371824,
          41371824, 238009728, 41372168, 8600905475, 35241888, 336, 243497664, 41380032, 243498000, 233528864, 336,
          174505712, 140737319767760}}}}
#55 0x00000000006e0511 in ExecutorRun (queryDesc=0xdeb5f38, direction=ForwardScanDirection, count=0, execute_once=true)
    at execMain.c:305
No locals.
#56 0x00000000008cfebe in PortalRunSelect (portal=0x21d4bf8, forward=true, count=0, dest=0xa66bef0) at pquery.c:932
        queryDesc = 0xdeb5f38
        direction = ForwardScanDirection
        nprocessed = 233528864
        __func__ = "PortalRunSelect"
#57 0x00000000008cfb4c in PortalRun (portal=0x21d4bf8, count=9223372036854775807, isTopLevel=true, run_once=true,
    dest=0xa66bef0, altdest=0xa66bef0, completionTag=0x7ffff5f39100 "") at pquery.c:773
        save_exception_stack = 0x7ffff5f39240
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {174505712, 9176610698715214160, 4687312, 140737319770032, 0, 0,
          9176610699432440144, -9176589804060397232}, __mask_was_saved = 0, __saved_mask = {__val = {233529080, 2405784,
          21306251184, 128, 13, 616, 174505656, 140737319768192, 10970883, 480, 112, 174505712, 34999312, 174505824,
          233528864, 112}}}}
        result = false
        nprocessed = 4687312
        saveTopTransactionResourceOwner = 0x2197a20
        saveTopTransactionContext = 0x27748b0
        saveActivePortal = 0x0
        saveResourceOwner = 0x2197a20
        savePortalContext = 0x0
        saveMemoryContext = 0x27748b0
        __func__ = "PortalRun"
#58 0x00000000008c9b22 in exec_simple_query (query_string=0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS
    site_gran,\n\tsite_location||'.'||sect_num AS bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS
    sitename,\n\tperiod AS start_time,\n\t(maybe_div(sum(rrc_su"...) at postgres.c:1145
        parsetree = 0xd4f1c38
        portal = 0x21d4bf8
        snapshot_set = true
        commandTag = 0xc5ed4d "SELECT"
        completionTag =
"\000\000\000\000\002\000\000\000\n\000\000\000\000\000\000\000P\221\363\365\377\177\000\000\017s\245\000\000\000\000\000\036\020\000\000\000\000\000\000\000\f\026\002\030\020\000\000\200\275/\016\000\000\000\000(\r\026\002\006\000\000"
        querytree_list = 0xe230d58
        plantree_list = 0xa66beb8
        receiver = 0xa66bef0
        format = 0
        dest = DestRemote
        oldcontext = 0x27748b0
        parsetree_list = 0xd4f1c98
        parsetree_item = 0xd4f1c70
        save_log_statement_stats = false
        was_logged = false
        use_implicit_block = false
        msec_str =
"\220\221\363\365\377\177\000\000&C\245\000\000\000\000\000\006\000\000\000\030\020\000\000\200\275/\016\000\000\000"
        __func__ = "exec_simple_query"
#59 0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218fef8, dbname=0x215da48 "ts", username=0x218fed0 "telsasoft")
    at postgres.c:4182
        query_string = 0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS site_gran,\n\tsite_location||'.'||sect_num AS
          bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS sitename,\n\tperiod AS
          start_time,\n\t(maybe_div(sum(rrc_su"...
        firstchar = 81
        input_message = {data = 0xe2fbd80 "--BEGIN SQL\nSELECT site_office AS site_gran,\n\tsite_location||'.'||sect_num
          AS bs,\n\tgsm_site_name_to_sect_name(site_name, sect_num, sect_name) AS sitename,\n\tperiod AS
          start_time,\n\t(maybe_div(sum(rrc_su"..., len = 4121, maxlen = 8192, cursor = 4121}
        local_sigjmp_buf = {{__jmpbuf = {35192568, 9176610698734088528, 4687312, 140737319770032, 0, 0,
          9176610698788614480, -9176589805133483696}, __mask_was_saved = 1, __saved_mask = {__val = {0, 0, 4494597, 0, 0, 0,
          10923965, 1024, 35138464, 140737319768848, 10928134, 35138464, 34999312, 34999312, 35138440, 8}}}}
        send_ready_for_query = false
        disable_idle_in_transaction_timeout = false
        __func__ = "PostgresMain"
#60 0x000000000082a098 in BackendRun (port=0x21867d0) at postmaster.c:4358
        av = 0x218fef8
        maxac = 2
        ac = 1
        secs = 618513625
        usecs = 637903
        i = 1
        __func__ = "BackendRun"
#61 0x0000000000829806 in BackendStartup (port=0x21867d0) at postmaster.c:4030
        bn = 0x2185ef0
        pid = 0
        __func__ = "BackendStartup"
#62 0x0000000000825cab in ServerLoop () at postmaster.c:1707
        port = 0x21867d0
        i = 0
        rmask = {fds_bits = {8, 0 <repeats 15 times>}}
        selres = 1
        now = 1565198425
        readmask = {fds_bits = {56, 0 <repeats 15 times>}}
        nSockets = 6
        last_lockfile_recheck_time = 1565198368
        last_touch_time = 1565197325
        __func__ = "ServerLoop"
#63 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0) at postmaster.c:1380
        opt = -1
        status = 0
        userDoption = 0x217d500 "/var/lib/pgsql/11/data"
        listen_addr_saved = true
        i = 64
        output_config_variable = 0x0
        __func__ = "PostmasterMain"
#64 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228
        do_check_root = true



Re: crash 11.5~ (and 11.4)

From: Justin Pryzby
Date:
I checked that this still happens with max_parallel_workers_per_gather=0.
Now I've also reproduced it using SELECT * FROM that table:

(gdb) p thisatt->attrelid
$4 = 2015128626

ts=# SELECT 2015128626::regclass;
regclass | child.huawei_umts_ucell_201908

(gdb) p thisatt->attnum 
$1 = 2
(gdb) p attnum # For earlier crash this is 1....
$2 = 1

ts=# \dt+ child.huawei_umts_ucell_201908
 child  | huawei_umts_ucell_201908 | table | telsasoft | 612 MB | 

\d+ child.huawei_umts_ucell_201908
 bsc6900ucell                               | text                        |           | not null |         | extended |              | 
 ne_name                                    | text                        |           | not null |         | extended |              | 
[...]
Partition of: huawei_umts_ucell_metrics FOR VALUES FROM ('2019-08-01 00:00:00') TO ('2019-09-01 00:00:00')
Partition constraint: ((start_time IS NOT NULL) AND (start_time >= '2019-08-01 00:00:00'::timestamp without time zone)
    AND (start_time < '2019-09-01 00:00:00'::timestamp without time zone))
Indexes:
    "huawei_umts_ucell_201908_unique_idx" UNIQUE, btree (start_time, device_id, bsc6900ucell, ne_name,
interval_seconds)
    "huawei_umts_ucell_201908_idx" brin (start_time) WITH (autosummarize='true')
    "huawei_umts_ucell_201908_site_idx" btree (site_id)
Check constraints:
    "huawei_umts_ucell_201908_start_time_check" CHECK (start_time >= '2019-08-01 00:00:00'::timestamp without time zone
        AND start_time < '2019-09-01 00:00:00'::timestamp without time zone)
Statistics objects:
    "child"."huawei_umts_ucell_201908" (ndistinct) ON bsc6900ucell, ne_name, start_time, interval_seconds, device_id
        FROM child.huawei_umts_ucell_201908
Options: autovacuum_analyze_threshold=2, autovacuum_analyze_scale_factor=0.005

ts=# SELECT COUNT(1) FROM pg_attribute WHERE attrelid='child.huawei_umts_ucell_201908'::regclass;
count | 560

ts=# SELECT * FROM pg_attribute WHERE attrelid='child.huawei_umts_ucell_201908'::regclass AND attnum>=0 ORDER BY
attnum;

attrelid      | 2015128626
attname       | bsc6900ucell
atttypid      | 25
attstattarget | -1
attlen        | -1
attnum        | 1
attndims      | 0
attcacheoff   | -1
atttypmod     | -1
attbyval      | f
attstorage    | x
attalign      | i
attnotnull    | t
atthasdef     | f
atthasmissing | f
attidentity   | 
attisdropped  | f
attislocal    | f
attinhcount   | 1
attcollation  | 100
attacl        | 
attoptions    | 
attfdwoptions | 
attmissingval | 

The only interesting thing about the parent/siblings is that the previous month
was partitioned with "daily" granularity.  I adjusted our threshold for that,
so that August is partitioned monthly:

    [...]
    child.huawei_umts_ucell_20190730 FOR VALUES FROM ('2019-07-30 00:00:00') TO ('2019-07-31 00:00:00'),
    child.huawei_umts_ucell_20190731 FOR VALUES FROM ('2019-07-31 00:00:00') TO ('2019-08-01 00:00:00'),
    child.huawei_umts_ucell_201908 FOR VALUES FROM ('2019-08-01 00:00:00') TO ('2019-09-01 00:00:00')

Program terminated with signal 11, Segmentation fault.
#0  0x00000000004877df in slot_deform_tuple (slot=0x22b5860, natts=554) at heaptuple.c:1465
1465                                    off = att_align_pointer(off, thisatt->attalign, -1,

(gdb) bt
#0  0x00000000004877df in slot_deform_tuple (slot=0x22b5860, natts=554) at heaptuple.c:1465
#1  0x0000000000487e4b in slot_getallattrs (slot=0x22b5860) at heaptuple.c:1626
#2  0x000000000048aab9 in printtup (slot=0x22b5860, self=0x2161b28) at printtup.c:392
#3  0x00000000008d0346 in RunFromStore (portal=0x21d5638, direction=ForwardScanDirection, count=0, dest=0x2161b28) at
pquery.c:1106
#4  0x00000000008cfe88 in PortalRunSelect (portal=0x21d5638, forward=true, count=0, dest=0x2161b28) at pquery.c:928
#5  0x00000000008cfb4c in PortalRun (portal=0x21d5638, count=9223372036854775807, isTopLevel=true, run_once=true,
    dest=0x2161b28, altdest=0x2161b28, completionTag=0x7ffff5f39100 "") at pquery.c:773
#6  0x00000000008c9b22 in exec_simple_query (query_string=0x2160d28 "FETCH FORWARD 999 FROM _psql_cursor") at
postgres.c:1145
#7  0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218bfb0, dbname=0x218beb0 "ts", username=0x218be90 "pryzbyj") at
postgres.c:4182
#8  0x000000000082a098 in BackendRun (port=0x2181ac0) at postmaster.c:4358
#9  0x0000000000829806 in BackendStartup (port=0x2181ac0) at postmaster.c:4030
#10 0x0000000000825cab in ServerLoop () at postmaster.c:1707
#11 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0) at postmaster.c:1380
#12 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228

#0  0x00000000004877df in slot_deform_tuple (slot=0x22b5860, natts=554) at heaptuple.c:1465
        thisatt = 0x22dec50
        tuple = 0x22b58a0
        tupleDesc = 0x22debc0
        values = 0x22b58c0
        isnull = 0x22b6a10
        tup = 0x263f9a0
        hasnulls = false
        attnum = 1
        tp = 0x263fa05
"\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
        off = 583369983
        bp = 0x263f9b7
"\270\027$U}\232\246\235\004\255\331\033\006Qp\376E\316h\376\363\247\366Նgy7\311E\224~F\274\023ϋ%\216,\221\331@\024\363\233\070\275\004\254L\217t\262X\227\352\346\347\371\070\321ш\221\350fc\316\r\356\351h\275\213\230\360\203\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
        slow = true
#1  0x0000000000487e4b in slot_getallattrs (slot=0x22b5860) at heaptuple.c:1626
        tdesc_natts = 554
        attnum = 554
        tuple = 0x22b58a0
        __func__ = "slot_getallattrs"

#2  0x000000000048aab9 in printtup (slot=0x22b5860, self=0x2161b28) at printtup.c:392
        typeinfo = 0x22debc0
        myState = 0x2161b28
        oldcontext = 0xa8e29a
        buf = 0x2161b78
        natts = 554
        i = 0
#3  0x00000000008d0346 in RunFromStore (portal=0x21d5638, direction=ForwardScanDirection, count=0, dest=0x2161b28) at
pquery.c:1106
        oldcontext = 0x21931a0
        ok = true
        forward = true
        current_tuple_count = 879
        slot = 0x22b5860
#4  0x00000000008cfe88 in PortalRunSelect (portal=0x21d5638, forward=true, count=0, dest=0x2161b28) at pquery.c:928
        queryDesc = 0x0
        direction = ForwardScanDirection
        nprocessed = 35205816
        __func__ = "PortalRunSelect"
#5  0x00000000008cfb4c in PortalRun (portal=0x21d5638, count=9223372036854775807, isTopLevel=true, run_once=true,
    dest=0x2161b28, altdest=0x2161b28, completionTag=0x7ffff5f39100 "") at pquery.c:773
        save_exception_stack = 0x7ffff5f39240
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {35003176, 9176610698715214160, 4687312, 140737319770032, 0, 0,
          9176610699432440144, -9176589804060397232}, __mask_was_saved = 0, __saved_mask = {__val = {36636288, 4352,
          35002176, 1112, 1176, 0, 35003120, 140737319768192, 10970883, 35488536, 112, 35003176, 34999312, 35003288,
          35205536, 112}}}}
        result = false
        nprocessed = 4687312
        saveTopTransactionResourceOwner = 0x2187b88
        saveTopTransactionContext = 0x21f5d90
        saveActivePortal = 0x0
        saveResourceOwner = 0x2187b88
        savePortalContext = 0x0
        saveMemoryContext = 0x21f5d90
        __func__ = "PortalRun"
#6  0x00000000008c9b22 in exec_simple_query (query_string=0x2160d28 "FETCH FORWARD 999 FROM _psql_cursor") at postgres.c:1145
[...]





Re: crash 11.5~ (and 11.4)

From:
Justin Pryzby
Date:
Just found this, although I'm not sure what to do about it.  If it's corrupt
table data, I can restore from backup.

ts=# VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;
INFO:  00000: aggressively vacuuming "child.huawei_umts_ucell_201908"
LOCATION:  lazy_scan_heap, vacuumlazy.c:502
ERROR:  XX001: found xmin 73850277 from before relfrozenxid 111408920
LOCATION:  heap_prepare_freeze_tuple, heapam.c:6853

I confirmed I updated to 11.4 immediately on its release:
[pryzbyj@database ~]$ ls -ltc /usr/pgsql-11/bin/postgres
-rwxr-xr-x. 1 root root 7291736 Jun 20 07:09 /usr/pgsql-11/bin/postgres

That table would've been created shortly after midnight on Aug 1 when we loaded
first data for the month.  So it was created and processed only on pg11.4,
although the parent has probably been around since pg_upgrade last October.

Here's a usable-looking backtrace:

#0  heap_prepare_freeze_tuple (tuple=0x7fd3d8fa0058, relfrozenxid=111408920, 
    relminmxid=1846, cutoff_xid=111658731, cutoff_multi=1846, frz=0x223e930, 
    totally_frozen_p=0x7ffff5f3848f) at heapam.c:6832
        changed = 211
        xmax_already_frozen = 127
        xmin_frozen = false
        freeze_xmax = false
        xid = 0
        __func__ = "heap_prepare_freeze_tuple"

#1  0x00000000006c8e6e in lazy_scan_heap (onerel=0x7fd37e4f8fa0, options=13, 
    vacrelstats=0x223e790, Irel=0x223e8d8, nindexes=3, aggressive=true)
    at vacuumlazy.c:1151
        tuple_totally_frozen = false
        itemid = 0x7fd3d8f9e918
        buf = 168313
        page = 0x7fd3d8f9e900 "\020:"
        offnum = 1
        maxoff = 3
        hastup = true
        nfrozen = 0
        freespace = 35907664
        all_frozen = true
        tupgone = false
        prev_dead_count = 0
        all_visible_according_to_vm = true
        all_visible = true
        has_dead_tuples = false
        visibility_cutoff_xid = 73850277
        nblocks = 75494
        blkno = 34915
        tuple = {t_len = 2212, t_self = {ip_blkid = {bi_hi = 0, 
              bi_lo = 34915}, ip_posid = 1}, t_tableOid = 2015128626, 
          t_data = 0x7fd3d8fa0058}
        relname = 0x7fd37e4f6ac8 "huawei_umts_ucell_201908"
        relfrozenxid = 111408920
        relminmxid = 1846
        empty_pages = 0
        vacuumed_pages = 0
        next_fsm_block_to_vacuum = 0
        num_tuples = 1
        live_tuples = 1
        tups_vacuumed = 0
        nkeep = 0
        nunused = 0
        indstats = 0x223e850
        i = 32512
        ru0 = {tv = {tv_sec = 1565204981, tv_usec = 609217}, ru = {ru_utime = {
              tv_sec = 13, tv_usec = 865892}, ru_stime = {tv_sec = 1, 
              tv_usec = 960701}, ru_maxrss = 136988, ru_ixrss = 0, 
            ru_idrss = 0, ru_isrss = 0, ru_minflt = 48841, ru_majflt = 1, 
            ru_nswap = 0, ru_inblock = 196152, ru_oublock = 292656, 
            ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 849, 
            ru_nivcsw = 6002}}
        vmbuffer = 4267
        next_unskippable_block = 34916
        skipping_blocks = false
        frozen = 0x223e930
        buf = {data = 0x0, len = -168590068, maxlen = 32767, 
          cursor = -168589904}
        initprog_index = {0, 1, 5}
        initprog_val = {1, 75494, 21968754}
        __func__ = "lazy_scan_heap"
#2  0x00000000006c77c7 in lazy_vacuum_rel (onerel=0x7fd37e4f8fa0, options=13, 
    params=0x7ffff5f38a70, bstrategy=0x22b8cf8) at vacuumlazy.c:265
        vacrelstats = 0x223e790
        Irel = 0x223e8d8
        nindexes = 3
        ru0 = {tv = {tv_sec = 140733193388036, tv_usec = 281474985219646}, 
          ru = {ru_utime = {tv_sec = 12933152, tv_usec = 35123496}, 
            ru_stime = {tv_sec = 140737319765996, tv_usec = 4687312}, 
            ru_maxrss = 0, ru_ixrss = -1887226286461968708, 
            ru_idrss = 35351240, ru_isrss = 35249392, 
            ru_minflt = 140737319765776, ru_majflt = 35249392, ru_nswap = 20, 
            ru_inblock = 0, ru_oublock = 140737319765936, 
            ru_msgsnd = 10990471, ru_msgrcv = 140546333970336, 
            ru_nsignals = 35249264, ru_nvcsw = 4, ru_nivcsw = 4687312}}
        starttime = 0
        secs = 72057594037927936
        usecs = 2015128626
        read_rate = 6.9533474784178538e-310
        write_rate = 6.9439115263673584e-310
        aggressive = true
        scanned_all_unfrozen = false
        xidFullScanLimit = 111658735
        mxactFullScanLimit = 1846
        new_rel_pages = 0
        new_rel_allvisible = 36424984
        new_live_tuples = 6.9533474784194348e-310
        new_frozen_xid = 9080197
        new_min_multi = 0
        __func__ = "lazy_vacuum_rel"
#3  0x00000000006c72b3 in vacuum_rel (relid=2015128626, relation=0x2161820, 
    options=13, params=0x7ffff5f38a70) at vacuum.c:1557
        lmode = 4
        onerel = 0x7fd37e4f8fa0
        onerelid = {relId = 2015128626, dbId = 16564}
        toast_relid = 2015128673
        save_userid = 18712
        save_sec_context = 0
        save_nestlevel = 2
        rel_lock = true
        __func__ = "vacuum_rel"
#4  0x00000000006c5939 in vacuum (options=13, relations=0x22b8e70, 
    params=0x7ffff5f38a70, bstrategy=0x22b8cf8, isTopLevel=true)
    at vacuum.c:340
        vrel = 0x22b8e10
        cur = 0x22b8e48
        save_exception_stack = 0x7ffff5f38c20
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {35002584, 9176610699604406608, 
              4687312, 140737319770032, 0, 0, 9176610699520520528, 
              -9176587885370396336}, __mask_was_saved = 0, __saved_mask = {
              __val = {35906192, 140737319766400, 10964897, 6853, 35905936, 
                208, 35890568, 21474836680, 35890776, 140737319766560, 
                10927612, 208, 35890288, 35890288, 35890504, 35890544}}}}
        in_vacuum = true
        stmttype = 0xc03841 "VACUUM"
        in_outer_xact = false
        use_own_xacts = true
        __func__ = "vacuum"
#5  0x00000000006c54e0 in ExecVacuum (vacstmt=0x2161910, isTopLevel=true)
    at vacuum.c:141
        params = {freeze_min_age = 0, freeze_table_age = 0, 
          multixact_freeze_min_age = 0, multixact_freeze_table_age = 0, 
          is_wraparound = false, log_min_duration = -1}
        __func__ = "ExecVacuum"


The errfinish one doesn't seem useful?

(gdb) bt
#0  errfinish (dummy=0) at elog.c:415
#1  0x000000000054aefd in ShowTransactionStateRec (
    str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
#2  0x000000000054ad5c in ShowTransactionState (
    str=0xafed12 "StartTransaction") at xact.c:5130
#3  0x00000000005470c8 in StartTransaction () at xact.c:1961
#4  0x0000000000547db9 in StartTransactionCommand () at xact.c:2734
#5  0x00000000008cbe65 in start_xact_command () at postgres.c:2500
#6  0x00000000008c9772 in exec_simple_query (
    query_string=0x2160d28 "VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;") at postgres.c:948
#7  0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218c768, 
    dbname=0x218c5d0 "ts", username=0x218c5b0 "pryzbyj") at postgres.c:4182
#8  0x000000000082a098 in BackendRun (port=0x21818d0) at postmaster.c:4358
#9  0x0000000000829806 in BackendStartup (port=0x21818d0) at postmaster.c:4030
#10 0x0000000000825cab in ServerLoop () at postmaster.c:1707
#11 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0)
    at postmaster.c:1380
#12 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228



#0  errfinish (dummy=0) at elog.c:415
        edata = 0xfe7b80
        elevel = 0
        oldcontext = 0x215df90
        econtext = 0xa4d11a
        __func__ = "errfinish"
#1  0x000000000054aefd in ShowTransactionStateRec (
    str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
        buf = {data = 0x22462d8 "", len = 0, maxlen = 1024, cursor = 0}
        __func__ = "ShowTransactionStateRec"
#2  0x000000000054ad5c in ShowTransactionState (
    str=0xafed12 "StartTransaction") at xact.c:5130
No locals.
#3  0x00000000005470c8 in StartTransaction () at xact.c:1961
        s = 0xf9efa0
        vxid = {backendId = 5, localTransactionId = 16}
#4  0x0000000000547db9 in StartTransactionCommand () at xact.c:2734
        s = 0xf9efa0
        __func__ = "StartTransactionCommand"
#5  0x00000000008cbe65 in start_xact_command () at postgres.c:2500
No locals.

I looked at this briefly:

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 0));
lsn       | 3A49/66F44310
checksum  | 0
flags     | 4
lower     | 36
upper     | 1432
special   | 8192
pagesize  | 8192
version   | 4
prune_xid | 0

In case someone wants me to look at bits and pieces, I saved a copy of:
relfilenode         | 2015128626



Re: crash 11.5~ (and 11.4)

From:
Andres Freund
Date:
Hi,

On 2019-08-07 14:29:28 -0500, Justin Pryzby wrote:
> Just found this, although I'm not sure what to do about it.  If it's corrupt
> table data, I can restore from backup.
> 
> ts=# VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;
> INFO:  00000: aggressively vacuuming "child.huawei_umts_ucell_201908"
> LOCATION:  lazy_scan_heap, vacuumlazy.c:502
> ERROR:  XX001: found xmin 73850277 from before relfrozenxid 111408920
> LOCATION:  heap_prepare_freeze_tuple, heapam.c:6853

Ugh :(

We really need to add an error context to vacuumlazy that shows which
block is being processed.


> I confirmed I updated to 11.4 immediately on its release:
> [pryzbyj@database ~]$ ls -ltc /usr/pgsql-11/bin/postgres
> -rwxr-xr-x. 1 root root 7291736 Jun 20 07:09 /usr/pgsql-11/bin/postgres

Any chance you might not have restarted postgres at that time?


> That table would've been created shortly after midnight on Aug 1 when we loaded
> first data for the month.  So it was created and processed only on pg11.4,
> although the parent has probably been around since pg_upgrade last October.

I don't think there's a way the parent could play a role.


> Here's usable looking bt
> 
> #0  heap_prepare_freeze_tuple (tuple=0x7fd3d8fa0058, relfrozenxid=111408920, 
>     relminmxid=1846, cutoff_xid=111658731, cutoff_multi=1846, frz=0x223e930, 
>     totally_frozen_p=0x7ffff5f3848f) at heapam.c:6832
>         changed = 211
>         xmax_already_frozen = 127
>         xmin_frozen = false
>         freeze_xmax = false
>         xid = 0
>         __func__ = "heap_prepare_freeze_tuple"

Hm. That's a backtrace to what precisely? Are you sure it's the
erroring call to heap_prepare_freeze_tuple? Because I think that's at
the function's start - which is why some of the stack variables have
bogus contents.

I think you'd need to set the breakpoint to heapam.c:6850 to be sure to
catch the error (while the error message says heapam.c:6853, that's
because the line macro in some compilers expands to the end of the
statement).


> The errfinish one doesn't seem useful?
> 
> (gdb) bt
> #0  errfinish (dummy=0) at elog.c:415
> #1  0x000000000054aefd in ShowTransactionStateRec (
>     str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
> #2  0x000000000054ad5c in ShowTransactionState (
>     str=0xafed12 "StartTransaction") at xact.c:5130
> #3  0x00000000005470c8 in StartTransaction () at xact.c:1961
> #4  0x0000000000547db9 in StartTransactionCommand () at xact.c:2734
> #5  0x00000000008cbe65 in start_xact_command () at postgres.c:2500
> #6  0x00000000008c9772 in exec_simple_query (
>     query_string=0x2160d28 "VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;") at postgres.c:948
> #7  0x00000000008cddf3 in PostgresMain (argc=1, argv=0x218c768, 
>     dbname=0x218c5d0 "ts", username=0x218c5b0 "pryzbyj") at postgres.c:4182
> #8  0x000000000082a098 in BackendRun (port=0x21818d0) at postmaster.c:4358
> #9  0x0000000000829806 in BackendStartup (port=0x21818d0) at postmaster.c:4030
> #10 0x0000000000825cab in ServerLoop () at postmaster.c:1707
> #11 0x00000000008255dd in PostmasterMain (argc=3, argv=0x215b9a0)
>     at postmaster.c:1380
> #12 0x000000000074ba30 in main (argc=3, argv=0x215b9a0) at main.c:228

That's probably just the errfinish for a debug message. Might be easier
to set a breakpoint to pg_re_throw().


> I looked at this briefly:
> 
> ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 0));
> lsn       | 3A49/66F44310
> checksum  | 0
> flags     | 4
> lower     | 36
> upper     | 1432
> special   | 8192
> pagesize  | 8192
> version   | 4
> prune_xid | 0

That's not necessarily the target block though, right?  It'd be useful
to get this for the block with the corruption, and perhaps one
before/after.  If the backtrace I commented on at the top is correct,
the relevant tuple was at "blkno = 34915" (visible a frame or two
up).

Also heap_page_items() for those pages would be useful - you might want
to not include the t_data column, as it has raw tuple data.


Could you show
* pg_controldata
* SELECT oid, datname, datfrozenxid, datminmxid FROM pg_database;

That should be easy to redact if necessary. It'd be useful to know which
database the corrupted table is in.

It'd also be helpful to get something roughly like
SELECT
    oid, oid::regclass,
    relkind, reltoastrelid,
    relfrozenxid, age(relfrozenxid),
    relminmxid, mxid_age(relminmxid),
    relpages,
    (SELECT txid_current())
FROM pg_class
WHERE relfrozenxid <> 0;

But I'm not sure if including the tablenames would be problematic for
your case. Possibly a bit hard to infer "usage" correlations without
them, but probably still worthwhile.

Greetings,

Andres Freund



Re: crash 11.5~ (and 11.4)

From:
Justin Pryzby
Date:
On Wed, Aug 07, 2019 at 04:51:54PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2019-08-07 14:29:28 -0500, Justin Pryzby wrote:
> > Just found this, although I'm not sure what to do about it.  If it's corrupt
> > table data, I can restore from backup.

In the meantime, I've renamed+uninherited the table and restored from backup,
with the broken table preserved.

However, the previous day's backup failed like this, so I used Monday's backup.

[pryzbyj@database ~]$ time sudo -u postgres pg_restore /srv/cdrperfbackup/ts/2019-08-06/curtables/ -t huawei_umts_ucell_201908 --verbose -d ts --host /tmp
pg_restore: connecting to database for restore
pg_restore: creating TABLE "child.huawei_umts_ucell_201908"
pg_restore: INFO:  partition constraint for table "huawei_umts_ucell_201908" is implied by existing constraints
pg_restore: processing data for table "child.huawei_umts_ucell_201908"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 62710; 0 2015128626 TABLE DATA huawei_umts_ucell_201908 telsasoft
pg_restore: [archiver (db)] COPY failed for table "huawei_umts_ucell_201908": ERROR:  invalid byte sequence for encoding "UTF8": 0xe7 0x28 0x9d
CONTEXT:  COPY huawei_umts_ucell_201908, line 104746

Also, I found a core with this backtrace from my own manual invocation of ANALYZE.

(gdb) bt
#0  0x00000039674324f5 in raise () from /lib64/libc.so.6
#1  0x0000003967433cd5 in abort () from /lib64/libc.so.6
#2  0x0000000000a3c4b2 in ExceptionalCondition (conditionName=0xaad9e7 "!(j > attnum)", errorType=0xaad84c "FailedAssertion", fileName=0xaad840 "heaptuple.c", lineNumber=582) at assert.c:54
#3  0x00000000004856a7 in nocachegetattr (tuple=0x1c116b08, attnum=1, tupleDesc=0x23332b0) at heaptuple.c:582
#4  0x000000000062981d in std_fetch_func (stats=0x22700c8, rownum=54991, isNull=0x7ffff5f3844f) at analyze.c:1718
#5  0x000000000062aa99 in compute_scalar_stats (stats=0x22700c8, fetchfunc=0x62942b <std_fetch_func>, samplerows=120000, totalrows=227970) at analyze.c:2370
#6  0x00000000006270db in do_analyze_rel (onerel=0x7fd37e2afa50, options=2, params=0x7ffff5f38a70, va_cols=0x0, acquirefunc=0x627e9e <acquire_sample_rows>, relpages=75494, inh=false, in_outer_xact=false, elevel=13) at analyze.c:579
#7  0x0000000000626616 in analyze_rel (relid=2015128626, relation=0x2161820, options=2, params=0x7ffff5f38a70, va_cols=0x0, in_outer_xact=false, bstrategy=0x21f7848) at analyze.c:310
#8  0x00000000006c59b2 in vacuum (options=2, relations=0x21f79c0, params=0x7ffff5f38a70, bstrategy=0x21f7848, isTopLevel=true) at vacuum.c:357
#9  0x00000000006c54e0 in ExecVacuum (vacstmt=0x2161910, isTopLevel=true) at vacuum.c:141
#10 0x00000000008d1d7e in standard_ProcessUtility (pstmt=0x21619d0, queryString=0x2160d28 "ANALYZE child.huawei_umts_ucell_201908;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x2161cc0, completionTag=0x7ffff5f39100 "") at utility.c:670

#3  0x00000000004856a7 in nocachegetattr (tuple=0x1c116b08, attnum=1, tupleDesc=0x23332b0) at heaptuple.c:582
        natts = 554
        j = 1
        tup = 0x1c116b20
        tp = 0x1c116b85
"\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
        bp = 0x1c116b37
"\270\027$U}\232\246\235\004\255\331\033\006Qp\376E\316h\376\363\247\366Նgy7\311E\224~F\274\023ϋ%\216,\221\331@\024\363\233\070\275\004\254L\217t\262X\227\352\346\347\371\070\321ш\221\350fc\316\r\356\351h\275\213\230\360\203\374\023\026\213^s#\347(\235=\326\321\067\032\245\321B\026}܋FS\375\244\003\065\336\277;\252O\006\065\320\353\211}F\237\373B\243\357J~\270\"\230ƣ\024xǍ\334\377\202\277S\031\375\351\003\220{\004"
        slow = false
        off = -1
#4  0x000000000062981d in std_fetch_func (stats=0x22700c8, rownum=54991, isNull=0x7ffff5f3844f) at analyze.c:1718
        attnum = 2
        tuple = 0x1c116b08
        tupDesc = 0x23332b0
#5  0x000000000062aa99 in compute_scalar_stats (stats=0x22700c8, fetchfunc=0x62942b <std_fetch_func>, samplerows=120000, totalrows=227970) at analyze.c:2370
        value = 546120480
        isnull = false
        i = 54991
        null_cnt = 0
        nonnull_cnt = 54991
        toowide_cnt = 0
        total_width = 274955
        is_varlena = true
        is_varwidth = true
        corr_xysum = 3.9525251667299724e-323
        ssup = {ssup_cxt = 0x22ae080, ssup_collation = 100, ssup_reverse = false, ssup_nulls_first = false, ssup_attno = 0, ssup_extra = 0x22ae5b0, comparator = 0xa06720 <varstrfastcmp_locale>, abbreviate = false, abbrev_converter = 0, abbrev_abort = 0, abbrev_full_comparator = 0}
        values = 0x204f9970
        values_cnt = 54991
        tupnoLink = 0x206ce5c0
        track = 0x22ae198
        track_cnt = 0
        num_mcv = 100
        num_bins = 100
        mystats = 0x2270490

> > ts=# VACUUM FREEZE VERBOSE child.huawei_umts_ucell_201908;
> > INFO:  00000: aggressively vacuuming "child.huawei_umts_ucell_201908"
> > LOCATION:  lazy_scan_heap, vacuumlazy.c:502
> > ERROR:  XX001: found xmin 73850277 from before relfrozenxid 111408920
> > LOCATION:  heap_prepare_freeze_tuple, heapam.c:6853
> 
> Ugh :(
> 
> We really need to add an error context to vacuumlazy that shows which
> block is being processed.

> > I confirmed I updated to 11.4 immediately on its release:
> > [pryzbyj@database ~]$ ls -ltc /usr/pgsql-11/bin/postgres
> > -rwxr-xr-x. 1 root root 7291736 Jun 20 07:09 /usr/pgsql-11/bin/postgres
> 
> Any chance you might not have restarted postgres at that time?

I don't think so, nagios would've been yelling at me.  Also, I found I have a
log of pg_settings, which shows server_version updated at 2019-06-20
07:05:01.425645.  Is there a bug in 11.3 which could explain it?

> > Here's usable looking bt
> > 
> > #0  heap_prepare_freeze_tuple (tuple=0x7fd3d8fa0058, relfrozenxid=111408920, 
> >     relminmxid=1846, cutoff_xid=111658731, cutoff_multi=1846, frz=0x223e930, 
> >     totally_frozen_p=0x7ffff5f3848f) at heapam.c:6832
> >         changed = 211
> >         xmax_already_frozen = 127
> >         xmin_frozen = false
> >         freeze_xmax = false
> >         xid = 0
> >         __func__ = "heap_prepare_freeze_tuple"
> 
> Hm. That's a backtrace to what precisely? Are you sure it's the
> erroring call to heap_prepare_freeze_tuple? Because I think that's at
> the function's start - which is why some of the stack variables have
> bogus contents.

Yes, I had just done this: b heap_prepare_freeze_tuple.

> I think you'd need to set the breakpoint to heapam.c:6850 to be sure to
> catch the error (while the error message says heapam.c:6853, that's
> because the line macro in some compilers expands to the end of the
> statement).

I did 6848 and got:

#0  heap_prepare_freeze_tuple (tuple=0x7fd403a02058, relfrozenxid=111408920, relminmxid=1846, cutoff_xid=111665057, cutoff_multi=1846, frz=0x22aa6f8, totally_frozen_p=0x7ffff5f3848f) at heapam.c:6849
        changed = false
        xmax_already_frozen = false
        xmin_frozen = false
        freeze_xmax = false
        xid = 73850277
        __func__ = "heap_prepare_freeze_tuple"

> > The errfinish one doesn't seem useful?
> > 
> > (gdb) bt
> > #0  errfinish (dummy=0) at elog.c:415
> > #1  0x000000000054aefd in ShowTransactionStateRec (
> >     str=0xafed12 "StartTransaction", s=0xf9efa0) at xact.c:5157
> > [...]
>
> That's probably just the errfinish for a debug message. Might be easier
> to set a breakpoint to pg_re_throw().

Right, that works.

(gdb) bt
#0  pg_re_throw () at elog.c:1730
#1  0x0000000000a3cb1e in errfinish (dummy=0) at elog.c:467
#2  0x00000000004e72c4 in heap_prepare_freeze_tuple (tuple=0x7fd403a02058, relfrozenxid=111408920, relminmxid=1846, cutoff_xid=111665065, cutoff_multi=1846, frz=0x2260558, totally_frozen_p=0x7ffff5f3848f) at heapam.c:6850
#3  0x00000000006c8e6e in lazy_scan_heap (onerel=0x7fd37d5f99c8, options=13, vacrelstats=0x22603b8, Irel=0x2260500, nindexes=3, aggressive=true) at vacuumlazy.c:1151

> > I looked at this briefly:
> > 
> > ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 0));
> > lsn       | 3A49/66F44310
> > checksum  | 0
> > flags     | 4
> > lower     | 36
> > upper     | 1432
> > special   | 8192
> > pagesize  | 8192
> > version   | 4
> > prune_xid | 0
> 
> That's not necessarily the target block though, right?  It'd be useful
> to get this for the block with the corruption, and perhaps one
> before/after.  If the backtrace I commented on at the top is correct,
> the relevant tuple was at "blkno = 34915" (visible a frame or two
> up).

Ah, right.

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 34914));
      lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
---------------+----------+-------+-------+-------+---------+----------+---------+-----------
 3A4A/4F2192B8 |        0 |     0 |    36 |  1520 |    8192 |     8192 |       4 |         0

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 34915));
 3A4A/4F21C268 |        0 |     0 |    36 |  1520 |    8192 |     8192 |       4 |         0

ts=# SELECT * FROM page_header(get_raw_page('child.huawei_umts_ucell_201908', 34916));
 3A4A/4F21DC90 |        0 |     0 |    36 |  1496 |    8192 |     8192 |       4 |         0

> Also heap_page_items() for those pages would be useful - you might want
> to not include the t_data column, as it has raw tuple data.

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.huawei_umts_ucell_201908', 34914));
 lp | lp_off | lp_flags | lp_len |  t_xmin   | t_xmax | t_field3 |  t_ctid   | t_infomask2 | t_infomask | t_hoff | t_bits... | t_oid

  1 |   5952 |        1 |   2236 | 111659903 |      0 |        0 | (34914,1) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111011111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 
  2 |   3736 |        1 |   2212 | 111659903 |      0 |        0 | (34914,2) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000000011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 
  3 |   1520 |        1 |   2216 | 111659903 |      0 |        0 | (34914,3) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111011111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.huawei_umts_ucell_201908', 34915));
  1 |   5976 |        1 |   2212 | 111659903 |      0 |        0 | (34915,1) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 
  2 |   3744 |        1 |   2232 | 111659903 |      0 |        0 | (34915,2) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 
  3 |   1520 |        1 |   2220 | 111659903 |      0 |        0 | (34915,3) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.huawei_umts_ucell_201908', 34916));
  1 |   5952 |        1 |   2236 | 111659903 |      0 |        0 | (34916,1) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111011111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 
  2 |   3736 |        1 |   2212 | 111659903 |      0 |        0 | (34916,2) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111000111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000001111111111111111111111111111111111111111111100111111000000
|     
 
  3 |   1496 |        1 |   2240 | 111659903 |      0 |        0 | (34916,3) |         554 |       2311 |     96 |
11111111111111111111111111111111111111011100111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111001111111111111111111111011111111111111111010111111101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001100000001111111111111111111111111111111111111111111100111111000000
|      
> Could you show
> * pg_controldata

[pryzbyj@database ~]$ sudo -u postgres ./src/postgresql.bin/bin/pg_controldata -D /var/lib/pgsql/11/data
[sudo] password for pryzbyj:
pg_control version number:            1100
Catalog version number:               201809051
Database system identifier:           6616072377454496350
Database cluster state:               in production
pg_control last modified:             Wed 07 Aug 2019 06:32:05 PM MDT
Latest checkpoint location:           3A4C/D1BB5BD8
Latest checkpoint's REDO location:    3A4C/CCC1DF90
Latest checkpoint's REDO WAL file:    0000000100003A4C000000CC
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:111664949
Latest checkpoint's NextOID:          2069503852
Latest checkpoint's NextMultiXactId:  1846
Latest checkpoint's NextMultiOffset:  3891
Latest checkpoint's oldestXID:        33682470
Latest checkpoint's oldestXID's DB:   16564
Latest checkpoint's oldestActiveXID:  111664945
Latest checkpoint's oldestMultiXid:   1484
Latest checkpoint's oldestMulti's DB: 16564
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Wed 07 Aug 2019 06:31:35 PM MDT
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              200
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            ff863744b99b849579d561c5117157c78a8ced0563df824f3035f70810fec534

> * SELECT oid, datname, datfrozenxid, datminmxid FROM pg_database;
ts=# SELECT oid, datname, datfrozenxid, datminmxid FROM pg_database;
  oid  |  datname  | datfrozenxid | datminmxid 
-------+-----------+--------------+------------
 13529 | template0 |    103564559 |       1846
 16400 | template1 |     52362404 |       1486
 16443 | postgres  |     46060229 |       1485
 16564 | ts        |     33682470 |       1484
(4 rows)

> That should be easy to redact if necessary. It'd be useful to know which
> database the corrupted table is in.
> 
> It'd also be helpful to get something roughly like
> SELECT oid, oid::regclass, relkind, reltoastrelid, relfrozenxid, age(relfrozenxid),
> relminmxid, mxid_age(relminmxid), relpages, (SELECT txid_current())
> FROM pg_class WHERE relfrozenxid <> 0;

> But I'm not sure if including the tablenames would be problematic for
> your case. Possibly a bit hard to infer "usage" correlations without
> them, but probably still worthwhile.

I'll send it to you individually; I don't think it's secret, but it's 60k lines
long...  Its OID confirms that the table was created on 8/1 and not 7/1,
which was otherwise possible.  The week-old backup also confirms that.

Thanks,
Justin