Обсуждение: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

Поиск

Список

Период

Сортировка

BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

PG Bug reporting form

Дата:

16 ноября 2020 г., 18:15:20

The following bug has been logged on the website:

Bug reference:      16722
Logged by:          Sergey Koposov
Email address:      skoposov@ed.ac.uk
PostgreSQL version: 11.9
Operating system:   debian
Description:

Hi,

When ingesting a billion or so rows in the table that has some array columns
(see schema below),  at some point all the backends doing the ingestion hang
with 100% CPU usage. When investigating, I think I can trace this to the
limit of 2^32 toasted records per table. See the gdb full backtrace  of
hanged backends in the bottom. When the problem occurs, it has 272 mill
records.
Basically the hanging happens in GetNewOidWithIndex called by
toast_save_datum. 
While I understand the limit for the toast number is there to stay, but the
behaviour of PG next to the limit is IMO a bug (or at least non-trivial) .

I would rather prefer to see an error-message as opposed to backends
hanging.  
I see a similar report BUG #14137, but  it didn't seem to receive any reply.

If it matters, the ingestion inside the table is done by 10 parallel
backends doing COPY in the table. (I don't know if it affects the way the
objids are allocated 
for toast objects)

Thanks, 
        Sergey Koposov

GDB output
*****************
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
No locals.
#1  0x00000000006b3f42 in PGSemaphoreLock (sema=0x7f607fe1cbb8)
    at pg_sema.c:316
        errStatus = <optimized out>
        __func__ = "PGSemaphoreLock"
#2  0x000000000071ea14 in LWLockAcquire (lock=0x7f689257bf20, 
    mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1244
        mustwait = <optimized out>
        proc = 0x7f68adeb46b0
        result = true
        extraWaits = 0
        __func__ = "LWLockAcquire"
#3  0x0000000000700440 in StartBufferIO (buf=buf@entry=0x7f6083a2e640, 
    forInput=forInput@entry=true) at bufmgr.c:3887
        buf_state = <optimized out>
#4  0x000000000070204d in BufferAlloc (foundPtr=0x7fff36ee18db,
strategy=0x0, 
    blockNum=7605075, forkNum=MAIN_FORKNUM, relpersistence=<optimized out>,

    smgr=0x130f380) at bufmgr.c:1251
        newPartitionLock = 0x7f607fe24080
        oldTag = {rnode = {spcNode = 20206528, dbNode = 0, relNode =
8688143}, 
          forkNum = MAIN_FORKNUM, blockNum = 8469712}
        oldPartitionLock = <optimized out>
        buf = 0x7f6083a2e640
        oldHash = <optimized out>
        oldFlags = <optimized out>
        buf_id = <optimized out>
        newTag = {rnode = {spcNode = 1663, dbNode = 16818, 
            relNode = 2229524624}, forkNum = MAIN_FORKNUM, blockNum =
7605075}
        newHash = 4182877260
        valid = false
        buf_state = 0
#5  ReadBuffer_common (smgr=0x130f380, relpersistence=<optimized out>, 
    forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=7605075, 
    mode=RBM_NORMAL, strategy=0x0, hit=0x7fff36ee198f) at bufmgr.c:745
        bufHdr = <optimized out>
        bufBlock = <optimized out>
        found = true
        isExtend = false
        isLocalBuf = <optimized out>
        __func__ = "ReadBuffer_common"
#6  0x00000000007023e1 in ReadBufferExtended (reln=0x7f607fd9bc08, 
    forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=7605075, 
    mode=mode@entry=RBM_NORMAL, strategy=strategy@entry=0x0) at
bufmgr.c:664
        hit = false
        buf = <optimized out>
        __func__ = "ReadBufferExtended"
#7  0x00000000007024eb in ReadBuffer (blockNum=<optimized out>, 
    reln=<optimized out>) at bufmgr.c:596
No locals.
#8  ReleaseAndReadBuffer (buffer=buffer@entry=20373756, 
    relation=<optimized out>, blockNum=<optimized out>) at bufmgr.c:1545
        bufHdr = <optimized out>
#9  0x00000000004c6db2 in index_fetch_heap (scan=0x0,
scan@entry=0x136e098)
    at indexam.c:595
        prev_buf = 20373756
        tid = 0x136e0fc
        all_dead = false
        got_heap_tuple = <optimized out>
#10 0x00000000004c6e00 in index_getnext (scan=0x136e098, 
    direction=direction@entry=ForwardScanDirection) at indexam.c:691
        heapTuple = <optimized out>
        tid = <optimized out>
#11 0x00000000004c6299 in systable_getnext
(sysscan=sysscan@entry=0x136dec8)
    at genam.c:425
        htup = <optimized out>
        __func__ = "systable_getnext"
#12 0x000000000050cd09 in GetNewOidWithIndex (relation=0x7f607fd9bc08, 
    indexId=310933857, oidcolumn=1) at catalog.c:371
        newOid = 2744075972
        scan = 0x136dec8
        key = {sk_flags = 0, sk_attno = 1, sk_strategy = 3, sk_subtype = 0,

          sk_collation = 100, sk_func = {fn_addr = 0x7b6320 <oideq>, 
            fn_oid = 184, fn_nargs = 2, fn_strict = true, fn_retset = false,

            fn_stats = 2 '\002', fn_extra = 0x0, fn_mcxt = 0x13453c0, 
            fn_expr = 0x0}, sk_argument = 2744075972}
        collides = <optimized out>
#13 0x00000000004c3063 in toast_save_datum (value=20557421, oldexternal=0x0,

    options=0, rel=<optimized out>, rel=<optimized out>) at
tuptoaster.c:1574
        toastrel = 0x7f607fd9bc08
        toastidxs = 0x136bba0
        toasttup = 0x7f607fd98540
        t_values = {300189817, 0, 140734114962192}
        t_isnull = {false, false, false}
        result = 0xfffffffffffffe00
        chunk_data = {hdr = {vl_len_ = "\340\000\000", 
            vl_dat = 0x7fff36ee1b14 "\001"}, 
          data =

"\340\000\000\000\001\000\000\000\000\000\000\000\274\002\000\000\b\000\000\000\001\000\000\000̥\203@\225\002\365@z\210,A/<gA\a\233}AYf\210Aɀ\215AД\226A.U)?\320\325.?\242aA?~\311:?i\215\211?\334\020\203?\000\000\000\000\272

\354@\222\202ي\321\070\354@\356\274\001ԀM\354@\"\272o\016nf\354@Ӯ\327N\275z\354@\242\250\"K\365\223\354@\372k\252\305D\250\354@",
'\000' <repeats 1696 times>..., align_it = 224}
        chunk_size = 0
        num_indexes = 1
        validIndex = 0
#14 0x00000000004c45a2 in toast_insert_or_update (rel=0x7f607fd983a8, 
    newtup=0x0, oldtup=0xb5, options=-1) at tuptoaster.c:867
        old_value = 20557421
        tupleDesc = 0x12beb88
        numAttrs = 181
        need_change = 69
        need_free = 109
        toast_action = "pp pppp ppppppxppp ", 'p' <repeats 51 times>,
"xxxxxxxxxx   ", 'x' <repeats 12 times>, 'p' <repeats 61 times>, "
ppppppxxppppppxxpppppppp", '\000' <repeats 1418 times>
        toast_isnull = {false <repeats 1600 times>}
        toast_oldisnull = {false <repeats 1600 times>}
        toast_values = {9011, 508863, 20557040, 2663, 1, 2048, 130,
20557062, 
          4642831835879822414, 4629766747987641402, 1478912696, 1475380501,

          1151322629, 1152363744, 20557104, 1018397949, 4678179332087438334,

          4678191406000876033, 20557168, 1325489961962492288, 
          18446744072652356428, 18446744072655684425, 1052577735,
1094195074, 
          1088406363, 1106636035, 1157361664, 1100416689, 1139960709, 319,

          1100668600, 1108163953, 29, 1100071362, 1111486228, 33, 0, 0, 0,

          300, 297, 1051787629, 0, 2143289344, 2143289344, 1067470643, 
          1052143608, 31, 1099325834, 1107053637, 1109951334, 1096537497, 
          1083978235, 1077223218, 18446744072661571125, 1115025818, 
          1101955695, 1117711694, 1081872323, 1066791225, 987449489, 
          930850946, 1095769653, 1103322737, 1107459414, 1095769653, 
          1103322737, 1107459414, 20374504, 20557421, 20557474, 20557527, 
          20557580, 20557633, 20557686, 20557739, 20557792, 20557845, 
          20557898, 20557951, 20373112, 20373032, 20573080, 20558163, 
          20558204, 20558245, 20558286, 20558327, 20558368, 20558409, 
          20558450, 20558491, 20558532, 20558573, 20558614, 1064299422, 
          1064635873, 1064948113, 1065291023, 1065314998, 1065345062, 
          1065350129, 1, 1, 4, 278, 278, 31, 16, 1074988052, 1076692416, 
          1076469703, 1076226148, 1066023726, 1057373083, 1041813708, 
          910775196, 897988541, 958014467, 1073709436, 1080820995,
1087871082, 
          1105555510, 996563946, 990516632, 1007439877, 1065342546, 
          1065343854, 1065326389, 0, 0, 0, 0, 0, 0, 0, 0, 1070595106, 
          1067663254, 1068637012, 1132005523, 1120866882, 1124400175, 
          1126997579, 1115152811, 1117702755, 1086250728, 1080808387, 
          1082966116, 1086250960, 1080808777, 1082966483, 1089896366, 
          1069203394, 987853215, 932092729, 20558872, 1123232006,
1124468680, 
          20374216, 20374168, 20374120, 20374072, 20559216, 20559267, 
          20374024, 20373976, 20373928, 20373464, 20373416, 20313024, 
          20559936, 20559987, 0 <repeats 1427 times>}
        toast_oldvalues = {0 <repeats 1245 times>, 8599425, 0, 256, 
          140734115001632, 0, 0, 0, 0, 8602692, 0, 19089440,
140734115001632, 
          19356882, 140734115001634, 13444600, 3, 19356880, 13444600,
8796705, 
          16967, 8796705, 16967, 0 <repeats 23 times>, 140734115003392, 
          140734115003376, 10204964, 10204968, 0, 140087678209897, 0, 0, 
          140734115002240, 0, 0, 0, 0, 0, 140734115002192, 
          0 <repeats 11 times>, 32, 0, 0, 0, 0, 4294967295, 140087678216602,

          0, 0, 4294967296, 18446744073709551615, 5, 0, 140087678216602, 0,

          10204964, 0, 6, 10204968, 0, 0, 0, 0, 0, 0, 0, 0, 206158430232, 
          140734115003968, 140734115003776, 0 <repeats 125 times>, 
          140087698075212, 0, 1037797632, 7, 9, 140087700155240, 
          140087698075212, 10204964, 2090588023, 1, 9, 140087700155240, 
          140087698077630, 140734115003664, 140734115003456,
140087695826704, 
          140087695833544, 140734115003728, 32665437, 140734115003712, 
          4277816, 0, 140087700155384, 140087700264280, 4607254, 
          140087695834744, 4274192, 4294967296, 4294967581, 1, 
          140087700155384, 140734115003872, 140087700276488,
140734115003912, 
          140087700275632, 1, 140087698078061, 0, 140087700155384, 1, 0, 1,

          140087700275632, 140734115003680, 140734115003664, 1037797632, 
          4495133, 0, 140087700276488, 140734115003728, 140734115003712, 
          2090588023, 4607254, 4294967295, 140087678425001, 140087695833544,

          140087700264280, 165776, 140087681730168, 131072, 140087678425001,

          18446744073709547520, 140087678400681, 0, 19492864,
140734115005392, 
          19327088, 165776, 13428184, 0, 27, 27, 0, 19327104,
140087698096119, 
          4294967297, 0, 18446744073709551504, 140087695833544, 28,
19327104, 
          140734115004064, 140087698122005, 0, 0, 0, 0, 140734115004064, 
          19328986, 140087678216602, 140087700275632, 108, 8796505...}
        toast_oldexternal = {0x0 <repeats 181 times>, 
          0x833781 <hash_search_with_hash_value+193>, 0x3c, 0x100, 
          0x7fff36eec5e0, 0x0, 0x0, 0x0, 0x0, 0x834444 <string_hash+36>,
0x7b, 
          0x1234820, 0x7fff36eec5e0, 0x124eee3, 0x7fff36000000, 
          0xcd25f8 <ConfigureNamesString+4408>, 0x0, 0x0, 0x6aa00000000, 
          0x1100000009, 0xffffffff00540100, 0xffffffff1a5d09cb, 
          0xffffffff9b26ada0, 0xffffffff9bd60520, 0xffffffff9ccf30a0, 
          0xffffffff9da4c3a0, 0xffffffff9e9c9da0, 0xffffffff9f971aa0, 
          0xffffffffa085ba20, 0xffffffffa176fca0, 0xffffffffa2659c20, 
          0xffffffffa37bc8a0, 0xffffffffa44eb8a0, 0xffffffffa53ffb20, 
          0xffffffffa6256020, 0xffffffffa727c620, 0xffffffffa82a2c20, 
          0xffffffffa8ebf8a0, 0xffffffffaa00d3a0, 0xffffffffaad51520, 
          0xffffffffabe9f020, 0xffffffffacc76c20, 0xffffffffadc9d220, 
          0xffffffffaea74e20, 0xffffffffafa079a0, 0xffffffffb0873020, 
          0xffffffffb192d0a0, 0xffffffffb2704ca0, 0xffffffffb372b2a0, 
          0xffffffffb4502ea0, 0xffffffffb5495a20, 0xffffffffb63010a0, 
          0xffffffffb73276a0, 0xffffffffb80ff2a0, 0xffffffffb91258a0, 
          0xffffffffb9efd4a0, 0xffffffffbae90020, 0xffffffffbbd8f120, 
          0xffffffffbcdb5720, 0xffffffffbdb8d320, 0xffffffffbeb1fea0, 
          0xffffffffbf98b520, 0xffffffffc09b1b20, 0xffffffffc1789720, 
          0xffffffffc27afd20, 0xffffffffc3587920, 0xffffffffc451a4a0, 
          0xffffffffc5385b20, 0xffffffffc63ac120, 0xffffffffc758d6a0, 
          0xffffffffc7da09a0, 0xffffffffca162690, 0xffffffffca975990, 
          0xffffffffcbd11e90, 0xffffffffcc773b90, 0xffffffffcdb10090, 
          0xffffffffce605810, 0xffffffffcf90e290, 0xffffffffd06e5e90, 
          0xffffffffd1721610, 0xffffffffd1fb3210, 0xffffffffd269fe20, 
          0xffffffffd36329a0, 0xffffffffd449e020, 0xffffffffd51e21a0, 
          0xffffffffd542fd90, 0xffffffffd5dfe010, 0xffffffffd64eac20, 
          0xffffffffd6fe03a0, 0xffffffffd82e8e20, 0xffffffffd8f99520, 
          0xffffffffda0e7020, 0xffffffffdaebec20, 0xffffffffdbe517a0, 
          0xffffffffdccbce20, 0xffffffffddc4f9a0, 0xffffffffdeb4eaa0, 
          0xffffffffdfae1620, 0xffffffffe094cca0, 0xffffffffe17248a0, 
          0xffffffffe26b7420, 0xffffffffe3522aa0, 0xffffffffe45490a0, 
          0xffffffffe5320ca0, 0xffffffffe63dad20, 0xffffffffe71b2920, 
          0xffffffffe81454a0, 0xffffffffe8fb0b20, 0xffffffffe9fd7120, 
          0xffffffffeadaed20, 0xffffffffebdd5320, 0xffffffffecbacf20, 
          0xffffffffedb3faa0, 0xffffffffee9ab120, 0xffffffffef8167a0, 
          0xfffffffff09f7d20, 0xfffffffff16149a0, 0xfffffffff27f5f20, 
          0xfffffffff34a6620, 0xfffffffff45f4120, 0xfffffffff5210da0, 
          0xfffffffff63f2320, 0xfffffffff700efa0, 0xfffffffff81f0520, 
          0xfffffffff8e0d1a0, 0xfffffffff9fee720, 0xfffffffffac0b3a0, 
          0xfffffffffbe803a0, 0xfffffffffc7baba0, 0xfffffffffdc7bb70, 
          0x370c620, 0x4295820, 0x550a820, 0x6093a20, 0x7308a20, 0x7e91c20,

          0x9106c20, 0x9c8fe20, 0xaf04e20, 0xbb21aa0, 0xcd03020, 0xd91fca0,

          0xeb01220, 0xf71dea0, 0x10992ea0, 0x1151c0a0, 0x127910a0, 
          0x1331a2a0, 0x1458f2a0, 0x1523eb90, 0x1638c690, 0x1703cd90, 
          0x1818a890, 0x18e3af90, 0x19f88a90, 0x1ac39190, 0x1be1a710, 
          0x1cacae10, 0x1dc18910, 0x1e8c9010, 0x1fa16b10, 0x206c7210, 
          0x21814d10, 0x224c5410, 0x23612f10, 0x242c3610, 0x254a4b90, 
          0x260c1810, 0x272a2d90, 0x27f53490, 0x290a0f90, 0x29d51690, 
          0x2ae9f190, 0x2bb4f890, 0x2cc9d390, 0x2d94da90, 0x2ea9b590, 
          0x2f74bc90, 0x30899790, 0x315dd910, 0x3272b410, 0x333dbb10, 
          0x34529610, 0x351d9d10, 0x36327810, 0x36fd7f10, 0x381b9490, 
          0x38dd6110, 0x39fb7690, 0x3abd4310, 0x3bdb5890...}
        toast_sizes = {0, 0, 9, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 41, 0, 0,
0, 
          3, 0 <repeats 49 times>, 53 <repeats 12 times>, 27, 27, 27, 
          41 <repeats 12 times>, 0 <repeats 61 times>, 9, 0, 0, 81, 81, 81,

          81, 51, 51, 81, 81, 81, 81, 144, 144, 51, 51, 0 <repeats 1427
times>}
        toast_free = {false <repeats 68 times>, true, 
          false <repeats 11 times>, true, true, true, 
          false <repeats 76 times>, true, true, true, true, false, false, 
          true, true, true, true, true, true, false <repeats 1429 times>}
        toast_delold = {false <repeats 1600 times>}
#15 0x00000000004b8a87 in heap_multi_insert (relation=0x7f607fe1cbb8, 
    tuples=0x0, tuples@entry=0x1341428, ntuples=0, ntuples@entry=22, cid=0,

    options=0, bistate=0x7fffffff, bistate@entry=0x12c9710) at
heapam.c:2730
        xid = 1708296130
        heaptuples = 0x136dc08
        scratch = {
          data =

"\002/\004\000\311\a\265\000\006\b\030\000\063#\000\000\277\303\a\000\023\062\064\065\061p325\000\000\000b\n\000\000\001\000\000\000\000\000\tPSF\000\000\000\000\000\000g\022\240\276\361\246n@\342\273\327E^1@@\372\036\231N\a\"\256N7A\240DF\355dC\226\000\000\000(\000\000\000\004\001\000\003\001\274\002\000\000\005\b\000\000\000\001\020L\264CB0V\313>B\004\033\002\001\000\000\000J\n\254<\000\000\000\000\230a\311\357p;\354@\001\252\367\061LF\354@\003",
'\000' <repeats 119 times>..., force_align_d = 2.9948299028864589e-305, 
          force_align_i64 = 50955530654723842}
        page = 0x1 <error: Cannot access memory at address 0x1>
#16 0x0000000000586edf in CopyFromInsertBatch
(cstate=cstate@entry=0x12c92d0, 
    estate=estate@entry=0x12d0e80, mycid=mycid@entry=0, 
    hi_options=hi_options@entry=0, 
    resultRelInfo=resultRelInfo@entry=0x128fde8, 
    myslot=myslot@entry=0x12d2db8, bistate=0x12c9710, nBufferedTuples=22, 
    bufferedTuples=0x1341428, firstBufferedLineNo=2663) at copy.c:2973
        oldcontext = 0x128c110
        i = <optimized out>
        save_cur_lineno = 2684
#17 0x000000000058c77e in CopyFrom (cstate=cstate@entry=0x12c92d0)
    at copy.c:2815
        slot = 0x12d2db8
        skip_tuple = false
        loaded_oid = 0
        tuple = 0x136cbf8
        tupDesc = 0x12beb88
        values = 0x12d34b8
        nulls = 0x1329c78
        resultRelInfo = <optimized out>
        saved_resultRelInfo = 0x0
        estate = 0x12d0e80
        mtstate = 0x12c9500
        econtext = 0x12d1090
        myslot = 0x12d2db8
        oldcontext = 0x128c110
        errcallback = {previous = 0x0, 
          callback = 0x586d00 <CopyFromErrorCallback>, arg = 0x12c92d0}
        mycid = 0
        hi_options = 0
        bistate = 0x12c9710
        processed = 2683
        useHeapMultiInsert = true
        nBufferedTuples = <optimized out>
        prev_leaf_part_index = -1
        bufferedTuples = 0x1341428
        bufferedTuplesSize = 67408
        firstBufferedLineNo = 2663
        __func__ = "CopyFrom"
#18 0x000000000058cb53 in DoCopy (pstate=0x128c220, stmt=<optimized out>, 
    stmt_location=0, stmt_len=2354, processed=0x7fff36ef14b8) at
copy.c:996
        cstate = 0x12c92d0
        is_from = true
        pipe = <optimized out>
        rel = 0x7f607fd983a8
        relid = 310933852
        query = 0x0
        __func__ = "DoCopy"
#19 0x0000000000731258 in standard_ProcessUtility (pstmt=0x1234560, 
    queryString=0x1233030 "COPY decals9

(release,brickid,brickname,objid,brick_primary,maskbits,fitbits,type,ra,dec,ra_ivar,dec_ivar,bx,by,dchisq,ebv,mjd_min,mjd_max,ref_cat,ref_id,pmra,pmdec,parallax,pmra_ivar,pmdec_ivar,paral"...,

    context=PROCESS_UTILITY_TOPLEVEL, params=0xffffffffffffffff,
queryEnv=0x0, 
    dest=0x7fffffff, completionTag=0x7fff36ef1740 "") at utility.c:551
        processed = 19943856
        pstate = 0x128c220
        __func__ = "standard_ProcessUtility"
#20 0x000000000072e5ba in PortalRunUtility (portal=0x12f4150,
pstmt=0x1234560, 
    isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>, 
    dest=0x1234670, completionTag=<optimized out>) at pquery.c:1178
        utilityStmt = <optimized out>
        snapshot = <optimized out>
#21 0x000000000072f027 in PortalRunMulti (portal=portal@entry=0x12f4150, 
    isTopLevel=isTopLevel@entry=true, 
    setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x1234670,

    altdest=altdest@entry=0x1234670, 
    completionTag=completionTag@entry=0x7fff36ef1740 "") at pquery.c:1331
        pstmt = 0x1234560
        active_snapshot_set = false
        stmtlist_item = 0x12bc400
#22 0x000000000072fbd7 in PortalRun (portal=0x12f4150, 
    count=9223372036854775807, isTopLevel=<optimized out>, 
    run_once=<optimized out>, dest=0x1234670, altdest=0x1234670, 
    completionTag=0x7fff36ef1740 "") at pquery.c:799
        save_exception_stack = 0x7fff36ef19b0
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {1, -6064393298424732441, 19645008,

              140734115026752, 19874128, 2, 6064270506639647975, 
              -6064395009656331033}, __mask_was_saved = 0, __saved_mask =
{
              __val = {0, 140734115026655, 0, 9981816, 8602692, 1, 19882336,

                10328646, 19874128, 10328646, 1, 1, 112, 19644960, 19080960,

                9895294}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x1285838
        saveTopTransactionContext = 0x13050a0
        saveActivePortal = 0x0
        saveResourceOwner = 0x1285838
        savePortalContext = 0x0
        saveMemoryContext = 0x13050a0
        __func__ = "PortalRun"
#23 0x000000000072be17 in exec_simple_query (
    query_string=0x7f607fe1cbb8 "\001") at postgres.c:1145
        parsetree = 0x12bc220
        portal = 0x12f4150
        commandTag = 0x7fff36ef1740 ""
        completionTag =

"\000\335'\001\000\000\000\000\b\375'\001\002\000\000\000\000\000\000\000\002\000\000\000Q\000\000\000\000\000\000\000\060\060#\001\000\000\000\000\b\375'\001\000\000\000\000\b\375'\001",
'\000' <repeats 11 times>
        plantree_list = 0x1234670
        receiver = 0x1234670
        format = 0
        dest = DestRemote
        parsetree_item = 0x12bc250
        msec_str =
"\000\335'\001\000\000\000\000\b\375'\001\002\000\000\000\000\000\000\000\002\000\000\000Q\000\000\000\000\000\000"
        __func__ = "exec_simple_query"
#24 0x000000000072dade in PostgresMain (argc=1, argv=0x1233030, 
    dbname=0x127fd20 "wsdb", username=0x127fd08 "koposov") at
postgres.c:4193
        query_string = 0x1233030 "COPY decals9

(release,brickid,brickname,objid,brick_primary,maskbits,fitbits,type,ra,dec,ra_ivar,dec_ivar,bx,by,dchisq,ebv,mjd_min,mjd_max,ref_cat,ref_id,pmra,pmdec,parallax,pmra_ivar,pmdec_ivar,paral"...
        firstchar = 19083312
        input_message = {
          data = 0x1233030 "COPY decals9

(release,brickid,brickname,objid,brick_primary,maskbits,fitbits,type,ra,dec,ra_ivar,dec_ivar,bx,by,dchisq,ebv,mjd_min,mjd_max,ref_cat,ref_id,pmra,pmdec,parallax,pmra_ivar,pmdec_ivar,paral"...,

          len = 2356, maxlen = 4096, cursor = 2356}
        local_sigjmp_buf = {{__jmpbuf = {2523489864, -6064395710111914777,

              19397896, 19397896, 0, 19382096, 6064270506719339751, 
              -6064395007743597337}, __mask_was_saved = 1, __saved_mask =
{
              __val = {0, 19397920, 140734115028992, 47, 19080960, 1024, 
                19398080, 19397896, 8711789, 0, 140734115027568, 
                140734115027664, 6479486, 13500160, 8610716, 19086720}}}}
        send_ready_for_query = false
        __func__ = "PostgresMain"
#25 0x000000000047e812 in BackendRun (port=0x128b780) at postmaster.c:4356
        ac = 1
        secs = 658859760
        usecs = 545108
        i = 1
        av = 0x127fdc0
        maxac = <optimized out>
#26 BackendStartup (port=0x128b780) at postmaster.c:4028
        bn = 0x127bf50
        pid = 0
#27 ServerLoop () at postmaster.c:1707
        rmask = {fds_bits = {128, 0 <repeats 15 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {248, 0 <repeats 15 times>}}
        last_lockfile_recheck_time = <optimized out>
        last_touch_time = <optimized out>
        __func__ = "ServerLoop"
#28 0x00000000006c39a3 in PostmasterMain (argc=argc@entry=3, 
    argv=argv@entry=0x122d3c0) at postmaster.c:1380
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = true
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#29 0x000000000047f748 in main (argc=3, argv=0x122d3c0) at main.c:228
No locals.
quit
A debugging session is active.

    Inferior 1 [process 115811] will be detached.

Quit anyway? (y or n) Detaching from program: /opt/pgsql11/bin/postgres,
process 115811
***************************************
SCHEMA: 
************************************

                              Table "koposov.decals9"
              Column               |        Type        | Collation |
Nullable | Default 
-----------------------------------+--------------------+-----------+----------+---------
 release                           | smallint           |           |
  | 
 brickid                           | integer            |           |
  | 
 brickname                         | text               |           |
  | 
 objid                             | integer            |           |
  | 
 brick_primary                     | smallint           |           |
  | 
 maskbits                          | smallint           |           |
  | 
 fitbits                           | smallint           |           |
  | 
 type                              | text               |           |
  | 
 ra                                | double precision   |           |
  | 
 dec                               | double precision   |           |
  | 
 ra_ivar                           | real               |           |
  | 
 dec_ivar                          | real               |           |
  | 
 bx                                | real               |           |
  | 
 by                                | real               |           |
  | 
 dchisq                            | real[]             |           |
  | 
 ebv                               | real               |           |
  | 
 mjd_min                           | double precision   |           |
  | 
 mjd_max                           | double precision   |           |
  | 
 ref_cat                           | text               |           |
  | 
 ref_id                            | bigint             |           |
  | 
 pmra                              | real               |           |
  | 
 pmdec                             | real               |           |
  | 
 parallax                          | real               |           |
  | 
 pmra_ivar                         | real               |           |
  | 
 pmdec_ivar                        | real               |           |
  | 
 parallax_ivar                     | real               |           |
  | 
 ref_epoch                         | real               |           |
  | 
 gaia_phot_g_mean_mag              | real               |           |
  | 
 gaia_phot_g_mean_flux_over_error  | real               |           |
  | 
 gaia_phot_g_n_obs                 | smallint           |           |
  | 
 gaia_phot_bp_mean_mag             | real               |           |
  | 
 gaia_phot_bp_mean_flux_over_error | real               |           |
  | 
 gaia_phot_bp_n_obs                | smallint           |           |
  | 
 gaia_phot_rp_mean_mag             | real               |           |
  | 
 gaia_phot_rp_mean_flux_over_error | real               |           |
  | 
 gaia_phot_rp_n_obs                | smallint           |           |
  | 
 gaia_phot_variable_flag           | smallint           |           |
  | 
 gaia_astrometric_excess_noise     | real               |           |
  | 
 gaia_astrometric_excess_noise_sig | real               |           |
  | 
 gaia_astrometric_n_obs_al         | smallint           |           |
  | 
 gaia_astrometric_n_good_obs_al    | smallint           |           |
  | 
 gaia_astrometric_weight_al        | real               |           |
  | 
 gaia_duplicated_source            | smallint           |           |
  | 
 gaia_a_g_val                      | real               |           |
  | 
 gaia_e_bp_min_rp_val              | real               |           |
  | 
 gaia_phot_bp_rp_excess_factor     | real               |           |
  | 
 gaia_astrometric_sigma5d_max      | real               |           |
  | 
 gaia_astrometric_params_solved    | smallint           |           |
  | 
 flux_g                            | real               |           |
  | 
 flux_r                            | real               |           |
  | 
 flux_z                            | real               |           |
  | 
 flux_w1                           | real               |           |
  | 
 flux_w2                           | real               |           |
  | 
 flux_w3                           | real               |           |
  | 
 flux_w4                           | real               |           |
  | 
 flux_ivar_g                       | real               |           |
  | 
 flux_ivar_r                       | real               |           |
  | 
 flux_ivar_z                       | real               |           |
  | 
 flux_ivar_w1                      | real               |           |
  | 
 flux_ivar_w2                      | real               |           |
  | 
 flux_ivar_w3                      | real               |           |
  | 
 flux_ivar_w4                      | real               |           |
  | 
 fiberflux_g                       | real               |           |
  | 
 fiberflux_r                       | real               |           |
  | 
 fiberflux_z                       | real               |           |
  | 
 fibertotflux_g                    | real               |           |
  | 
 fibertotflux_r                    | real               |           |
  | 
 fibertotflux_z                    | real               |           |
  | 
 apflux_g                          | real[]             |           |
  | 
 apflux_r                          | real[]             |           |
  | 
 apflux_z                          | real[]             |           |
  | 
 apflux_resid_g                    | real[]             |           |
  | 
 apflux_resid_r                    | real[]             |           |
  | 
 apflux_resid_z                    | real[]             |           |
  | 
 apflux_blobresid_g                | real[]             |           |
  | 
 apflux_blobresid_r                | real[]             |           |
  | 
 apflux_blobresid_z                | real[]             |           |
  | 
 apflux_ivar_g                     | real[]             |           |
  | 
 apflux_ivar_r                     | real[]             |           |
  | 
 apflux_ivar_z                     | real[]             |           |
  | 
 apflux_masked_g                   | real[]             |           |
  | 
 apflux_masked_r                   | real[]             |           |
  | 
 apflux_masked_z                   | real[]             |           |
  | 
 apflux_w1                         | real[]             |           |
  | 
 apflux_w2                         | real[]             |           |
  | 
 apflux_w3                         | real[]             |           |
  | 
 apflux_w4                         | real[]             |           |
  | 
 apflux_resid_w1                   | real[]             |           |
  | 
 apflux_resid_w2                   | real[]             |           |
  | 
 apflux_resid_w3                   | real[]             |           |
  | 
 apflux_resid_w4                   | real[]             |           |
  | 
 apflux_ivar_w1                    | real[]             |           |
  | 
 apflux_ivar_w2                    | real[]             |           |
  | 
 apflux_ivar_w3                    | real[]             |           |
  | 
 apflux_ivar_w4                    | real[]             |           |
  | 
 mw_transmission_g                 | real               |           |
  | 
 mw_transmission_r                 | real               |           |
  | 
 mw_transmission_z                 | real               |           |
  | 
 mw_transmission_w1                | real               |           |
  | 
 mw_transmission_w2                | real               |           |
  | 
 mw_transmission_w3                | real               |           |
  | 
 mw_transmission_w4                | real               |           |
  | 
 nobs_g                            | smallint           |           |
  | 
 nobs_r                            | smallint           |           |
  | 
 nobs_z                            | smallint           |           |
  | 
 nobs_w1                           | smallint           |           |
  | 
 nobs_w2                           | smallint           |           |
  | 
 nobs_w3                           | smallint           |           |
  | 
 nobs_w4                           | smallint           |           |
  | 
 rchisq_g                          | real               |           |
  | 
 rchisq_r                          | real               |           |
  | 
 rchisq_z                          | real               |           |
  | 
 rchisq_w1                         | real               |           |
  | 
 rchisq_w2                         | real               |           |
  | 
 rchisq_w3                         | real               |           |
  | 
 rchisq_w4                         | real               |           |
  | 
 fracflux_g                        | real               |           |
  | 
 fracflux_r                        | real               |           |
  | 
 fracflux_z                        | real               |           |
  | 
 fracflux_w1                       | real               |           |
  | 
 fracflux_w2                       | real               |           |
  | 
 fracflux_w3                       | real               |           |
  | 
 fracflux_w4                       | real               |           |
  | 
 fracmasked_g                      | real               |           |
  | 
 fracmasked_r                      | real               |           |
  | 
 fracmasked_z                      | real               |           |
  | 
 fracin_g                          | real               |           |
  | 
 fracin_r                          | real               |           |
  | 
 fracin_z                          | real               |           |
  | 
 anymask_g                         | smallint           |           |
  | 
 anymask_r                         | smallint           |           |
  | 
 anymask_z                         | smallint           |           |
  | 
 allmask_g                         | smallint           |           |
  | 
 allmask_r                         | smallint           |           |
  | 
 allmask_z                         | smallint           |           |
  | 
 wisemask_w1                       | smallint           |           |
  | 
 wisemask_w2                       | smallint           |           |
  | 
 psfsize_g                         | real               |           |
  | 
 psfsize_r                         | real               |           |
  | 
 psfsize_z                         | real               |           |
  | 
 psfdepth_g                        | real               |           |
  | 
 psfdepth_r                        | real               |           |
  | 
 psfdepth_z                        | real               |           |
  | 
 galdepth_g                        | real               |           |
  | 
 galdepth_r                        | real               |           |
  | 
 galdepth_z                        | real               |           |
  | 
 nea_g                             | real               |           |
  | 
 nea_r                             | real               |           |
  | 
 nea_z                             | real               |           |
  | 
 blob_nea_g                        | real               |           |
  | 
 blob_nea_r                        | real               |           |
  | 
 blob_nea_z                        | real               |           |
  | 
 psfdepth_w1                       | real               |           |
  | 
 psfdepth_w2                       | real               |           |
  | 
 psfdepth_w3                       | real               |           |
  | 
 psfdepth_w4                       | real               |           |
  | 
 wise_coadd_id                     | text               |           |
  | 
 wise_x                            | real               |           |
  | 
 wise_y                            | real               |           |
  | 
 lc_flux_w1                        | real[]             |           |
  | 
 lc_flux_w2                        | real[]             |           |
  | 
 lc_flux_ivar_w1                   | real[]             |           |
  | 
 lc_flux_ivar_w2                   | real[]             |           |
  | 
 lc_nobs_w1                        | smallint[]         |           |
  | 
 lc_nobs_w2                        | smallint[]         |           |
  | 
 lc_fracflux_w1                    | real[]             |           |
  | 
 lc_fracflux_w2                    | real[]             |           |
  | 
 lc_rchisq_w1                      | real[]             |           |
  | 
 lc_rchisq_w2                      | real[]             |           |
  | 
 lc_mjd_w1                         | double precision[] |           |
  | 
 lc_mjd_w2                         | double precision[] |           |
  | 
 lc_epoch_index_w1                 | smallint[]         |           |
  | 
 lc_epoch_index_w2                 | smallint[]         |           |
  | 
 sersic                            | real               |           |
  | 
 sersic_ivar                       | real               |           |
  | 
 shape_r                           | real               |           |
  | 
 shape_r_ivar                      | real               |           |
  | 
 shape_e1                          | real               |           |
  | 
 shape_e1_ivar                     | real               |           |
  | 
 shape_e2                          | real               |           |
  | 
 shape_e2_ivar                     | real               |           |
  |

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Fujii Masao

Дата:

18 ноября 2020 г., 01:18:20

On 2020/11/17 3:15, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      16722
> Logged by:          Sergey Koposov
> Email address:      skoposov@ed.ac.uk
> PostgreSQL version: 11.9
> Operating system:   debian
> Description:
> 
> Hi,
> 
> When ingesting a billion or so rows in the table that has some array columns
> (see schema below),  at some point all the backends doing the ingestion hang
> with 100% CPU usage. When investigating, I think I can trace this to the
> limit of 2^32 toasted records per table.

Yes, this is the limitation of the number of out-of-line values in toast.
https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit

> See the gdb full backtrace  of
> hanged backends in the bottom. When the problem occurs, it has 272 mill
> records.
> Basically the hanging happens in GetNewOidWithIndex called by
> toast_save_datum.
> While I understand the limit for the toast number is there to stay, but the
> behaviour of PG next to the limit is IMO a bug (or at least non-trivial) .
> 
> I would rather prefer to see an error-message as opposed to backends
> hanging.

To emit an error, we need to check that there is no unused OID for
the toast and it would take very long to do that. So I'm not sure
if to emit an error message really improves the current situation or not.

OTOH it might be good idea to emit a warning message (every time we
cannot find unused OID in recent 1 million OIDs, for example) or report
the issue as wait event, or something while the record insertion is hanging
because of toast limit, so that we can easily detect the issue.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Magnus Hagander

Дата:

18 ноября 2020 г., 10:57:05

On Wed, Nov 18, 2020 at 2:18 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/11/17 3:15, PG Bug reporting form wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference:      16722
> > Logged by:          Sergey Koposov
> > Email address:      skoposov@ed.ac.uk
> > PostgreSQL version: 11.9
> > Operating system:   debian
> > Description:
> >
> > Hi,
> >
> > When ingesting a billion or so rows in the table that has some array columns
> > (see schema below),  at some point all the backends doing the ingestion hang
> > with 100% CPU usage. When investigating, I think I can trace this to the
> > limit of 2^32 toasted records per table.
>
> Yes, this is the limitation of the number of out-of-line values in toast.
> https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit
>
>
> > See the gdb full backtrace  of
> > hanged backends in the bottom. When the problem occurs, it has 272 mill
> > records.
> > Basically the hanging happens in GetNewOidWithIndex called by
> > toast_save_datum.
> > While I understand the limit for the toast number is there to stay, but the
> > behaviour of PG next to the limit is IMO a bug (or at least non-trivial) .
> >
> > I would rather prefer to see an error-message as opposed to backends
> > hanging.
>
> To emit an error, we need to check that there is no unused OID for
> the toast and it would take very long to do that. So I'm not sure
> if to emit an error message really improves the current situation or not.
>
> OTOH it might be good idea to emit a warning message (every time we
> cannot find unused OID in recent 1 million OIDs, for example) or report
> the issue as wait event, or something while the record insertion is hanging
> because of toast limit, so that we can easily detect the issue.

This definitely sounds like something that's worth putting out as a
wait event. Even before you start traversing millions of OIDs it might
gradually start to show up, and being able to monitor that would
definitely be useful.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Andres Freund

Дата:

18 ноября 2020 г., 19:27:43

Hi,

On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote:
> This definitely sounds like something that's worth putting out as a
> wait event. Even before you start traversing millions of OIDs it might
> gradually start to show up, and being able to monitor that would
> definitely be useful.

I don't think this is likely to work well as a wait event. All the index
traversals etc will do IO, acquire locks, etc, which will overwrite the
wait event and reset it to nothing once done.

Greetings,

Andres Freund

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Magnus Hagander

Дата:

18 ноября 2020 г., 22:12:10

On Wed, Nov 18, 2020 at 8:27 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote:
> > This definitely sounds like something that's worth putting out as a
> > wait event. Even before you start traversing millions of OIDs it might
> > gradually start to show up, and being able to monitor that would
> > definitely be useful.
>
> I don't think this is likely to work well as a wait event. All the index
> traversals etc will do IO, acquire locks, etc, which will overwrite the
> wait event and reset it to nothing once done.

Oh meh. Yeah, I didn't think about the actual implementation of doing it :/

That said, I'd really prefer to see *some* kind of counter that could
get people noticing this a bit earlier, rather than just bugging them
in the logfile once it's gone over a threshold.  A statistics counter
maybe, but I guess that'd have to be tracked at a per-table level to
be really useful, which would make it potentially fairly expensive to
keep around...

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Andres Freund

Дата:

18 ноября 2020 г., 23:05:29

Hi,

On 2020-11-18 23:12:10 +0100, Magnus Hagander wrote:
> That said, I'd really prefer to see *some* kind of counter that could
> get people noticing this a bit earlier, rather than just bugging them
> in the logfile once it's gone over a threshold.  A statistics counter
> maybe, but I guess that'd have to be tracked at a per-table level to
> be really useful, which would make it potentially fairly expensive to
> keep around...

I'd rather spend effort to solve the underlying issue. It's a bit more
work, but not crazily so, I think. Even the pg_upgrade pieces aren't
that hard to solve.

Greetings,

Andres Freund

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Tom Lane

Дата:

19 ноября 2020 г., 00:02:34

Andres Freund <andres@anarazel.de> writes:
> On 2020-11-18 23:12:10 +0100, Magnus Hagander wrote:
>> That said, I'd really prefer to see *some* kind of counter that could
>> get people noticing this a bit earlier, rather than just bugging them
>> in the logfile once it's gone over a threshold.  A statistics counter
>> maybe, but I guess that'd have to be tracked at a per-table level to
>> be really useful, which would make it potentially fairly expensive to
>> keep around...

> I'd rather spend effort to solve the underlying issue. It's a bit more
> work, but not crazily so, I think. Even the pg_upgrade pieces aren't
> that hard to solve.

Yeah.  If we're going to put work into this, widening the IDs used
to identify toast values seems like the right work to be doing.

Having said that, I had imagined that we might never have to fix it,
because if your table is big enough that it has a problem of this
ilk then likely you want to partition it anyway.  And partitioning
solves the problem since each partition has its own toast table.

            regards, tom lane

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Sergey KOPOSOV

Дата:

19 ноября 2020 г., 00:45:46

On Wed, 2020-11-18 at 19:02 -0500, Tom Lane wrote:
> This email was sent to you by someone outside the University.
> You should only click on links or attachments if you are certain that the email is genuine and the content is safe.
>
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-11-18 23:12:10 +0100, Magnus Hagander wrote:
> > > That said, I'd really prefer to see *some* kind of counter that could
> > > get people noticing this a bit earlier, rather than just bugging them
> > > in the logfile once it's gone over a threshold.  A statistics counter
> > > maybe, but I guess that'd have to be tracked at a per-table level to
> > > be really useful, which would make it potentially fairly expensive to
> > > keep around...
> > I'd rather spend effort to solve the underlying issue. It's a bit more
> > work, but not crazily so, I think. Even the pg_upgrade pieces aren't
> > that hard to solve.
>
> Yeah.  If we're going to put work into this, widening the IDs used
> to identify toast values seems like the right work to be doing.
>
> Having said that, I had imagined that we might never have to fix it,
> because if your table is big enough that it has a problem of this
> ilk then likely you want to partition it anyway.  And partitioning
> solves the problem since each partition has its own toast table.

I can say that for my case of very large static clustered tables the
partitioning doesn't really offer much benefits, only overhead (To the best
of my understanding at least), so I'm not using it.

      Sergey

PS And the current workaround that I used for this particular dataset that I reported
was to just expand a few arrays that I have in the table into columns
 (as all arrays had static lengths) to not have any toasts at all.

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Andres Freund

Дата:

19 ноября 2020 г., 01:06:37

Hi,

On 2020-11-18 19:02:34 -0500, Tom Lane wrote:
> Having said that, I had imagined that we might never have to fix it,
> because if your table is big enough that it has a problem of this
> ilk then likely you want to partition it anyway.  And partitioning
> solves the problem since each partition has its own toast table.

A few billion rows isn't that much anymore, and partitioning has a fair
number of restrictions (plus this only needs 4 billion toasted fields,
not 4 billion rows). More importantly, to hit problems around this, one
doesn't even have to have all those rows in a table at once -
performance will suffer once the oid counter has wrapped around,
especially if there's longer sequences of assigned values.

> Yeah.  If we're going to put work into this, widening the IDs used
> to identify toast values seems like the right work to be doing.

To outline, here's what I think the two major pieces to get there are:

1) Make toast oid assignment independent of the oid counter. The easiest
   way likely is to also create a serial and use that. That alone
   improves the situation considerably, because it takes much longer to
   to wrap around in each toast table.

   The overhead of the additional WAL records isn't nothing, but
   compared to maintaining a btree it's not likely to be measurable.

2) Widen non pg-upgraded toast tables to have 64bit chunk_id field. To
   reference chunks >= 2^32, add VARTAG_ONDISK64, which is only used
   when needed.

Greetings,

Andres Freund

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Tom Lane

Дата:

19 ноября 2020 г., 01:17:55

Andres Freund <andres@anarazel.de> writes:
> On 2020-11-18 19:02:34 -0500, Tom Lane wrote:
>> Yeah.  If we're going to put work into this, widening the IDs used
>> to identify toast values seems like the right work to be doing.

> To outline, here's what I think the two major pieces to get there are:

> 1) Make toast oid assignment independent of the oid counter. The easiest
>    way likely is to also create a serial and use that. That alone
>    improves the situation considerably, because it takes much longer to
>    to wrap around in each toast table.

I'm not really on board with creating an actual sequence object for each
toast table.  I don't think they are efficient enough.

I had imagined internally widening the OID counter to 64 bits, of which
ordinary OID usage would just discard the high 32 bits, but for TOAST
chunk IDs we could use the whole value.  Now, compared to your idea
that would result in using the wider toast-pointer format more often,
but I think there are ways around that.  For example, we could do

(1) Grab the next 64-bit OID.
(2) Probe the toast table to see if the low 32 bits correspond to
a value in use in that toast table.  If not, just use the low 32 bits.
(3) Otherwise, use the whole 64 bits and the wider pointer format.
(We could do another uniqueness probe here, but I think it's unnecessary,
since the 64-bit counter will never wrap around.)

Since the existing OID assignment mechanism requires a uniqueness
probe already, this isn't any more expensive than what we do now.

            regards, tom lane

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Andres Freund

Дата:

19 ноября 2020 г., 01:40:43

Hi,

On 2020-11-18 20:17:55 -0500, Tom Lane wrote:
> I'm not really on board with creating an actual sequence object for each
> toast table.  I don't think they are efficient enough.

I'm not convinced that's true. Sure, nextval() could stand to be cheaper
- but on the other hand, this would allow us to entirely avoid doing
ahead-of-time uniqueness probes. Needing one extra WAL record every
SEQ_LOG_VALS/32 toast insertions seems likely to be win over doing
at least SEQ_LOG_VALS-1 additional uniqueness index probes.

It's not like the oid counter is free of contention - one benefit of
what I suggest would be that the potential contention point would move
from system wide to table wide.

I think reducing the rate of oid counter advancement is a significant
benefit, moving the point of where we have to care about efficiency
problems around system oid assignment costs quite a bit further into the
future.

I think there's also some small potential benefits around indexing in
having densely allocated toast chunk ids for each toast table, over the
sparser allocation we have right now.

Greetings,

Andres Freund

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

tomohiro hiramitsu

Дата:

28 декабря 2020 г., 07:31:31

Hi

> Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
>
> That said, I'd really prefer to see *some* kind of counter that could
> get people noticing this a bit earlier, rather than just bugging them
> in the logfile once it's gone over a threshold. A statistics counter
> maybe, but I guess that'd have to be tracked at a per-table level to
> be really useful, which would make it potentially fairly expensive to
> keep around...

I think the statistics on OID assignments per-table is a good idea, but how about simply logging the event to notify users as the first step?
I think it's easy to implement and helpful for users.

First of all, I would like to provide a way for the user to know that it is taking a long time to generate the OID, and provide an opportunity for the user to deal with this problem with statement_timeout etc.
This is an interim measure, but I think it's better than it is.

Since it is complicated to handle WaitEvent and statistics for each table, I first considered a patch that simply outputs logs, giving priority to avoiding a hang state without outputting logs.

* v01-0001-OID_log_output.patch
In this patch, GetNewOidWithIndex outputs a log when it loops too many times to assign a new OID.
Also, if the log is output even once, it will be output to the log when the OID is assigned.

* This patch does not cause an error even if it loops too many times.
How much time can be tolerated in the OID search loop depends on the user, so I think that the user should decide the time to make an error with statement_timeout.

* The log output interval threshold increases by "* = 2"(exponential backoff) and the maximum number of log outputs is set to 5.
The reason for setting this threshold is to reduce the number of log outputs and output the log before stopping at statement_timeout etc. I think the appropriate value for this threshold is controversial.
Since it is not necessary to keep logging, the upper limit is set to 5 times.

* The log output interval threshold is set to 1 million for the first log output.
In my environment, it takes about 4 seconds to loop 1 million times, and the 5th log is output after about 2 minutes.I think this threshold is just right considering statements_timeout, but it's controversial.

Log output does not need to continue output during OID assignment. I think the first few and the last log are enough.

I will add this patch to the commitfest. I look forward to your feedback about the patch.

By the way
How about distributing OIDs so that the GetNewOidWithIndex function doesn't take long to find an available OID?
For example, skip one and assign an OID.

As a side effect of this method, the OID progresses faster.
(Maybe the btree index will be fragmented faster)

It may be better to include skips only for TOAST chunk_id, as this method may have other side effects.

regards.
--
Tomohiro Hiramitsu
NTT Open Source Software Center

2020年12月21日(月) 16:44 Magnus Hagander <magnus@hagander.net>:

On Wed, Nov 18, 2020 at 8:27 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote:
> > This definitely sounds like something that's worth putting out as a
> > wait event. Even before you start traversing millions of OIDs it might
> > gradually start to show up, and being able to monitor that would
> > definitely be useful.
>
> I don't think this is likely to work well as a wait event. All the index
> traversals etc will do IO, acquire locks, etc, which will overwrite the
> wait event and reset it to nothing once done.

Oh meh. Yeah, I didn't think about the actual implementation of doing it :/

That said, I'd really prefer to see *some* kind of counter that could
get people noticing this a bit earlier, rather than just bugging them
in the logfile once it's gone over a threshold. A statistics counter
maybe, but I guess that'd have to be tracked at a per-table level to
be really useful, which would make it potentially fairly expensive to
keep around...

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Вложения

v1-0001-GetNewOidWithIndex_log_output.patch

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Kasahara Tatsuhito

Дата:

06 января 2021 г., 02:13:35

Hi,

On Mon, Dec 28, 2020 at 4:31 PM tomohiro hiramitsu <hiramit.tm@gmail.com> wrote:
>
> Hi
>
> > Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
> >
> > That said, I'd really prefer to see *some* kind of counter that could
> > get people noticing this a bit earlier, rather than just bugging them
> > in the logfile once it's gone over a threshold.  A statistics counter
> > maybe, but I guess that'd have to be tracked at a per-table level to
> > be really useful, which would make it potentially fairly expensive to
> > keep around...
>
> I think the statistics on OID assignments per-table is a good idea, but how about simply logging the event to notify
usersas the first step? 
> I think it's easy to implement and helpful for users.
+1
It is a simple and quick way for users to know if the OID assignment
is taking a long time or not.
At least it will be better than now.

> First of all, I would like to provide a way for the user to know that it is taking a long time to generate the OID,
andprovide an opportunity for the user to deal with this problem with statement_timeout etc. 
> This is an interim measure, but I think it's better than it is.
>
> Since it is complicated to handle WaitEvent and statistics for each table, I first considered a patch that simply
outputslogs, giving priority to avoiding a hang state without outputting logs. 
>
> * v01-0001-OID_log_output.patch
>    In this patch, GetNewOidWithIndex outputs a log when it loops too many times to assign a new OID.
>    Also, if the log is output even once, it will be output to the log when the OID is assigned.
>
>    * This patch does not cause an error even if it loops too many times.
>      How much time can be tolerated in the OID search loop depends on the user, so I think that the user should
decidethe time to make an error with statement_timeout. 
>
>    * The log output interval threshold increases by "* = 2"(exponential backoff) and the maximum number of log
outputsis set to 5. 
>      The reason for setting this threshold is to reduce the number of log outputs and output the log before stopping
atstatement_timeout etc. I think the appropriate value for this threshold is controversial. 
>      Since it is not necessary to keep logging, the upper limit is set to 5 times.
>
>    * The log output interval threshold is set to 1 million for the first log output.
>      In my environment, it takes about 4 seconds to loop 1 million times, and the 5th log is output after about 2
minutes.Ithink this threshold is just right considering statements_timeout, but it's controversial. 
>
> Log output does not need to continue output during OID assignment. I think the first few and the last log are enough.
>
> I will add this patch to the commitfest. I look forward to your feedback about the patch.
>
>
> By the way
> How about distributing OIDs so that the GetNewOidWithIndex function doesn't take long to find an available OID?
> For example, skip one and assign an OID.
>
> As a side effect of this method, the OID progresses faster.
> (Maybe the btree index will be fragmented faster)
I think the idea of assigning OIDs to sparse is interesting.
However, I think it needs to find out how much it will affect the performance.
It could be a non-negligible overhead, especially for large amounts of
data insertion/updating.

Best regards,

> It may be better to include skips only for TOAST chunk_id, as this method may have other side effects.
>
> regards.
> --
> Tomohiro Hiramitsu
> NTT Open Source Software Center
>
> 2020年12月21日(月) 16:44 Magnus Hagander <magnus@hagander.net>:
>>
>> On Wed, Nov 18, 2020 at 8:27 PM Andres Freund <andres@anarazel.de> wrote:
>> >
>> > Hi,
>> >
>> > On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote:
>> > > This definitely sounds like something that's worth putting out as a
>> > > wait event. Even before you start traversing millions of OIDs it might
>> > > gradually start to show up, and being able to monitor that would
>> > > definitely be useful.
>> >
>> > I don't think this is likely to work well as a wait event. All the index
>> > traversals etc will do IO, acquire locks, etc, which will overwrite the
>> > wait event and reset it to nothing once done.
>>
>> Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
>>
>> That said, I'd really prefer to see *some* kind of counter that could
>> get people noticing this a bit earlier, rather than just bugging them
>> in the logfile once it's gone over a threshold.  A statistics counter
>> maybe, but I guess that'd have to be tracked at a per-table level to
>> be really useful, which would make it potentially fairly expensive to
>> keep around...
>>
>> --
>>  Magnus Hagander
>>  Me: https://www.hagander.net/
>>  Work: https://www.redpill-linpro.com/
>>
>>
>>
>>


--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

tomohiro hiramitsu

Дата:

21 января 2021 г., 12:07:32

On Wed, Jan 6, 2021 at 11:13 AM Kasahara Tatsuhito <kasahara.tatsuhito@gmail.com> wrote:

> By the way
> How about distributing OIDs so that the GetNewOidWithIndex function doesn't take long to find an available OID?
> For example, skip one and assign an OID.
>
> As a side effect of this method, the OID progresses faster.
> (Maybe the btree index will be fragmented faster)
I think the idea of assigning OIDs to sparse is interesting.
However, I think it needs to find out how much it will affect the performance.
It could be a non-negligible overhead, especially for large amounts of
data insertion/updating.

I agree with you. I can't predict the impact of this method on performance.
I may need to do some performance tests to find out the impact of this method.

Thank you for your reply.

regards.
--
Tomohiro Hiramitsu
NTT Open Source Software Center

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Kasahara Tatsuhito

Дата:

18 февраля 2021 г., 11:29:16

Hi,

On Wed, Jan 6, 2021 at 11:13 AM Kasahara Tatsuhito
<kasahara.tatsuhito@gmail.com> wrote:
>
> Hi,
>
> On Mon, Dec 28, 2020 at 4:31 PM tomohiro hiramitsu <hiramit.tm@gmail.com> wrote:
> >
> > Hi
> >
> > > Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
> > >
> > > That said, I'd really prefer to see *some* kind of counter that could
> > > get people noticing this a bit earlier, rather than just bugging them
> > > in the logfile once it's gone over a threshold.  A statistics counter
> > > maybe, but I guess that'd have to be tracked at a per-table level to
> > > be really useful, which would make it potentially fairly expensive to
> > > keep around...
> >
> > I think the statistics on OID assignments per-table is a good idea, but how about simply logging the event to
notifyusers as the first step? 
> > I think it's easy to implement and helpful for users.
> +1
> It is a simple and quick way for users to know if the OID assignment
> is taking a long time or not.
> At least it will be better than now.
>
> > First of all, I would like to provide a way for the user to know that it is taking a long time to generate the OID,
andprovide an opportunity for the user to deal with this problem with statement_timeout etc. 
> > This is an interim measure, but I think it's better than it is.
> >
> > Since it is complicated to handle WaitEvent and statistics for each table, I first considered a patch that simply
outputslogs, giving priority to avoiding a hang state without outputting logs. 
> >
> > * v01-0001-OID_log_output.patch
> >    In this patch, GetNewOidWithIndex outputs a log when it loops too many times to assign a new OID.
> >    Also, if the log is output even once, it will be output to the log when the OID is assigned.
> >
> >    * This patch does not cause an error even if it loops too many times.
> >      How much time can be tolerated in the OID search loop depends on the user, so I think that the user should
decidethe time to make an error with statement_timeout. 
> >
> >    * The log output interval threshold increases by "* = 2"(exponential backoff) and the maximum number of log
outputsis set to 5. 
> >      The reason for setting this threshold is to reduce the number of log outputs and output the log before
stoppingat statement_timeout etc. I think the appropriate value for this threshold is controversial. 
> >      Since it is not necessary to keep logging, the upper limit is set to 5 times.
> >
> >    * The log output interval threshold is set to 1 million for the first log output.
> >      In my environment, it takes about 4 seconds to loop 1 million times, and the 5th log is output after about 2
minutes.Ithink this threshold is just right considering statements_timeout, but it's controversial. 
> >
> > Log output does not need to continue output during OID assignment. I think the first few and the last log are
enough.
> >
> > I will add this patch to the commitfest. I look forward to your feedback about the patch.
I tested the patch.
The patch can be applied with the current HEAD
(eb42110d952f8d1ad4049b8f2491e9dfba75ffed) and the regression test
passed.

I actually tried to reproduce OID assignment conflicts using the
following procedure.
================================================================================================
DROP TABLE IF EXISTS t1;

-- Create a table. SET TOAST_TUPLE_TARGET for generate much TOAST records.
SELECT 'CREATE UNLOGGED TABLE t1 ( c1 int,' || string_agg(str, ',') ||
') WITH (TOAST_TUPLE_TARGET = 128);'
 FROM (SELECT ' c' || i || ' text' as str FROM generate_series(2,21)i )t;\gexec

-- Generate data and insert. This query generate 20 TOAST records per-rows.
-- So this results in 16 million TOAST rows (same as
GETNEWOID_NOTIFICATION_LIMIT).
WITH r AS (SELECT string_agg(str, '') as d FROM (SELECT
md5(clock_timestamp()::text) AS str
 FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT generate_series(1,800000)
,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;

-- Check min(chunk_id) to rewind OID following test
SELECT 'SELECT min(chunk_id) FROM pg_toast.' || relname
  FROM pg_class WHERE oid = (SELECT reltoastrelid FROM pg_class WHERE
relname = 't1');\gexec

-- Check normal(non OID conflicting) insert's duration
\timing
WITH r AS (SELECT string_agg(str, '') as d FROM (SELECT
md5(clock_timestamp()::text) AS str
 FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT 1,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;

 INSERT 0 1
 Time: 0.899 ms

-- Reset OID to the value of min(chunk_id) by using pg_resetwal (-o)
-- Check problematic(OID conflicting) insert's duration
\timing
WITH r AS (SELECT string_agg(str, '') as d FROM (SELECT
md5(clock_timestamp()::text) AS str
 FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT 1,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;

 INSERT 0 1
 Time: 36399.478 ms (00:36.399)
================================================================================================

And I got following log messages.
(The STATEMENT section is long and has been cut)
===================================================================================================
2021-02-18 19:04:59.055 JST [32153] LOG:  failed to assign new OID in
relation "pg_toast_32018451" after 1000000 retries
2021-02-18 19:04:59.055 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:01.299 JST [32153] LOG:  failed to assign new OID in
relation "pg_toast_32018451" after 2000000 retries
2021-02-18 19:05:01.299 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:05.802 JST [32153] LOG:  failed to assign new OID in
relation "pg_toast_32018451" after 4000000 retries
2021-02-18 19:05:05.802 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:14.853 JST [32153] LOG:  failed to assign new OID in
relation "pg_toast_32018451" after 8000000 retries
2021-02-18 19:05:14.853 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:33.185 JST [32153] LOG:  failed to assign new OID in
relation "pg_toast_32018451" after 16000000 retries
2021-02-18 19:05:33.185 JST [32153] STATEMENT:  WITH r ...
2021-02-18 19:05:33.185 JST [32153] LOG:  the new OID has been
assigned in relation "pg_toast_32018451" after 16000021 retries
2021-02-18 19:05:33.185 JST [32153] STATEMENT:  WITH r ...
===================================================================================================

It works as expected, and I think the message is useful.

Here are some comments about the code.

+#define GETNEWOID_NOTIFICATION_INTERVAL_FACTOR 1000000

It seems to be more of a minimum value for conflict detection rather
than an interval factor.
So, it would be better to use another name, such as
GETNEWOID_NOTIFICATION_MINVAL or others.

+        /* retry count and notification limit check */
+        if (retry_count == next_notify && next_notify <=
GETNEWOID_NOTIFICATION_LIMIT)
+        {
+            ereport(LOG,
+                (errmsg("failed to assign new OID in relation \"%s\"
after "UINT64_FORMAT" retries",
+                    RelationGetRelationName(relation), retry_count)));
+            next_notify *= 2; /* double it for the next notification */
+        }
+        retry_count++;

I think that we should add more comments about why output this LOG message.

Best regards,


> >
> >
> > By the way
> > How about distributing OIDs so that the GetNewOidWithIndex function doesn't take long to find an available OID?
> > For example, skip one and assign an OID.
> >
> > As a side effect of this method, the OID progresses faster.
> > (Maybe the btree index will be fragmented faster)
> I think the idea of assigning OIDs to sparse is interesting.
> However, I think it needs to find out how much it will affect the performance.
> It could be a non-negligible overhead, especially for large amounts of
> data insertion/updating.
>
> Best regards,
>
> > It may be better to include skips only for TOAST chunk_id, as this method may have other side effects.
> >
> > regards.
> > --
> > Tomohiro Hiramitsu
> > NTT Open Source Software Center
> >
> > 2020年12月21日(月) 16:44 Magnus Hagander <magnus@hagander.net>:
> >>
> >> On Wed, Nov 18, 2020 at 8:27 PM Andres Freund <andres@anarazel.de> wrote:
> >> >
> >> > Hi,
> >> >
> >> > On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote:
> >> > > This definitely sounds like something that's worth putting out as a
> >> > > wait event. Even before you start traversing millions of OIDs it might
> >> > > gradually start to show up, and being able to monitor that would
> >> > > definitely be useful.
> >> >
> >> > I don't think this is likely to work well as a wait event. All the index
> >> > traversals etc will do IO, acquire locks, etc, which will overwrite the
> >> > wait event and reset it to nothing once done.
> >>
> >> Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
> >>
> >> That said, I'd really prefer to see *some* kind of counter that could
> >> get people noticing this a bit earlier, rather than just bugging them
> >> in the logfile once it's gone over a threshold.  A statistics counter
> >> maybe, but I guess that'd have to be tracked at a per-table level to
> >> be really useful, which would make it potentially fairly expensive to
> >> keep around...
> >>
> >> --
> >>  Magnus Hagander
> >>  Me: https://www.hagander.net/
> >>  Work: https://www.redpill-linpro.com/
> >>
> >>
> >>
> >>
>
>
> --
> Tatsuhito Kasahara
> kasahara.tatsuhito _at_ gmail.com



--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Fujii Masao

Дата:

18 марта 2021 г., 16:32:29


On 2021/03/17 14:30, tomohiro hiramitsu wrote:
> v4-0001-GetNewOidWithIndex_log_output.patch

Thanks for updating the patch! I applied the following and cosmetic changes
to the patch. Attached is the updated version of the patch.

+            ereport(LOG,
+                (errmsg("still finding an unused OID within relation \"%s\"",
+        ereport(LOG,
+            (errmsg("the new OID has been assigned in relation \"%s\" after \"%llu\" retries",

WARNING is more suitable for these kinds of messages?
I changed the log level here to WARNING.

+#define GETNEWOID_NOTIFY_MINVAL 1000000
+#define GETNEWOID_NOTIFY_MAXVAL 100000000

If we use WARNING, GETNEWOID_WARN_THRESHOLD and GETNEWOID_WARN_MAX_INTERVAL
sound better to me for the meanings of them. I renamed them.

+    if (retry_count >= GETNEWOID_NOTIFY_MINVAL)

">=" should be ">"? Otherwise, the message for completion of OID assignment
can be logged even when no message "still finding ..." is output. I fixed this.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Вложения

v5-0001-GetNewOidWithIndex_log_output.patch

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

tomohiro hiramitsu

Дата:

22 марта 2021 г., 05:43:21

On Fri, Mar 19, 2021 at 1:32 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

> Thanks for updating the patch! I applied the following and cosmetic changes
> to the patch. Attached is the updated version of the patch.

Thanks for fixing the patch!
I think your fix is fine.

+#define GETNEWOID_WARN_THRESHOLD 1000000
+#define GETNEWOID_WARN_MAX_INTERVAL 100000000

I think it is better to use the following equation to determine the value of GETNEWOID_WARN_MAX_INTERVAL.

GETNEWOID_WARN_MAX_INTERVAL = 1000000 * 2^n

Example 1:
GETNEWOID_WARN_MAX_INTERVAL = 100000000 /* (Current setting) */

DETAIL: OID candidates were checked "1000000" times, but no unused OID is yet found.
DETAIL: OID candidates were checked "2000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "4000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "8000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "16000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "32000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "64000000" times, but no unused OID is yet found. (*=2) <---GETNEWOID_WARN_MAX_INTERVAL=100000000
DETAIL: OID candidates were checked "164000000" times, but no unused OID is yet found. (+= GETNEWOID_WARN_MAX_INTERVAL)
DETAIL: OID candidates were checked "264000000" times, but no unused OID is yet found. (+= GETNEWOID_WARN_MAX_INTERVAL)
DETAIL: OID candidates were checked "364000000" times, but no unused OID is yet found. (+= GETNEWOID_WARN_MAX_INTERVAL)

Example 2:
GETNEWOID_WARN_MAX_INTERVAL= 128000000 /* (1000000 * 2^7) */

DETAIL: OID candidates were checked "1000000" times, but no unused OID is yet found.
DETAIL: OID candidates were checked "2000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "4000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "8000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "16000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "32000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "64000000" times, but no unused OID is yet found. (*=2)
DETAIL: OID candidates were checked "128000000" times, but no unused OID is yet found. (*=2) <---GETNEWOID_WARN_MAX_INTERVAL=128000000
DETAIL: OID candidates were checked "256000000" times, but no unused OID is yet found. (+= GETNEWOID_WARN_MAX_INTERVAL)
DETAIL: OID candidates were checked "384000000" times, but no unused OID is yet found. (+= GETNEWOID_WARN_MAX_INTERVAL)

I like example 2 better for the notification timing after the threshold is exceeded.

Best regards,
--
Tomohiro Hiramitsu
NTT Open Source Software Center

Вложения

v6-0001-GetNewOidWithIndex_log_output.patch

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

tomohiro hiramitsu

Дата:

22 марта 2021 г., 09:19:33

On Fri, Mar 19, 2021 at 1:32 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> WARNING is more suitable for these kinds of messages?
> I changed the log level here to WARNING.
>
> +#define GETNEWOID_NOTIFY_MINVAL 1000000
> +#define GETNEWOID_NOTIFY_MAXVAL 100000000
>
> If we use WARNING, GETNEWOID_WARN_THRESHOLD and GETNEWOID_WARN_MAX_INTERVAL
> sound better to me for the meanings of them. I renamed them.

When "LOG" is set, SQL statements can be recorded in the server log even if log_min_error_statement is set to "ERROR"(default).
It would be useful for users to know which SQL statement is retrying to generate OIDs.

In this case, GETNEWOID_WARN_THRESHOLD and GETNEWOID_WARN_MAX_INTERVAL should be renamed to the following names.

* GETNEWOID_LOG_THRESHOLD
* GETNEWOID_LOG_MAX_INTERVAL

Best regards,
--
Tomohiro Hiramitsu
NTT Open Source Software Center

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Fujii Masao

Дата:

22 марта 2021 г., 10:53:56

On 2021/03/22 14:43, tomohiro hiramitsu wrote:
> Example 2:
> GETNEWOID_WARN_MAX_INTERVAL= 128000000 /* (1000000 * 2^7) */

This looks better to me.

On 2021/03/22 18:19, tomohiro hiramitsu wrote:
> When "LOG" is set, SQL statements can be recorded in the server log even if log_min_error_statement is set to
"ERROR"(default).
> It would be useful for users to know which SQL statement is retrying to generate OIDs.

This makes sense.

> In this case, GETNEWOID_WARN_THRESHOLD and GETNEWOID_WARN_MAX_INTERVAL should be renamed to the following names.
> 
> * GETNEWOID_LOG_THRESHOLD
> * GETNEWOID_LOG_MAX_INTERVAL

Ok, so I updated the patch that way. Patch attached.
Barring any objection, I will commit this patch

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Вложения

v7-0001-GetNewOidWithIndex_log_output.patch

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

Fujii Masao

Дата:

24 марта 2021 г., 01:40:50


On 2021/03/22 19:53, Fujii Masao wrote:
> Ok, so I updated the patch that way. Patch attached.
> Barring any objection, I will commit this patch

Pushed. Thanks!

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

От

tomohiro hiramitsu

Дата:

24 марта 2021 г., 02:31:56

On Wed, Mar 24, 2021 at 10:40 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2021/03/22 19:53, Fujii Masao wrote:
> Ok, so I updated the patch that way. Patch attached.
> Barring any objection, I will commit this patch

Pushed. Thanks!

Thank you very much!

--
Tomohiro Hiramitsu
NTT Open Source Software Center

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

Вложения

Вложения

Вложения

Вложения