Re: backend crash on DELETE, reproducible locally

Поиск
Список
Период
Сортировка
От
Тема Re: backend crash on DELETE, reproducible locally
Дата
Msg-id 4kp.XAMr.3k5V4ORoB1t.1RtVIt@seznam.cz
обсуждение исходный текст
Ответ на Re: backend crash on DELETE, reproducible locally  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Ответы Re: backend crash on DELETE, reproducible locally  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: backend crash on DELETE, reproducible locally  (Ron <ronljohnsonjr@gmail.com>)
Список pgsql-general
Hello,

we reached the exactly same problem after upgrading to PostgreSQL 11 - the server crashed on a DELETE statement with a trigger. We also observed an AFTER DELETE trigger receiving NULL values in OLD. Now I see the problem seems to be solved (theoretically). Unfortunately, we are not able to build the server with the patch, so we cannot confirm that. However, when we just copied the database (within the same server), the same DELETE executed on the copy with no problems.

I would like to ask, however: could the same problem arise from an UPDATE statement (also on a table with an AFTER trigger), or would that be another bug (technical details below)?
As the bug causes our production servers segfault several times a day, we'd like to solve the problem as soon as possible. Do I understand it correctly that if we dump and restore the database, the bug should not occur (until the next ALTER TABLE ADD COLUMN - which we'll avoid until the patch is released)?

With the update, we caught this (we also have the core dump and could provide it privately if useful):

#0  __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:793
No locals.
#1  0x00007f6a6b993296 in datumIsEqual (value1=7, value2=7, typByVal=<optimized out>, typLen=16) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/utils/adt/datum.c:249
        size1 = 16
        size2 = <optimized out>
        s1 = 0x7 <error: Cannot access memory at address 0x7>
        s2 = 0x7 <error: Cannot access memory at address 0x7>
        res = <optimized out>
#2  0x00007f6a6b6b89de in ProjIndexIsUnchanged (newtup=0x7f6a6d504e80, oldtup=0x7ffc9c1c1270, relation=0x7f6a6b4c1738) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/access/heap/heapam.c:4539
        att = <optimized out>
        indexOid = <optimized out>
        indexDesc = 0x7f6a6b4c5008
        indexInfo = 0x7f6a6d505ca8
        i = 0
        indexoidlist = <optimized out>
        econtext = 0x7f6a6d4c5080
        new_isnull = {false, false, false, 156, 252, 127, false, false, 44, 29, 128, 107, 106, 127, false, false, 116, 54, 76, 107, 106, 127, false, false, 123, false, false, false, false, false, false, false}
        slot = 0x7f6a6d507b78
        equals = true
        old_isnull = {false, false, false, 109, 106, 127, false, false, 165, 199, 80, 109, 106, 127, false, false, 160, 199, 80, 109, 106, 127, false, false, 168, 199, 80, 109, 106, 127, false, false}
        new_values = {7, 4, 140095077240522, 16, 140722927572064, 140095077271456, 140095077272720, 140722927572079, 0, 0, 140722927572048, 140095054460096, 140095077060032, 140095077242520, 5, 4, 140722927572096, 140095046814123, 
          140722927572096, 140095046825162, 140095043410520, 140095077060880, 140722927572224, 140095046865098, 60129542543, 23274985272, 140095043410520, 140095077060032, 140095043409720, 140095077060304, 140095077237568, 
          140095043414504}
        indexno = 1
        l = 0x7f6a6d50d180
        estate = 0x7f6a6d4c4e70
        old_values = {7, 5, 140078657614882, 1, 140095077060032, 1024, 140095077271456, 0, 140722927571760, 140095049517261, 140722927571808, 140722927571776, 140722927571792, 140095046814329, 18, 140095077271464, 140095076986224, 
          140095077237568, 22, 100, 140722927571856, 140722927571920, 140095046831813, 140722927571856, 140095046842624, 0, 140095077265656, 3, 140095077271456, 140095077271792, 140095077271456, 140095077242520}
#3  heap_update (relation=relation@entry=0x7f6a6b4c1738, otid=otid@entry=0x7ffc9c1c1690, newtup=newtup@entry=0x7f6a6d504e80, cid=0, crosscheck=<optimized out>, wait=wait@entry=true, hufd=hufd@entry=0x7ffc9c1c15b0, 
    lockmode=lockmode@entry=0x7ffc9c1c15a4) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/access/heap/heapam.c:4230
        result = <optimized out>
        xid = 245028971
        hot_attrs = 0x7f6a6d50d0a0
        proj_idx_attrs = 0x7f6a6d50d0b8
        key_attrs = 0x7f6a6d50d0d0
        id_attrs = 0x7f6a6d50d0e8
        interesting_attrs = 0x7f6a6d50d100
        lp = <optimized out>
        oldtup = {t_len = 81, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 9}, ip_posid = 71}, t_tableOid = 65471913, t_data = 0x7f669aa0b3f0}
        heaptup = 0x7f6a6d504e80
        old_key_tuple = 0x0
        old_key_copied = false
        page = 0x7f669aa0b180 "=="
        block = <optimized out>
        mxact_status = <optimized out>
        buffer = <optimized out>
        newbuf = 17858
        vmbuffer = 0
        vmbuffer_new = 0
        need_toast = <optimized out>
        newtupsize = <optimized out>
        pagefree = <optimized out>
        have_tuple_lock = false
        iscombo = false
        use_hot_update = false
        hot_attrs_checked = <optimized out>
        key_intact = <optimized out>
        all_visible_cleared = false
        all_visible_cleared_new = false
        checked_lockers = <optimized out>
        locker_remains = <optimized out>
        xmax_new_tuple = <optimized out>
        xmax_old_tuple = 245028971
        infomask_old_tuple = 0
        infomask2_old_tuple = 8192
        infomask_new_tuple = 144
        infomask2_new_tuple = 0
        __func__ = "heap_update"
#4  0x00007f6a6b82f2c9 in ExecUpdate (mtstate=mtstate@entry=0x7f6a6d4da3f0, tupleid=0x7ffc9c1c1690, oldtuple=0x0, slot=0x7f6a6d504340, planSlot=planSlot@entry=0x7f6a6d5685e8, epqstate=epqstate@entry=0x7f6a6d4da4b0,
    estate=estate@entry=0x7f6a6d4d8ed0, canSetTag=true) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/nodeModifyTable.c:1209
        lockmode = LockTupleNoKeyExclusive
        partition_constraint_failed = false
        tuple = 0x7f6a6d504e80
        resultRelInfo = 0x7f6a6d4d9110
        resultRelationDesc = 0x7f6a6b4c1738
        result = <optimized out>
        hufd = {ctid = {ip_blkid = {bi_hi = 5616, bi_lo = 39964}, ip_posid = 32764}, xmax = 1803594893, cmax = 32618}
        recheckIndexes = 0x0
        saved_tcs_map = 0x0
        __func__ = "ExecUpdate"
#5  0x00007f6a6b82f978 in ExecModifyTable (pstate=0x7f6a6d4da3f0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/nodeModifyTable.c:2166
        node = 0x7f6a6d4da3f0
        proute = 0x0
        estate = 0x7f6a6d4d8ed0
        operation = CMD_UPDATE
        saved_resultRelInfo = 0x0
        resultRelInfo = 0x7f6a6d4d9110
        subplanstate = 0x7f6a6d5679f8
        junkfilter = 0x7f6a6d504218
        slot = <optimized out>
        tupleid = <optimized out>
        tuple_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 9}, ip_posid = 71}
        oldtupdata = {t_len = 2619086544, t_self = {ip_blkid = {bi_hi = 32764, bi_lo = 0}, ip_posid = 9204}, t_tableOid = 32618, t_data = 0x7ffc9c1c16d0}
        oldtuple = <optimized out>
        __func__ = "ExecModifyTable"
#6  0x00007f6a6b80b25b in ExecProcNode (node=0x7f6a6d4da3f0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/include/executor/executor.h:237
No locals.
#7  ExecutePlan (execute_once=<optimized out>, dest=0x7f6a6d5ed298, direction=<optimized out>, numberTuples=0, sendTuples=false, operation=CMD_UPDATE, use_parallel_mode=<optimized out>, planstate=0x7f6a6d4da3f0, estate=0x7f6a6d4d8ed0)
    at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/execMain.c:1723
        slot = <optimized out>
        current_tuple_count = 0
#8  standard_ExecutorRun (queryDesc=0x7f6a6d5f5b80, direction=<optimized out>, count=0, execute_once=<optimized out>) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/execMain.c:364
        estate = 0x7f6a6d4d8ed0
        operation = CMD_UPDATE
        dest = 0x7f6a6d5ed298
        sendTuples = <optimized out>
        __func__ = "standard_ExecutorRun"
#9  0x00007f6a6b95b41d in ProcessQuery (plan=<optimized out>, 
    sourceText=0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"..., 
    params=0x0, queryEnv=0x0, dest=0x7f6a6d5ed298, completionTag=0x7ffc9c1c19d0 "") at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/pquery.c:161
        queryDesc = 0x7f6a6d5f5b80
#10 0x00007f6a6b95b668 in PortalRunMulti (portal=portal@entry=0x7f6a6d47f230, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x7f6a6d5ed298, altdest=altdest@entry=0x7f6a6d5ed298, 
    completionTag=completionTag@entry=0x7ffc9c1c19d0 "") at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/pquery.c:1286
        pstmt = 0x7f6a6d5ed138
        active_snapshot_set = true
        stmtlist_item = 0x7f6a6d5ed248
#11 0x00007f6a6b95c2d5 in PortalRun (portal=portal@entry=0x7f6a6d47f230, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x7f6a6d5ed298, 
    altdest=altdest@entry=0x7f6a6d5ed298, completionTag=completionTag@entry=0x7ffc9c1c19d0 "") at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/pquery.c:799
        save_exception_stack = 0x7ffc9c1c1c00
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {140722927574016, 7153736072036697610, 140095075869696, 140722927573456, 140095076692528, 140095075869744, 7153736071992657418, 7091693995611473418}, __mask_was_saved = 0, __saved_mask = {__val = {
                0, 140723125634669, 0, 140095051070678, 64, 112, 140095075869696, 140095075861648, 140095078191720, 140722927573328, 140095049517496, 140722927573360, 2, 140095075869696, 2, 140722927573360}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x7f6a6d4182c8
        saveTopTransactionContext = 0x7f6a6d490150
        saveActivePortal = 0x0
        saveResourceOwner = 0x7f6a6d4182c8
        savePortalContext = 0x0
        saveMemoryContext = 0x7f6a6d490150
        __func__ = "PortalRun"
#12 0x00007f6a6b957f7f in exec_simple_query (
    query_string=0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"...) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/postgres.c:1122
        parsetree = 0x7f6a6d3b6400
        portal = 0x7f6a6d47f230
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = "\000\032\034\234\374\177\000\000\341\356\247kj\177\000\000\000\000\000\000\000\000\000\000\060\316\367kj\177\000\000\340&\360kj\177\000\000\234U\357kj\177\000\000\240E;mj\177\000\000Q\000\000\000\000\000\000"
        querytree_list = <optimized out>
        plantree_list = <optimized out>
        receiver = 0x7f6a6d5ed298
        format = 0
        dest = DestRemote
        parsetree_list = 0x7f6a6d3b6450
        parsetree_item = 0x7f6a6d3b6430
        save_log_statement_stats = false
        was_logged = false
        use_implicit_block = false
        msec_str = "\000\032\034\234\374\177\000\000\341\356\247kj\177\000\000\000\000\000\000\000\000\000\000\060\316\367kj\177\000"
        __func__ = "exec_simple_query"
#13 0x00007f6a6b95930a in PostgresMain (argc=<optimized out>, argv=argv@entry=0x7f6a6d400220, dbname=0x7f6a6d4001b0 "mydb", username=<optimized out>)
    at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/postgres.c:4159
        query_string = 0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"...
        firstchar = 81
        input_message = {
          data = 0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"..., len = 319, maxlen = 1024, cursor = 319}
        local_sigjmp_buf = {{__jmpbuf = {140095054460088, 7153736071902479882, 140095054460096, 1, 0, 140095076133088, 7153736071933937162, 7091693993451144714}, __mask_was_saved = 1, __saved_mask = {__val = {0, 140095076133088, 
                140095004765576, 844424930131970, 206158430256, 140722927574360, 140722927574144, 293, 140095076133088, 0, 0, 140095075861648, 1024, 140722927574468, 0, 140722927574240}}}}
        send_ready_for_query = false
        disable_idle_in_transaction_timeout = false
        __func__ = "PostgresMain"
#14 0x00007f6a6b67867d in BackendRun (port=0x7f6a6d3f68e0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:4361
        ac = 1
        secs = 594476990
        usecs = 933865
        i = 1
        av = 0x7f6a6d400220
        maxac = <optimized out>
#15 BackendStartup (port=0x7f6a6d3f68e0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:4033
        bn = <optimized out>
        pid = <optimized out>
#16 ServerLoop () at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:1706
        rmask = {fds_bits = {8, 0 <repeats 15 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {200, 0 <repeats 15 times>}}
        last_lockfile_recheck_time = 1541161751
        last_touch_time = 1541161147
        __func__ = "ServerLoop"
#17 0x00007f6a6b8e7644 in PostmasterMain (argc=5, argv=<optimized out>) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:1379
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = true
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#18 0x00007f6a6b679616 in main (argc=5, argv=0x7f6a6d3af1f0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/main/main.c:228
No locals.


Best regards, 
Ondřej Bouda


---------- Původní e-mail ----------
Od: Tom Lane <tgl@sss.pgh.pa.us>
Komu: Karsten Hilbert <Karsten.Hilbert@gmx.net>
Datum: 3. 11. 2018 4:57:19
Předmět: Re: backend crash on DELETE, reproducible locally
Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
> On Thu, Nov 01, 2018 at 11:43:56AM -0400, Tom Lane wrote:
>> Yeah, apparently we've passed a null OLD tuple to an RI_FKey_cascade_del
>> trigger, which surely shouldn't happen. It'd be interesting to look at
>> the set of triggers on this table. I don't entirely trust psql's \d
>> to show us reality if there's something screwed up about the triggers,
>> so in addition to \d output, could we see
>> select * from pg_trigger where tgrelid = 'ref.auto_hint'::regclass;

> [ pretty normal-looking trigger entries ]

I was feeling baffled about this, but it suddenly occurs to me that maybe
the bug fixed in 040a1df61/372102b81 explains this. If the particular
tuple you're trying to delete predates the last ALTER TABLE ADD COLUMN
operation on the table, then this bug would result in t_self getting
set to "invalid", which'd cause AfterTriggerSaveEvent to save "ate_ctid1"
as "invalid", which'd lead to a null tuple getting passed when the trigger
eventually gets invoked.

regards, tom lane

В списке pgsql-general по дате отправления:

Предыдущее
От: Karsten Hilbert
Дата:
Сообщение: Re: backend crash on DELETE, reproducible locally
Следующее
От: Tom Lane
Дата:
Сообщение: Re: backend crash on DELETE, reproducible locally