Re: BUG #12718: Apparent LWlock Problem Perhaps With Page Cleanup

Поиск
Список
Период
Сортировка
От Charles R. Harwood
Тема Re: BUG #12718: Apparent LWlock Problem Perhaps With Page Cleanup
Дата
Msg-id 54CFB36E.9080608@zuerchertech.com
обсуждение исходный текст
Ответ на Re: BUG #12718: Apparent LWlock Problem Perhaps With Page Cleanup  (Peter Geoghegan <pg@heroku.com>)
Список pgsql-bugs
Yes.  It was pg_upgrad-ed from 9.2 almost a year ago (2014-02-27).
Unfortunately, I do not have the 9.2 data anymore.

It was pg_upgraded with the binaries in postgresql 9.3.1 which I
understand were affected by a bug related to leaving pg_multixact files
behind as per
http://www.postgresql.org/docs/9.4/static/release-9-3-5.html.  But the
query associated with the offsets/0000 bug, if I understand it, doesn't
seem to apply because offset 0000 doesn't exist in my case and therefore
the associated query returns False.

I'm afraid my grasp of MultiXacts isn't up to the task of understand
exactly what's going on.  By the time this issue is resolved I'm sure
I'll understand them better than I ever wanted to.  I'll attempt to
supply the same information as was requested in the referenced thread.



Here is the pg_controldata output:

pg_control version number:            937
Catalog version number:               201306121
Database system identifier:           5985061615841541596
Database cluster state:               in production
pg_control last modified:             Mon 02 Feb 2015 10:50:00 AM CST
Latest checkpoint location:           43A/607AD468
Prior checkpoint location:            43A/5DBA9040
Latest checkpoint's REDO location:    43A/5FF7A3C8
Latest checkpoint's REDO WAL file:    000000010000043A0000005F
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/487855284
Latest checkpoint's NextOID:          134556810
Latest checkpoint's NextMultiXactId:  1183038
Latest checkpoint's NextMultiOffset:  2584953
Latest checkpoint's oldestXID:        289465105
Latest checkpoint's oldestXID's DB:   16414
Latest checkpoint's oldestActiveXID:  487855284
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 16414
Time of latest checkpoint:            Mon 02 Feb 2015 10:47:30 AM CST
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            hot_standby
Current max_connections setting:      1000
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   128
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0



leds=# SELECT datname, datfrozenxid, datminmxid FROM pg_database WHERE
datname = 'leds';
 datname | datfrozenxid | datminmxid
---------+--------------+------------
 leds    |    289465105 |          1
(1 row)



leds=# SELECT oid::regclass, relnamespace, relfrozenxid, relminmxid FROM
pg_class WHERE relname = 'addresses' AND relnamespace = 2200;
    oid    | relnamespace | relfrozenxid | relminmxid
-----------+--------------+--------------+------------
 addresses |         2200 |    354953499 |     261502



The oldest file in the cluster's pg_multixact/offsets directory is 0003
which as a modify time within an hour of pg_upgrade on 2014-02-27.



From where I was previously stuck, I shutdown the cluster "-m immediate"
and started it backup.  I disabled the script that was aggressively
updating the addresses table but have left autovacuum on.  My
understanding is that vacuuming the wrong pages will result in the same
sort of LWLock deadlock as occurred before.  Unfortunately this database
has fairly high uptime requirements, however I have a recent
pg_basebackup I can operate on which is exhibiting the same behavior.

Here is my current butchered 'understanding' of the problem:  There is
an incrementing value (MultiXactId) that tracks which process have which
locks on which tuples.  Since it's a finite (32bit in fact), it has to
wrap around eventually.  This requires something monitor where it is and
was and will be to handle wraparound cases and this job is given to
VACUUM.  Each table knows the oldest id that is has (relminmxid?) and as
VACUUM does its job, it updates keeps these up to date including
properly wrapping them around when necessary.  Somehow during pg_upgrade
the state of the multixactid state isn't fully transferred to the new
cluster (possibly due to known and documented bugs, and possibly because
of a yet unfound bug), and a cluster in that state is bound to run into
trouble when VACUUM discovers numbers that don't make sense.

I'll read up on what I can.  There seem to be plenty of relevant posts.
http://www.postgresql.org/message-id/5464A838.5070705@innogames.de

Let me know if I can supply any more information.  It sounds like there
are two fixes:
1.) Somehow set the values correctly which probably requries having a
copy of the pre-upgraded data around (I don't).
2.) Dump the database, reset the variables and reload the data.

Thanks
-Charles


I promised backtraces and here they are:
=============================================
This is the autovacuum process: (pid 24718)
=============================================

0x00007f5b1476e633 in select () from /lib/x86_64-linux-gnu/libc.so.6
#0  0x00007f5b1476e633 in select () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x00007f5b16a11dde in pg_usleep (microsec=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/port/pgsleep.c:43
        delay = {tv_sec = 0, tv_usec = 449}
#2  0x00007f5b1674db6c in GetMultiXactIdMembers (allow_old=<optimized
out>, members=0x7ffff69aa3a0, multi=197438) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/transam/multixact.c:1261
        nextMXOffset = <optimized out>
        pageno = <optimized out>
        truelength = <optimized out>
        i = <optimized out>
        oldestMXact = <optimized out>
        nextMXact = <optimized out>
        ptr = <optimized out>
        entryno = <optimized out>
        length = <optimized out>
        tmpMXact = <optimized out>
        nextOffset = 4137329568
        prev_pageno = 96
        slotno = <optimized out>
        offptr = <optimized out>
        offset = 0
#3  GetMultiXactIdMembers (multi=197438, members=0x7ffff69aa3a0,
allow_old=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/transam/multixact.c:1100
        length = 197439
#4  0x00007f5b1671fa08 in FreezeMultiXactId (flags=<synthetic pointer>,
cutoff_multi=<optimized out>, cutoff_xid=435461386,
t_infomask=<optimized out>, multi=197438) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:5379
        members = <optimized out>
        nmembers = <optimized out>
        has_lockers = <optimized out>
        update_committed = <optimized out>
        xid = 0
        i = <optimized out>
        need_replace = <optimized out>
        nnewmembers = <optimized out>
        newmembers = <optimized out>
        update_xid = <optimized out>
#5  heap_prepare_freeze_tuple (tuple=0x7f59fd6f9240,
cutoff_xid=435461386, cutoff_multi=<optimized out>, frz=0x7f5b17620b6c)
at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:5617
        flags = 0
        changed = 0 '\000'
        freeze_xmax = 0 '\000'
        xid = 197438
#6  0x00007f5b168221aa in lazy_scan_heap (scan_all=0 '\000', nindexes=8,
Irel=<optimized out>, vacrelstats=<optimized out>,
onerel=0x7f5b166002b0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuumlazy.c:899
        buf = 347988
        page = 0x7f59fd6f8b80 "6\004"
        offnum = <optimized out>
        maxoff = 37
        hastup = 1 '\001'
        nfrozen = <optimized out>
        freespace = <optimized out>
        tupgone = 0 '\000'
        prev_dead_count = 0
        all_visible_according_to_vm = 0 '\000'
        all_visible = 0 '\000'
        has_dead_tuples = <optimized out>
        visibility_cutoff_xid = 0
        num_tuples = 2761072
        indstats = 0x7f5b17607aa0
        i = <optimized out>
        ru0 = {tv = {tv_sec = 1422574761, tv_usec = 520665}, ru =
{ru_utime = {tv_sec = 0, tv_usec = 524894}, ru_stime = {tv_sec = 0,
tv_usec = 302987}, ru_maxrss = 1077356, ru_ixrss = 0, ru_idrss = 0,
ru_isrss = 0, ru_minflt = 270000, ru_majflt = 0, ru_nswap = 0,
            ru_inblock = 0, ru_oublock = 112, ru_msgsnd = 0, ru_msgrcv =
0, ru_nsignals = 0, ru_nvcsw = 2090, ru_nivcsw = 6}}
        frozen = 0x7f5b17620b60
        blkno = <optimized out>
        empty_pages = <optimized out>
        vacuumed_pages = <optimized out>
        tups_vacuumed = 0
        nkeep = 593975
        next_not_all_visible_block = <optimized out>
        skipping_all_visible_blocks = 0 '\000'
        nblocks = 98427
        relname = 0x7f5b16601d30 "addresses"
        nunused = 0
        vmbuffer = 183145
#7  lazy_vacuum_rel (onerel=0x7f5b166002b0, vacstmt=<optimized out>,
bstrategy=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuumlazy.c:236
        vacrelstats = <optimized out>
        Irel = 0x7f5b17624ac8
        nindexes = 8
        possibly_freeable = <optimized out>
        ru0 = {tv = {tv_sec = 140029210944257, tv_usec =
140029198239941}, ru = {ru_utime = {tv_sec = 140029211069184, tv_usec =
140029211068928}, ru_stime = {tv_sec = 844429225099264, tv_usec =
429496729600}, ru_maxrss = 140029197707696, ru_ixrss = 281483566645432,
            ru_idrss = 140029194147330, ru_isrss = 0, ru_minflt =
140737330718495, ru_majflt = 140029194142448, ru_nswap =
140029162548016, ru_inblock = 4, ru_oublock = 140737330718192, ru_msgsnd
= 140029194142384, ru_msgrcv = 7, ru_nsignals = 0,
            ru_nvcsw = 140029194147408, ru_nivcsw = 8}}
        starttime = 0
        secs = <optimized out>
        usecs = <optimized out>
        read_rate = <optimized out>
        write_rate = <optimized out>
        scan_all = 0 '\000'
        scanned_all = <optimized out>
        xidFullScanLimit = 335823296
        mxactFullScanLimit = 4146143161
        new_rel_pages = <optimized out>
        new_rel_tuples = <optimized out>
        new_rel_allvisible = <optimized out>
        new_frozen_xid = <optimized out>
        new_min_multi = <optimized out>
        __func__ = "lazy_vacuum_rel"
#8  0x00007f5b1681fc75 in vacuum_rel (relid=17441,
vacstmt=0x7ffff69aabb0, do_toast=0 '\000', for_wraparound=0 '\000') at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuum.c:1288
        lmode = 4
        onerel = 0x7f5b166002b0
        onerelid = {relId = 17441, dbId = 16414}
        toast_relid = 0
        save_userid = <optimized out>
        save_sec_context = 0
        save_nestlevel = <optimized out>
        __func__ = "vacuum_rel"
#9  0x00007f5b16820b05 in vacuum (vacstmt=0x7ffff69aabb0,
relid=<optimized out>, do_toast=0 '\000', bstrategy=<optimized out>,
for_wraparound=0 '\000', isTopLevel=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuum.c:237
        relid = 17441
        cur = 0x7f5b177d0460
        save_exception_stack = 0x7ffff69aa9e0
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {140029202659720,
1004169673111825598, 17441, 140029210764256, 20, 200,
1004169673157962942, 911493825480551614}, __mask_was_saved = 0,
__saved_mask = {__val = {0, 140737330718960, 140737331063969, 0,
140029211147472, 8432298256,
                140029055545016, 140737330719728, 2, 140737330718992,
140029197856355, 1422574761, 520095, 140737330719056, 140029197018826,
140737330719040}}}}
        stmttype = <optimized out>
        in_outer_xact = 0 '\000'
        use_own_xacts = 1 '\001'
        relations = 0x7f5b177d0480
#10 0x00007f5b168b6bfa in autovacuum_do_vac_analyze
(bstrategy=0x7f5b17637a60, tab=0x7f5b17637c80) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:2811
        vacstmt = {type = T_VacuumStmt, options = 33, freeze_min_age =
50000000, freeze_table_age = 150000000, relation = 0x7ffff69aab70,
va_cols = 0x0, multixact_freeze_min_age = 5000000,
multixact_freeze_table_age = 150000000}
        rangevar = {type = T_Invalid, catalogname = 0x0, schemaname =
0x7f5b17637cd0 "public", relname = 0x7f5b17637a40 "addresses", inhOpt =
INH_NO, relpersistence = 0 '\000', alias = 0x0, location = -1}
#11 do_autovacuum () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:2329
        save_exception_stack = 0x7ffff69aacf0
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {17441, 1004169673302666430,
5000000, 140029211147120, 20, 200, 1004169673118117054,
911493888709422270}, __mask_was_saved = 0, __saved_mask = {__val =
{140029202655000, 140029202657676, 140029202659752, 140029193999920, 1,
                140737330719600, 140029198332043, 0, 8606777346,
140029209619712, 0, 140737330719984, 140029209619712, 140029202345472,
1004169673048911038, 140029202657676}}}}
        tab = 0x7f5b17637c80
        skipit = 0 '\000'
        stdVacuumCostDelay = 0
        stdVacuumCostLimit = 200
        iter = <optimized out>
        relid = 17441
        classRel = 0x7f5b165d2430
        tuple = <optimized out>
        relScan = <optimized out>
        dbForm = <optimized out>
        table_oids = <optimized out>
        ctl = {num_partitions = 0, ssize = 0, dsize = 0, max_dsize = 0,
ffactor = 0, keysize = 4, entrysize = 64, hash = 0x7f5b169e9950
<oid_hash>, match = 0, keycopy = 0, alloc = 0, hcxt = 0x0, hctl = 0x0}
        table_toast_map = 0x7f5b17638390
        cell = 0x7f5b17637a20
        shared = 0x7f5b176089f8
        dbentry = 0x7f5b17608c80
        bstrategy = 0x7f5b17637a60
        key = {sk_flags = 0, sk_attno = 17, sk_strategy = 3, sk_subtype
= 0, sk_collation = 100, sk_func = {fn_addr = 0x7f5b16931a50 <chareq>,
fn_oid = 61, fn_nargs = 2, fn_strict = 1 '\001', fn_retset = 0 '\000',
fn_stats = 2 '\002', fn_extra = 0x0,
            fn_mcxt = 0x7f5b174c2bf0, fn_expr = 0x0}, sk_argument = 116}
        pg_class_desc = 0x7f5b17636380
        __func__ = "do_autovacuum"
#12 0x00007f5b168b70f6 in AutoVacWorkerMain (argc=<optimized out>,
argv=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:1679
        dbname =

"leds\000\000\000\000I\245\230\025[\177\000\000\000\000\000\000\000\000\000\000È\213\026[\177\000\000\060\n\000\366\377\177\000\000
\227\243\024[\177\000\000
\256\232\366\377\177\000\000\034\020\214\026[\177\000"
        local_sigjmp_buf = {{__jmpbuf = {0, 1004169673227168958, 4,
140029209964800, 140029202359232, 24717, 1004169673300569278,
911493888971435198}, __mask_was_saved = 1, __saved_mask = {__val =
{18446744066192964103, 12, 0, 0, 140737330720112, 140737330720208,
                140029161902022, 140029165031304, 140737330720144,
140029162696144, 140029199346469, 140029165026144, 0, 140029194442432,
140029161901637, 140029194442432}}}}
        dbid = 16414
#13 0x00007f5b168b71ba in StartAutoVacWorker () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:1464
        worker_pid = 0
        __func__ = "StartAutoVacWorker"
#14 0x00007f5b168c46d5 in StartAutovacuumWorker () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:5070
        bn = 0x7f5b17517100
#15 sigusr1_handler (postgres_signal_arg=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:4725
        save_errno = 4
        __func__ = "sigusr1_handler"
#16 <signal handler called>
No symbol table info available.
#17 0x00007f5b1476e633 in select () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#18 0x00007f5b168c3943 in ServerLoop () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1546
        timeout = {tv_sec = 59, tv_usec = 882764}
        rmask = {fds_bits = {776, 0 <repeats 15 times>}}
        selres = <optimized out>
        readmask = {fds_bits = {776, 0 <repeats 15 times>}}
        nSockets = 10
        now = <optimized out>
        last_touch_time = 1422573909
#19 PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1253
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = <optimized out>
        i = <optimized out>
        __func__ = "PostmasterMain"
#20 0x00007f5b166f8687 in main (argc=5, argv=0x7f5b174c11a0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/main/main.c:226






======================================
Here is the process (2900) that was doing the heavy writing/many
transactions:
======================================
#0  0x00007f5b147773e7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x00007f5b168b40c2 in PGSemaphoreLock (sema=0x7f5b0e171550,
interruptOK=0 '\000') at pg_sema.c:421
        errStatus = <optimized out>
        sops = {sem_num = 5, sem_op = -1, sem_flg = 0}
        __func__ = "PGSemaphoreLock"
#2  0x00007f5b168fb46b in LWLockAcquire (lockid=696144, mode=LW_SHARED)
at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/storage/lmgr/lwlock.c:480
        mustwait = 1 '\001'
        lock = 0x7f594d96da80
        proc = 0x7f5b0e171540
        retry = 0 '\000'
        extraWaits = 0
        __func__ = "LWLockAcquire"
#3  0x00007f5b1672b1c3 in index_fetch_heap (scan=0x7f5b17693f50) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/index/indexam.c:528
        tid = 0x7f5b17693fa4
        all_dead = 0 '\000'
        got_heap_tuple = <optimized out>
#4  0x00007f5b1672b2ce in index_getnext (scan=0x7f5b17693f50,
direction=ForwardScanDirection) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/index/indexam.c:612
        heapTuple = <optimized out>
        tid = <optimized out>
#5  0x00007f5b1683b0e5 in IndexNext (node=0x7f5b176920c0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/nodeIndexscan.c:78
        estate = <optimized out>
        econtext = 0x7f5b176921d0
        direction = ForwardScanDirection
        scandesc = 0x7f5b17693f50
        tuple = <optimized out>
        slot = 0x7f5b17693430
#6  0x00007f5b16830806 in ExecScanFetch (recheckMtd=0x7f5b1683b050
<IndexRecheck>, accessMtd=0x7f5b1683b090 <IndexNext>,
node=0x7f5b176920c0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execScan.c:82
        estate = <optimized out>
#7  ExecScan (node=0x7f5b176920c0, accessMtd=0x7f5b1683b090 <IndexNext>,
recheckMtd=0x7f5b1683b050 <IndexRecheck>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execScan.c:167
        slot = 0x7f5b176920c0
        econtext = 0x7f5b176921d0
        qual = 0x0
        projInfo = 0x7f5b17693880
        isDone = 32603
        resultSlot = <optimized out>
#8  0x00007f5b168295a8 in ExecProcNode (node=0x7f5b176920c0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execProcnode.c:404
        result = <optimized out>
        __func__ = "ExecProcNode"
#9  0x00007f5b1683fa0d in ExecModifyTable (node=0x7f5b176c8058) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/nodeModifyTable.c:918
        estate = 0x7f5b176d4140
        operation = CMD_UPDATE
        saved_resultRelInfo = 0x0
        resultRelInfo = 0x7f5b176d42d0
        subplanstate = 0x7f5b176920c0
        junkfilter = 0x7f5b17695418
        slot = <optimized out>
        planSlot = <optimized out>
        tupleid = 0x0
        tuple_ctid = {ip_blkid = {bi_hi = 16704, bi_lo = 5997}, ip_posid
= 32603}
        oldtuple = 0x0
        __func__ = "ExecModifyTable"
#10 0x00007f5b168295f8 in ExecProcNode (node=0x7f5b176c8058) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execProcnode.c:377
        result = <optimized out>
        __func__ = "ExecProcNode"
#11 0x00007f5b16826907 in ExecutePlan (dest=0x7f5b1768a998,
direction=<optimized out>, numberTuples=0, sendTuples=0 '\000',
operation=CMD_UPDATE, planstate=0x7f5b176c8058, estate=0x7f5b176d4140)
    at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execMain.c:1473
        slot = <optimized out>
        current_tuple_count = 0
#12 standard_ExecutorRun (queryDesc=0x7f5b176b18b0, direction=<optimized
out>, count=0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execMain.c:307
        estate = 0x7f5b176d4140
        operation = CMD_UPDATE
        dest = 0x7f5b1768a998
        sendTuples = <optimized out>
        oldcontext = 0x7f5b176403f0
#13 0x00007f5b1690ba44 in ProcessQuery (plan=0x7f5b1768a8b8,
    sourceText=0x7f5b174db4c0 "\n\t\t\t\t  UPDATE addresses\n\t\t\t\t
SET num_type = NULL,\n\t\t\t\t\t  num_1 = E'5262',\n\t\t\t\t\t  num_2 =
NULL,\n\t\t\t\t\t  street = E'5262 240TH ST',\n\t\t\t\t\t  street_1 =
E'240TH ST',\n\t\t\t\t\t  street_2 = NULL,\n\t\t\t\t\t  unit_typ"...,
params=<optimized out>, dest=0x7f5b1768a998,
completionTag=0x7ffff69ab230 "") at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/pquery.c:185
        queryDesc = 0x7f5b176b18b0
        __func__ = "ProcessQuery"
#14 0x00007f5b1690bc7b in PortalRunMulti (portal=0x7f5b17608030,
isTopLevel=1 '\001', dest=0x7f5b1768a998, altdest=0x7f5b1768a998,
completionTag=0x7ffff69ab230 "") at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/pquery.c:1279
        pstmt = 0x7f5b1768a8b8
        stmt = 0x7f5b1768a8b8
        active_snapshot_set = 1 '\001'
        stmtlist_item = 0x7f5b1768a948
#15 0x00007f5b1690c94d in PortalRun (portal=0x7f5b17608030,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x7f5b1768a998,
altdest=0x7f5b1768a998, completionTag=0x7ffff69ab230 "") at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/pquery.c:816
        save_exception_stack = 0x7ffff69ab110
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {140029209727360,
1004169673422204094, 140029210951728, 140029209727440, 140029211486616,
2, 1004169673241849022, 911493952020212926}, __mask_was_saved = 0,
__saved_mask = {__val = {1, 140737330720751, 140029199549563,
                140029198238304, 64, 140737330720720, 88,
140029210951728, 140029209727440, 140029199440175, 2, 140737330720752,
140029198326346, 2, 140029210951728, 140737330720784}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x7f5b174c2d50
        saveTopTransactionContext = 0x7f5b174c14e0
        saveActivePortal = 0x0
        saveResourceOwner = 0x7f5b174c2d50
        savePortalContext = 0x0
        saveMemoryContext = 0x7f5b174c14e0
        __func__ = "PortalRun"
#16 0x00007f5b16908b5c in exec_simple_query (
    query_string=0x7f5b174db4c0 "\n\t\t\t\t  UPDATE addresses\n\t\t\t\t
SET num_type = NULL,\n\t\t\t\t\t  num_1 = E'5262',\n\t\t\t\t\t  num_2 =
NULL,\n\t\t\t\t\t  street = E'5262 240TH ST',\n\t\t\t\t\t  street_1 =
E'240TH ST',\n\t\t\t\t\t  street_2 = NULL,\n\t\t\t\t\t  unit_typ"...) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/postgres.c:1048
        parsetree = 0x7f5b174dd180
        portal = 0x7f5b17608030
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = "\000PDATE

1\000\060\065\070\071\060\000\000`\273\243\024[\177\000\000\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000E\016t\024[\177\000\000\b\000\000\000\000\000\000\000\212\226\230\025[\177\000"
        querytree_list = <optimized out>
        plantree_list = 0x7f5b1768a968
        receiver = 0x7f5b1768a998
        format = 0
        dest = DestRemote
        parsetree_list = 0x7f5b174dd1f0
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        msec_str = "224.871\000\000\000\230\025[\177", '\000' <repeats
14 times>, "[\177\000"
        parsetree_item = 0x7f5b174dd1d0
        isTopLevel = 1 '\001'
#17 PostgresMain (argc=<optimized out>, argv=<optimized out>,
dbname=0x7f5b174c2058 "leds", username=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/postgres.c:4007
        query_string = 0x7f5b174db4c0 "\n\t\t\t\t  UPDATE
addresses\n\t\t\t\t  SET num_type = NULL,\n\t\t\t\t\t  num_1 =
E'5262',\n\t\t\t\t\t  num_2 = NULL,\n\t\t\t\t\t  street = E'5262 240TH
ST',\n\t\t\t\t\t  street_1 = E'240TH ST',\n\t\t\t\t\t  street_2 =
NULL,\n\t\t\t\t\t  unit_typ"...
        firstchar = 390975872
        input_message = {
          data = 0x7f5b174db4c0 "\n\t\t\t\t  UPDATE addresses\n\t\t\t\t
SET num_type = NULL,\n\t\t\t\t\t  num_1 = E'5262',\n\t\t\t\t\t  num_2 =
NULL,\n\t\t\t\t\t  street = E'5262 240TH ST',\n\t\t\t\t\t  street_1 =
E'240TH ST',\n\t\t\t\t\t  street_2 = NULL,\n\t\t\t\t\t  unit_typ"...,
len = 457, maxlen = 1024, cursor = 457}
        local_sigjmp_buf = {{__jmpbuf = {140029202659696,
1004169673363483838, 140029209616448, 1, 140029202359232, 0,
1004169673428495550, 911493949973916862}, __mask_was_saved = 1,
__saved_mask = {__val = {0, 0, 65280, 0, 0, 140737330721488, 14,
140737330721424,
                140029202359232, 2895, 140029162201848, 0, 206158430256,
140737330721424, 140737330721232, 0}}}}
        send_ready_for_query = 0 '\000'
        __func__ = "PostgresMain"
#18 0x00007f5b168c422d in BackendRun (port=0x7f5b1750b080) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:4011
        ac = 1
        secs = 475880591
        usecs = 771115
        i = 1
        av = 0x7f5b174c2070
        maxac = <optimized out>
#19 BackendStartup (port=0x7f5b1750b080) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:3685
        bn = <optimized out>
        pid = 0
#20 ServerLoop () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1586
        port = 0x7f5b1750b080
        rmask = {fds_bits = {512, 0 <repeats 15 times>}}
        selres = <optimized out>
        readmask = {fds_bits = {776, 0 <repeats 15 times>}}
        nSockets = 10
        now = <optimized out>
        last_touch_time = 1422563469
#21 PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1253
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = <optimized out>
        i = <optimized out>
        __func__ = "PostmasterMain"
#22 0x00007f5b166f8687 in main (argc=5, argv=0x7f5b174c11a0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/main/main.c:226






======================================
And here is one of the many processes that were attempting SELECTs from
that table.  Pid
======================================
#0  0x00007f5b147773e7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x00007f5b168b40c2 in PGSemaphoreLock (sema=0x7f5b0e109790,
interruptOK=0 '\000') at pg_sema.c:421
        errStatus = <optimized out>
        sops = {sem_num = 3, sem_op = -1, sem_flg = 0}
        __func__ = "PGSemaphoreLock"
#2  0x00007f5b168fb46b in LWLockAcquire (lockid=696144, mode=LW_SHARED)
at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/storage/lmgr/lwlock.c:480
        mustwait = 1 '\001'
        lock = 0x7f594d96da80
        proc = 0x7f5b0e109780
        retry = 0 '\000'
        extraWaits = 0
        __func__ = "LWLockAcquire"
#3  0x00007f5b1671d03a in heapgetpage (page=71845, scan=0x7f5b1863e840)
at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:350
        buffer = 347988
        snapshot = 0x7f5b17a6de80
        dp = <optimized out>
        lines = <optimized out>
        lineoff = <optimized out>
        ntup = <optimized out>
        lpp = <optimized out>
        all_visible = <optimized out>
#4  heapgetpage (scan=0x7f5b1863e840, page=71845) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:301
No locals.
#5  0x00007f5b1671d500 in heapgettup_pagemode (scan=0x7f5b1863e840,
dir=<optimized out>, nkeys=0, key=0x0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:936
        tuple = 0x7f5b1863e880
        backward = 0 '\000'
        page = 71845
        finished = 0 '\000'
        dp = 0x7f5a72b3eb80 "8\004"
        lines = <optimized out>
        lineindex = 6
        lineoff = <optimized out>
        linesleft = 0
        lpp = <optimized out>
#6  0x00007f5b1671e89e in heap_getnext (scan=0x7f5b1863e840,
direction=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:1456
No locals.
#7  0x00007f5b1684201d in SeqNext (node=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/nodeSeqscan.c:66
        tuple = <optimized out>
        scandesc = 0x7f5b1863e840
        estate = <optimized out>
        direction = <optimized out>
        slot = 0x7f5b1863e790
#8  0x00007f5b16830806 in ExecScanFetch (recheckMtd=0x7f5b16841fe0
<SeqRecheck>, accessMtd=0x7f5b16841ff0 <SeqNext>, node=0x7f5b1794ebb0)
at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execScan.c:82
        estate = <optimized out>
#9  ExecScan (node=0x7f5b1794ebb0, accessMtd=0x7f5b16841ff0 <SeqNext>,
recheckMtd=0x7f5b16841fe0 <SeqRecheck>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execScan.c:167
        slot = 0x7f5b1863e790
        econtext = 0x7f5b1794ecc0
        qual = 0x7f5b1857dc30
        projInfo = 0x7f5b1863f540
        isDone = 32603
        resultSlot = <optimized out>
#10 0x00007f5b168295b8 in ExecProcNode (node=0x7f5b1794ebb0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execProcnode.c:400
        result = <optimized out>
        __func__ = "ExecProcNode"
#11 0x00007f5b16842ae1 in ExecSort (node=0x7f5b1794e910) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/nodeSort.c:103
        plannode = <optimized out>
        outerNode = 0x7f5b1794ebb0
        tupDesc = <optimized out>
        estate = 0x7f5b1794e7b0
        dir = ForwardScanDirection
        tuplesortstate = 0x7f5b18644230
        slot = <optimized out>
#12 0x00007f5b168294c8 in ExecProcNode (node=0x7f5b1794e910) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execProcnode.c:468
        result = <optimized out>
        __func__ = "ExecProcNode"
#13 0x00007f5b16826907 in ExecutePlan (dest=0x7f5b18520038,
direction=<optimized out>, numberTuples=0, sendTuples=1 '\001',
operation=CMD_SELECT, planstate=0x7f5b1794e910, estate=0x7f5b1794e7b0)
    at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execMain.c:1473
        slot = <optimized out>
        current_tuple_count = 0
#14 standard_ExecutorRun (queryDesc=0x7f5b1842cbb0, direction=<optimized
out>, count=0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/executor/execMain.c:307
        estate = 0x7f5b1794e7b0
        operation = CMD_SELECT
        dest = 0x7f5b18520038
        sendTuples = <optimized out>
        oldcontext = 0x7f5b183abd80
#15 0x00007f5b1690b427 in PortalRunSelect (portal=0x7f5b17613950,
forward=<optimized out>, count=0, dest=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/pquery.c:946
        queryDesc = 0x7f5b1842cbb0
        direction = ForwardScanDirection
        nprocessed = <optimized out>
        __func__ = "PortalRunSelect"
#16 0x00007f5b1690c8af in PortalRun (portal=0x7f5b17613950,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x7f5b18520038,
altdest=0x7f5b18520038, completionTag=0x7ffff69ab230 "") at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/pquery.c:790
        save_exception_stack = 0x7ffff69ab110
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {140029225580208,
1004169673422204094, 140029210999120, 140029225582896, 140029226778680,
2, 1004169673241849022, 911493952020212926}, __mask_was_saved = 0,
__saved_mask = {__val = {1, 140737330720751, 140029199549563,
                140029198238304, 64, 140737330720720, 88,
140029210999120, 140029225582896, 140029199307619, 2, 140737330720752,
140029198326346, 2, 140029210999120, 140737330720784}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x7f5b174c2d50
        saveTopTransactionContext = 0x7f5b185c9a90
        saveActivePortal = 0x0
        saveResourceOwner = 0x7f5b174c2d50
        savePortalContext = 0x0
        saveMemoryContext = 0x7f5b185c9a90
        __func__ = "PortalRun"
#17 0x00007f5b16908b5c in exec_simple_query (
    query_string=0x7f5b174dbce0 "SELECT '', '', street, city, state,
zipcode, address_groupsid,  addressesid FROM ((SELECT
a.address_groupsid,\n\t\t          a.addressesid,\n\t\t
a.num_type,\n\t\t          a.num_1,\n\t\t          a.num_"...)
    at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/postgres.c:1048
        parsetree = 0x7f5b183fb6b0
        portal = 0x7f5b17613950
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = "\000ELECT
1\000E\000X\000\000\000\000`\273\243\024[\177", '\000' <repeats 18
times>,
"E\016t\024[\177\000\000\000\000\000\000\000\000\000\000\212\226\230\025[\177\000"
        querytree_list = <optimized out>
        plantree_list = 0x7f5b18520008
        receiver = 0x7f5b18520038
        format = 0
        dest = DestRemote
        parsetree_list = 0x7f5b183fc150
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        msec_str =
"453.861\000\000\245\230\025[\177\000\000\000\000\000\000\000\000\000\000\064\071\062\071\063\000\000"
        parsetree_item = 0x7f5b183fc130
        isTopLevel = 1 '\001'
#18 PostgresMain (argc=<optimized out>, argv=<optimized out>,
dbname=0x7f5b174c2058 "leds", username=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/tcop/postgres.c:4007
        query_string = 0x7f5b174dbce0 "SELECT '', '', street, city,
state, zipcode, address_groupsid,  addressesid FROM ((SELECT
a.address_groupsid,\n\t\t          a.addressesid,\n\t\t
a.num_type,\n\t\t          a.num_1,\n\t\t          a.num_"...
        firstchar = 406828720
        input_message = {data = 0x7f5b174dbce0 "SELECT '', '', street,
city, state, zipcode, address_groupsid,  addressesid FROM ((SELECT
a.address_groupsid,\n\t\t          a.addressesid,\n\t\t
a.num_type,\n\t\t          a.num_1,\n\t\t          a.num_"...,
          len = 1146, maxlen = 2048, cursor = 1146}
        local_sigjmp_buf = {{__jmpbuf = {140029202659696,
1004169673363483838, 140029209616448, 1, 140029202359232, 0,
1004169673428495550, 911493949973916862}, __mask_was_saved = 1,
__saved_mask = {__val = {0, 0, 65280, 0, 0, 140737330721488, 14,
140737330721424,
                140029202359232, 2365, 140029162201848, 0, 206158430256,
140737330721424, 140737330721232, 0}}}}
        send_ready_for_query = 0 '\000'
        __func__ = "PostgresMain"
#19 0x00007f5b168c422d in BackendRun (port=0x7f5b1750b080) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:4011
        ac = 1
        secs = 475880538
        usecs = 428207
        i = 1
        av = 0x7f5b174c2070
        maxac = <optimized out>
#20 BackendStartup (port=0x7f5b1750b080) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:3685
        bn = <optimized out>
        pid = 0
#21 ServerLoop () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1586
        port = 0x7f5b1750b080
        rmask = {fds_bits = {8, 0 <repeats 15 times>}}
        selres = <optimized out>
        readmask = {fds_bits = {776, 0 <repeats 15 times>}}
        nSockets = 10
        now = <optimized out>
        last_touch_time = 1422563469
#22 PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1253
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = <optimized out>
        i = <optimized out>
        __func__ = "PostmasterMain"
#23 0x00007f5b166f8687 in main (argc=5, argv=0x7f5b174c11a0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/main/main.c:226




On 01/30/2015 06:41 PM, Peter Geoghegan wrote:
> On Fri, Jan 30, 2015 at 2:14 PM,  <charles.harwood@zuerchertech.com> wrote:
>> Many postgres worker processes stopped responding.  They continued to be in
>> state 'active' and not 'waiting' for hours.  Each query was on the table
>> that an autovacuum process was also hung on processing and was the oldest
>> active transaction on.
>
> Was pg_upgrade run on this database at some point [1]?
>
> I foresee more problems here...
>
> [1] http://www.postgresql.org/message-id/20140825225029.2536.5315@wrigleys.postgresql.org

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Hans Ginzel
Дата:
Сообщение: Re: BUG #12725: psql: no interpretation of option -F
Следующее
От: David G Johnston
Дата:
Сообщение: Re: BUG #12725: psql: no interpretation of option -F