Discussion: Orphaned locks in 7.0?
Argh! I can't reproduce this:
NOTICE: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally
and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your
database system connection and exit. Please reconnect to the database system and repeat your query.
NOTICE: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally
and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your
database system connection and exit. Please reconnect to the database system and repeat your query.
NOTICE: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally
and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your
database system connection and exit. Please reconnect to the database system and repeat your query.
pqReadData() -- backend closed the channel unexpectedly. This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Basically I was running two instances of psql, in one I issued:
one                         two

begin;
lock data; -- some table
                            lock data;^C -- cancel
                            select * from data;^C -- cancel
end;
                            lock data;^C -- HUNG then aborted
It's annoying that I can't seem to reproduce this, and I know LOCKs
are only to be requested during a transaction, but it did happen.
thanks,
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
Alfred Perlstein <bright@wintelcom.net> writes:
> Argh! I can't reproduce this:
Was a core file left behind? Can you get a backtrace from it?
regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> [000511 11:26] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> > Argh! I can't reproduce this:
>
> Was a core file left behind? Can you get a backtrace from it?
I enabled assertion checking and debug, here we go:
Core was generated by `postgres'.
Program terminated with signal 6, Abort trap.
Reading symbols from /usr/lib/libcrypt.so.2...done.
Reading symbols from /usr/lib/libm.so.2...done.
Reading symbols from /usr/lib/libreadline.so.4...done.
Reading symbols from /usr/lib/libncurses.so.5...done.
Reading symbols from /usr/lib/libc.so.4...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
#0 0x48281fd8 in kill () from /usr/lib/libc.so.4
(gdb) bt
#0 0x48281fd8 in kill () from /usr/lib/libc.so.4
#1 0x482bb4a2 in abort () from /usr/lib/libc.so.4
#2 0x8143c53 in ExcAbort () at excabort.c:27
#3 0x8143bd2 in ExcUnCaught (excP=0x81a5708, detail=0, data=0x0, message=0x8189040 "!((result->nHolding > 0) && (result->holders[lockmode] >= 0))") at exc.c:170
#4 0x8143c19 in ExcRaise (excP=0x81a5708, detail=0, data=0x0, message=0x8189040 "!((result->nHolding > 0) && (result->holders[lockmode] >= 0))") at exc.c:187
#5 0x8143308 in ExceptionalCondition (conditionName=0x8189040 "!((result->nHolding > 0) && (result->holders[lockmode] >= 0))", exceptionP=0x81a5708, detail=0x0, fileName=0x8188e0c "lock.c", lineNumber=617) at assert.c:73
#6 0x810422e in LockAcquire (lockmethod=1, locktag=0xbfbfe808, lockmode=1) at lock.c:617
#7 0x81036d1 in LockRelation (relation=0x8471ba0, lockmode=1) at lmgr.c:148
#8 0x8071957 in heap_open (relationId=1249, lockmode=1) at heapam.c:551
#9 0x813e329 in SearchSysCache (cache=0x847c018, v1=8490746, v2=136106584, v3=0, v4=0) at catcache.c:1009
#10 0x8142210 in SearchSysCacheTuple (cacheId=4, key1=8490746, key2=136106584, key3=0, key4=0) at syscache.c:532
#11 0x80cfc5d in make_var (pstate=0x81cd020, relid=8490746, refname=0x81cd180 "data", attrname=0x81cd258 "referer") at parse_node.c:202
#12 0x80d12be in expandAll (pstate=0x81cd020, relname=0x81cd180 "data", ref=0x81cd158, this_resno=0x81cd020) at parse_relation.c:408
#13 0x80d25b8 in ExpandAllTables (pstate=0x81cd020) at parse_target.c:444
#14 0x80d213b in transformTargetList (pstate=0x81cd020, targetlist=0x81ccea8) at parse_target.c:139
#15 0x80c0ef6 in transformSelectStmt (pstate=0x81cd020, stmt=0x81ccf50) at analyze.c:1423
#16 0x80bf780 in transformStmt (pstate=0x81cd020, parseTree=0x81ccf50) at analyze.c:238
#17 0x80bf3c2 in parse_analyze (pl=0x81cd008, parentParseState=0x0) at analyze.c:75
#18 0x80cafa1 in parser (str=0x8469018 "select * from data;", typev=0x0, nargs=0) at parser.c:64
#19 0x8109923 in pg_parse_and_rewrite (query_string=0x8469018 "select * from data;", typev=0x0, nargs=0, aclOverride=0 '\000') at postgres.c:395
#20 0x8109bcb in pg_exec_query_dest (query_string=0x8469018 "select * from data;", dest=Remote, aclOverride=0) at postgres.c:580
#21 0x8109b91 in pg_exec_query (query_string=0x8469018 "select * from data;") at postgres.c:562
#22 0x810ab4a in PostgresMain (argc=7, argv=0xbfbff138, real_argc=8, real_argv=0xbfbffb98) at postgres.c:1590
#23 0x80f00d6 in DoBackend (port=0x8463000) at postmaster.c:2006
#24 0x80efc7d in BackendStartup (port=0x8463000) at postmaster.c:1775
#25 0x80eeea1 in ServerLoop () at postmaster.c:1035
#26 0x80ee88a in PostmasterMain (argc=8, argv=0xbfbffb98) at postmaster.c:723
#27 0x80bf327 in main (argc=8, argv=0xbfbffb98) at main.c:93
#28 0x80633d5 in _start ()
(gdb) up
#6 0x810422e in LockAcquire (lockmethod=1, locktag=0xbfbfe808, lockmode=1) at lock.c:617
617 Assert((result->nHolding > 0) && (result->holders[lockmode] >= 0));
(gdb) list
612 XID_PRINT("LockAcquire: new", result);
613 }
614 else
615 {
616 XID_PRINT("LockAcquire: found", result);
617 Assert((result->nHolding > 0) && (result->holders[lockmode] >= 0));
618 Assert(result->nHolding <= lock->nActive);
619 }
620
621 /* ----------------
(gdb)
Seems to be what brought things down.
If you need anything else, let me know.
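
For readers following the backtrace, the check that fired at lock.c:617 can be restated in isolation. The sketch below is a simplified model, not the PostgreSQL code: the struct `HolderEntry`, the constant `MAX_LOCKMODES`, and the function `holder_entry_is_sane` are hypothetical names introduced only for illustration. It only shows that an entry which is found in the table but holds nothing (nHolding == 0) fails the asserted condition. Whether a cancelled lock wait is what left such an entry behind is a guess that the trace alone does not prove.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKMODES 8     /* hypothetical size, not taken from lock.h */

/* Hypothetical, simplified stand-in for a per-transaction holder entry. */
typedef struct HolderEntry
{
    int nHolding;                   /* total locks held through this entry */
    int holders[MAX_LOCKMODES];     /* locks held, counted per lock mode */
} HolderEntry;

/*
 * Returns true when the entry satisfies the condition asserted at
 * lock.c:617 for an entry that already existed in the table:
 *   (result->nHolding > 0) && (result->holders[lockmode] >= 0)
 */
bool
holder_entry_is_sane(const HolderEntry *entry, int lockmode)
{
    /* Guard the array index; this range check is part of the sketch only. */
    assert(lockmode >= 0 && lockmode < MAX_LOCKMODES);

    return (entry->nHolding > 0) && (entry->holders[lockmode] >= 0);
}
```

An entry with nHolding = 1 and holders[1] = 1 passes this check for lockmode 1, while an entry whose counters are all zero does not, which is the shape of failure the assertion message reports.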
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
* Hiroshi Inoue <Inoue@tpf.co.jp> [000515 02:07] wrote:
> > -----Original Message-----
> > From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> > Behalf Of Alfred Perlstein
> >
> > Basically I was running two instances of psql, in one I issued:
> >
> > one two
> >
> > begin;
> > lock data; -- some table
> > lock data;^C -- cancel
> > select * from data;^C -- cancel
> > end;
> >
> > lock data;^C -- HUNG then aborted
> >
> > It's annoying that I can't seem to reproduce this, and I know LOCKs
> > are only to be requested during a transaction, but it did happen.
> >
>
> Could the following example explain your HUNG problem ?
>
> Session-1
> # begin;
> BEGIN
> =# lock t;
> LOCK TABLE
>
> Session-2
> =# begin;
> BEGIN
> =# lock t;
> [blocked] ^C
> Cancel request sent
> ERROR: Query cancel requested while waiting lock
> reindex=# select * from t;
> [blocked]
>
> Session-1
> =# commit;
> COMMIT
>
> Session-2
> ERROR: LockRelation: LockAcquire failed
> =# abort;
> ROLLBACK
> =# lock t;
> [blocked]
That looks pretty much like the sequence of events that led up to
the problem. The trouble is that I was just manually testing out the way
locks work and didn't write down the exact steps I took. These are
probably exactly the right steps, though.
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
> -----Original Message-----
> From: Hiroshi Inoue
> > -----Original Message-----
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> >
> > Anyway, it sounds like we agree that this is the approach to pursue.
> > Do you have time to chase down the details?
>
> OK, I will examine it a little, though I'm a little busy this week.
>
Sorry, I'm so late and haven't had much time to examine the details.
I'm now worried about another point.
Wouldn't this change waste XIDs in the case of an abort loop?
Anyway, I examined the loop in PostgresMain():
    for (;;)
    {
        ..
        StartTransactionCommand()
        ..
        pg_exec_query()
        ..
        CommitTransactionCommand() (/AbortCurrentTransaction())
        ..
    }
As I see it, the following calls preceded by +?
would be added, and the ones preceded by -? would be removed.
StartTransactionCommand()
    TBLOCK_DEFAULT      StartTransaction() ->
    TBLOCK_BEGIN        -> TBLOCK_INPROGRESS
    TBLOCK_INPROGRESS   ->
    TBLOCK_END          CommitTransaction() -> StartTransaction() -> TBLOCK_DEFAULT
    TBLOCK_ABORT        ->
    TBLOCK_ENDABORT     ->

CommitTransactionCommand()
    TBLOCK_DEFAULT      CommitTransaction() ->
    TBLOCK_BEGIN        -> TBLOCK_INPROGRESS
    TBLOCK_INPROGRESS   CommandCounterIncrement() ->
    TBLOCK_END          CommitTransaction() -> TBLOCK_DEFAULT
    TBLOCK_ABORT        +? AbortTransaction() +? StartTransaction() ->
    TBLOCK_ENDABORT     +? AbortTransaction() -> TBLOCK_DEFAULT

BeginTransactionBlock() ( <- BEGIN command )
    TRANS_DISABLED      ->
    otherwise           -> TBLOCK_BEGIN -> TBLOCK_INPROGRESS

UserAbortTransaction() ( <- ROLLBACK command )
    TRANS_DISABLED      ->
    TBLOCK_INPROGRESS   -? AbortTransaction() -> TBLOCK_ENDABORT
    TBLOCK_ABORT        -> TBLOCK_ENDABORT
    otherwise           -? AbortTransaction() -> TBLOCK_ENDABORT

EndTransactionBlock() ( <- COMMIT command )
    TRANS_DISABLED      ->
    TBLOCK_INPROGRESS   -> TBLOCK_END
    TBLOCK_ABORT        -> TBLOCK_ENDABORT
    otherwise           -> TBLOCK_ENDABORT

AbortCurrentTransaction() ( elog(ERROR/FATAL) )
    TBLOCK_DEFAULT      AbortTransaction() ->
    TBLOCK_BEGIN        AbortTransaction() +? StartTransaction() -> TBLOCK_ABORT
    TBLOCK_INPROGRESS   AbortTransaction() +? StartTransaction() -> TBLOCK_ABORT
    TBLOCK_END          AbortTransaction() -> TBLOCK_DEFAULT
    TBLOCK_ABORT        +? AbortTransaction() +? StartTransaction() ->
    TBLOCK_ENDABORT     +? AbortTransaction() -> TBLOCK_DEFAULT

AbortOutAnyTransaction() ( Async_UnlistenOnExit() )
    TRANS_DEFAULT       -> TBLOCK_DEFAULT
    otherwise           AbortTransaction() -> TBLOCK_DEFAULT
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
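
The XID question raised above can be made concrete with a toy model of the proposed transitions. This is only a sketch under stated assumptions, not xact.c: the struct `XactModel` and the `model_*` functions are hypothetical names, the rows marked +? are treated as already applied, and each StartTransaction() call is assumed to consume one XID, which is the premise of the question rather than something this thread confirms.

```c
#include <assert.h>

/* Block states named in the transition table in the message above. */
typedef enum TBlockState
{
    TBLOCK_DEFAULT, TBLOCK_BEGIN, TBLOCK_INPROGRESS,
    TBLOCK_END, TBLOCK_ABORT, TBLOCK_ENDABORT
} TBlockState;

/* Hypothetical toy model: the block state plus call counters. */
typedef struct XactModel
{
    TBlockState blockState;
    int starts;     /* StartTransaction() calls; assumed one XID each */
    int aborts;     /* AbortTransaction() calls */
    int commits;    /* CommitTransaction() calls */
} XactModel;

/* BEGIN command: enter a transaction block. */
void
model_begin_block(XactModel *m)
{
    m->blockState = TBLOCK_BEGIN;
}

/* StartTransactionCommand() rows of the table. */
void
model_start_command(XactModel *m)
{
    switch (m->blockState)
    {
        case TBLOCK_DEFAULT:
            m->starts++;
            break;
        case TBLOCK_BEGIN:
            m->blockState = TBLOCK_INPROGRESS;
            break;
        case TBLOCK_END:
            m->commits++;
            m->starts++;
            m->blockState = TBLOCK_DEFAULT;
            break;
        default:                /* INPROGRESS, ABORT, ENDABORT: no action */
            break;
    }
}

/* CommitTransactionCommand() rows, with the +? rows applied. */
void
model_commit_command(XactModel *m)
{
    switch (m->blockState)
    {
        case TBLOCK_DEFAULT:
            m->commits++;
            break;
        case TBLOCK_BEGIN:
            m->blockState = TBLOCK_INPROGRESS;
            break;
        case TBLOCK_INPROGRESS: /* CommandCounterIncrement() only */
            break;
        case TBLOCK_END:
            m->commits++;
            m->blockState = TBLOCK_DEFAULT;
            break;
        case TBLOCK_ABORT:      /* proposed: abort, then start again */
            m->aborts++;
            m->starts++;
            break;
        case TBLOCK_ENDABORT:   /* proposed: abort */
            m->aborts++;
            m->blockState = TBLOCK_DEFAULT;
            break;
    }
}

/* AbortCurrentTransaction() rows, with the +? rows applied. */
void
model_abort_current(XactModel *m)
{
    m->aborts++;                /* every row calls AbortTransaction() */
    switch (m->blockState)
    {
        case TBLOCK_BEGIN:
        case TBLOCK_INPROGRESS:
            m->starts++;        /* proposed */
            m->blockState = TBLOCK_ABORT;
            break;
        case TBLOCK_ABORT:
            m->starts++;        /* proposed */
            break;
        case TBLOCK_END:
        case TBLOCK_ENDABORT:
            m->blockState = TBLOCK_DEFAULT;
            break;
        default:                /* TBLOCK_DEFAULT: state unchanged */
            break;
    }
}
```

To exercise the model, start from TBLOCK_DEFAULT with zeroed counters, run one start/commit pair around model_begin_block(), fail one command with model_abort_current(), and then run further start/commit pairs while the block stays in TBLOCK_ABORT. In this model every such pair adds one more StartTransaction() call, which is the "abort loop" cost the message asks about.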