Обсуждение: BUG #5952: SetRWConflict assertion failure
The following bug has been logged online: Bug reference: 5952 Logged by: YAMAMOTO Takashi Email address: yamt@mwd.biglobe.ne.jp PostgreSQL version: 9.1devel Operating system: NetBSD Description: SetRWConflict assertion failure Details: i got the following assertion failure with 71ac48fd9cebd3d2a873635a04df64096c981f73. (with this patch applied, if it matters: http://archives.postgresql.org/pgsql-hackers/2011-03/msg01180.php) SerializableXactHashLock relocking in CheckTargetForConflictsIn() seems racy to me. (gdb) bt #0 0xbbba4cc7 in _lwp_kill () from /usr/lib/libc.so.12 #1 0xbbba4c85 in raise (s=6) at /siro/nbsd/src/lib/libc/gen/raise.c:48 #2 0xbbba445a in abort () at /siro/nbsd/src/lib/libc/stdlib/abort.c:74 #3 0x083df0d8 in ExceptionalCondition ( conditionName=0x854f370 "!(!RWConflictExists(reader, writer))", errorType=0x854f320 "FailedAssertion", fileName=0x854f314 "predicate.c", lineNumber=565) at assert.c:57 #4 0x082e9a1c in SetRWConflict (reader=0xbb7cce1c, writer=0xbb7c9720) at predicate.c:565 #5 0x082ef299 in FlagRWConflict (reader=0xbb7cce1c, writer=0xbb7c9720) at predicate.c:3870 #6 0x082eeeeb in CheckTargetForConflictsIn (targettag=0xbfbfda28) at predicate.c:3774 #7 0x082ef1cc in CheckForSerializableConflictIn (relation=0xb941a450, tuple=0x0, buffer=1901) at predicate.c:3841 #8 0x080bb5cc in _bt_doinsert (rel=0xb941a450, itup=0xb9484684, checkUnique=UNIQUE_CHECK_YES, heapRel=0xbb9ebfec) at nbtinsert.c:185 #9 0x080c33ee in btinsert (fcinfo=0xbfbfdaf0) at nbtree.c:262 #10 0x083e6b55 in FunctionCall6 (flinfo=0xb946a01c, arg1=3108086864, arg2=3217022352, arg3=3217022320, arg4=3108521372, arg5=3147743212, arg6=1) at fmgr.c:1488 #11 0x080b95a7 in index_insert (indexRelation=0xb941a450, values=0xbfbfdd90, isnull=0xbfbfdd70 "", heap_t_ctid=0xb948459c, heapRelation=0xbb9ebfec, checkUnique=UNIQUE_CHECK_YES) at indexam.c:205 #12 0x081f9a85 in ExecInsertIndexTuples (slot=0xb9483be0, tupleid=0xb948459c, estate=0xb944501c) at execUtils.c:1084 #13 0x082096df in ExecUpdate (tupleid=0xbfbfdef6, oldtuple=0x0, slot=0xb9483be0, planSlot=0xb9446cc4, epqstate=0xb94455e0, estate=0xb944501c, canSetTag=1 '\001') at nodeModifyTable.c:627 #14 0x08209ba3 in ExecModifyTable (node=0xb9445588) at nodeModifyTable.c:838 #15 0x081edc31 in ExecProcNode (node=0xb9445588) at execProcnode.c:371 #16 0x081ebcc4 in ExecutePlan (estate=0xb944501c, planstate=0xb9445588, operation=CMD_UPDATE, sendTuples=0 '\0', numberTuples=0, direction=ForwardScanDirection, dest=0x8592fa4) at execMain.c:1385 #17 0x081ea5c9 in standard_ExecutorRun (queryDesc=0xb9464cfc, direction=ForwardScanDirection, count=0) at execMain.c:311 #18 0x081ea466 in ExecutorRun (queryDesc=0xb9464cfc, direction=ForwardScanDirection, count=0) at execMain.c:259 #19 0x082faff9 in ProcessQuery (plan=0xb946b04c, sourceText=0xb9464c1c "UPDATE file SET mtime = current_timestamp, ctime = current_timestamp, rev = rev + 1 WHERE fileid = $1", params=0xb9464cbc, dest=0x8592fa4, completionTag=0xbfbfe1e0 "") at pquery.c:187 #20 0x082fc56b in PortalRunMulti (portal=0xbb9b601c, isTopLevel=1 '\001', dest=0x8592fa4, altdest=0x8592fa4, completionTag=0xbfbfe1e0 "") at pquery.c:1276 #21 0x082fbd32 in PortalRun (portal=0xbb9b601c, count=2147483647, isTopLevel=1 '\001', dest=0xbb91c428, altdest=0xbb91c428, completionTag=0xbfbfe1e0 "") at pquery.c:813 #22 0x082f7afd in exec_execute_message (portal_name=0xbb91c01c "", max_rows=2147483647) at postgres.c:1963 #23 0x082fa43d in PostgresMain (argc=2, argv=0xbb919618, username=0xbb91956c "takashi") at postgres.c:3965 #24 0x082aae0b in BackendRun (port=0xbb9510f0) at postmaster.c:3590 #25 0x082aa4cf in BackendStartup (port=0xbb9510f0) at postmaster.c:3275 #26 0x082a774f in ServerLoop () at postmaster.c:1449 #27 0x082a6efe in PostmasterMain (argc=3, argv=0xbfbfe590) at postmaster.c:1110 #28 0x08227740 in main (argc=3, argv=0xbfbfe590) at main.c:199 (gdb) fr 4 #4 0x082e9a1c in SetRWConflict (reader=0xbb7cce1c, writer=0xbb7c9720) at predicate.c:565 565 Assert(!RWConflictExists(reader, writer)); (gdb) p *reader $1 = {vxid = {backendId = 3, localTransactionId = 658423}, commitSeqNo = 18446744073709551615, SeqNo = { earliestOutConflictCommit = 3617761, lastCommitBeforeSnapshot = 3617761}, outConflicts = {prev = 0xbb7da7e0, next = 0xbb7da7e0}, inConflicts = { prev = 0xbb7cce3c, next = 0xbb7cce3c}, predicateLocks = { prev = 0xbb858950, next = 0xbb84e090}, finishedLink = {prev = 0x0, next = 0x0}, possibleUnsafeConflicts = {prev = 0xbb7cce54, next = 0xbb7cce54}, topXid = 1267267, finishedBefore = 0, xmin = 1267266, flags = 0, pid = 21551} (gdb) p *writer $2 = {vxid = {backendId = 4, localTransactionId = 943664}, commitSeqNo = 18446744073709551615, SeqNo = { earliestOutConflictCommit = 3617761, lastCommitBeforeSnapshot = 3617761}, outConflicts = {prev = 0xbb7c9738, next = 0xbb7c9738}, inConflicts = { prev = 0xbb7da7e8, next = 0xbb7da7e8}, predicateLocks = { prev = 0xbb861fe0, next = 0xbb861fe0}, finishedLink = {prev = 0x0, next = 0x0}, possibleUnsafeConflicts = {prev = 0xbb7c9758, next = 0xbb7c9758}, topXid = 1267266, finishedBefore = 0, xmin = 1267266, flags = 16, pid = 9331} (gdb)
"YAMAMOTO Takashi" wrote: > Description: SetRWConflict assertion failure > SerializableXactHashLock relocking in CheckTargetForConflictsIn() > seems racy to me. You're right. The attached patch should fix the assertion you hit. I will take a close look at the code above the patched area (for the optimization to drop the predicate lock when we acquire a write lock). Looking at it just now I'm wondering if it really is a safe optimization in PostgreSQL. It was in the Cahill paper, but InnoDB doesn't have subtransactions. I fear that we could give up the SIReadLock within a subtransaction and then have problems if the subtransaction rolls back, effectively discarding the write lock. I suspect we need to do this only if we're not within a subtransaction. Will follow up on that after further review. Thanks! -Kevin
Вложения
On Sun, Mar 27, 2011 at 3:18 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> "YAMAMOTO Takashi" =A0wrote:
>
>> Description: SetRWConflict assertion failure
>
>> SerializableXactHashLock relocking in CheckTargetForConflictsIn()
>> seems racy to me.
>
> You're right. =A0The attached patch should fix the assertion you hit.
This patch looks reasonable, but I'm a bit concerned about the chunk
immediately preceding the patched area.
When we do this:
LWLockRelease(SerializableXactHashLock);
LWLockRelease(partitionLock);
LWLockRelease(SerializablePredicateLockListLock);
LWLockAcquire(partitionLock, LW_SHARED);
LWLockAcquire(SerializableXactHashLock, LW_SHARED);
Don't we need to also reset nextpredlock to the head of the list? I'm
assuming it's the partitionLock that's keeping the PREDICATELOCKs from
bouncing out from under us, so if we release it, aren't we potentially
point off into thin air?
--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote: > This patch looks reasonable, but I'm a bit concerned about the > chunk immediately preceding the patched area. > > When we do this: > > LWLockRelease(SerializableXactHashLock); > LWLockRelease(partitionLock); > LWLockRelease(SerializablePredicateLockListLock); > LWLockAcquire(partitionLock, LW_SHARED); > LWLockAcquire(SerializableXactHashLock, LW_SHARED); > > Don't we need to also reset nextpredlock to the head of the list? > I'm assuming it's the partitionLock that's keeping the > PREDICATELOCKs from bouncing out from under us, so if we release > it, aren't we potentially point off into thin air? I think you are right. That sequence should be followed by a copy of the same "nextpredlock = " statement that's just above. Do you want me to revise the patch or do you just want to take care of it as part of the commit? Thanks for catching that. -Kevin
On Tue, Apr 5, 2011 at 11:18 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: >> This patch looks reasonable, but I'm a bit concerned about the >> chunk immediately preceding the patched area. >> >> When we do this: >> >> =A0 =A0 LWLockRelease(SerializableXactHashLock); >> =A0 =A0 LWLockRelease(partitionLock); >> =A0 =A0 LWLockRelease(SerializablePredicateLockListLock); >> =A0 =A0 LWLockAcquire(partitionLock, LW_SHARED); >> =A0 =A0 LWLockAcquire(SerializableXactHashLock, LW_SHARED); >> >> Don't we need to also reset nextpredlock to the head of the list? >> I'm assuming it's the partitionLock that's keeping the >> PREDICATELOCKs from bouncing out from under us, so if we release >> it, aren't we potentially point off into thin air? > > I think you are right. =A0That sequence should be followed by a copy > of the same "nextpredlock =3D " statement that's just above. =A0Do you > want me to revise the patch or do you just want to take care of it > as part of the commit? > > Thanks for catching that. If you could send a revised patch, that would be great. I don't want to muck it up by accident. --=20 Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote: > If you could send a revised patch, that would be great. Attached. I put it in the same spot relative to the lock acquisition that was used earlier in the function. -Kevin
Вложения
I wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > >> If you could send a revised patch, that would be great. > > Attached. I put it in the same spot relative to the lock > acquisition that was used earlier in the function. And version 3 which might actually work. [sigh] -Kevin
Вложения
On Tue, Apr 5, 2011 at 12:13 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > I wrote: >> Robert Haas <robertmhaas@gmail.com> wrote: >> >>> If you could send a revised patch, that would be great. >> >> Attached. =A0I put it in the same spot relative to the lock >> acquisition that was used earlier in the function. > > And version 3 which might actually work. =A0[sigh] Committed. --=20 Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company