Обсуждение: 9.3 reference constraint regression
In 9.3 I can delete the parent of a parent-child relation if the child row is an uncommitted insert and I first update the parent.
USER1:drop table parent;
create table parent (i int, c char(3));
create unique index parent_idx on parent (i);
insert into parent values (1, 'AAA');
create table child (i int references parent(i));
insert into child values (1);
delete from parent;
Daniel Wood wrote: > In 9.3 I can delete the parent of a parent-child relation if the child row > is an uncommitted insert and I first update the parent. Interesting test case, thanks. I traced through it and promptly noticed that the problem is that HeapTupleSatisfiesUpdate is failing to consider the case that there are foreign lockers when a tuple was created by our own transaction. In the past, a tuple created by our own transaction can't possibly be seen by other transactions -- so it wouldn't be possible for them to lock them. But this is no longer the case, because an UPDATE can carry forward the locks from the previous version of the tuple. The fix for the immediate bug is to add some code to HTSU so that it checks for locks by other transactions even when the tuple was created by us. I haven't looked at the other tqual routines yet, but I imagine they will need equivalent fixes. One thought on the whole HTSU issue is that it's a bit strange that it is returning HeapTupleBeingUpdated when the tuple is actually only locked; and thus the callers can update the tuple in that case anyway. While I haven't explored the idea fully, perhaps we should consider adding a new return code to HTSU, with the meaning "this tuple is not updated, but it is locked by someone". Then it's the caller's responsibility to check the lock for conflicts, and perhaps add more locks. When BeingUpdate is returned, then a tuple cannot possibly be updated. This may help disentangling the mess in the heapam.c routines a bit ... or it may not (at least heap_lock_tuple will still need to consider doing some stuff in the BeingUpdated case, so it's unlikely that we will see much simplification there). Another consideration is that perhaps it's too late for the 9.3 series for a change this large. I will start working on a patch for this tomorrow. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera wrote: > The fix for the immediate bug is to add some code to HTSU so that it > checks for locks by other transactions even when the tuple was created > by us. I haven't looked at the other tqual routines yet, but I imagine > they will need equivalent fixes. This POC patch changes the two places in HeapTupleSatisfiesUpdate that need to be touched for this to work. This is probably too simplistic, in that I make the involved cases return HeapTupleBeingUpdated without checking that there actually are remote lockers, which is the case of concern. I'm not yet sure if this is the final form of the fix, or instead we should expand the Multi (in the cases where there is a multi) and verify whether any lockers are transactions other than the current one. As is, nothing seems to break, but I think that's probably just chance and should not be relied upon. Attached are two isolation specs which illustrate both the exact issue reported by Dan, and a similar one which involves an aborted subtransaction having updated the second version of the row. (This takes a slightly different code path.) As far as I can tell, no other routine in tqual.c needs to change other than HeapTupleSatisfiesUpdate. The ones that test for visibility (Dirty, Any, Self) are only concerned with whether the tuple is visible, and of course that won't be affected by the tuple being locked; and HeapTupleSatisfiesVacuum is only concerned with the tuple being dead, which similarly won't. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
Alvaro Herrera wrote: > This POC patch changes the two places in HeapTupleSatisfiesUpdate that > need to be touched for this to work. This is probably too simplistic, > in that I make the involved cases return HeapTupleBeingUpdated without > checking that there actually are remote lockers, which is the case of > concern. I'm not yet sure if this is the final form of the fix, or > instead we should expand the Multi (in the cases where there is a multi) > and verify whether any lockers are transactions other than the current > one. As is, nothing seems to break, but I think that's probably just > chance and should not be relied upon. After playing with this, I think the reason this seems to work without fail is that all callers of HeapTupleSatisfiesUpdate are already prepared to deal with the case where HeapTupleBeingUpdated is returned but there is no actual transaction that would block the operation. So I think the proposed patch is okay, barring a few more comments. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2013-12-16 17:43:37 -0300, Alvaro Herrera wrote: > Alvaro Herrera wrote: > > > This POC patch changes the two places in HeapTupleSatisfiesUpdate that > > need to be touched for this to work. This is probably too simplistic, > > in that I make the involved cases return HeapTupleBeingUpdated without > > checking that there actually are remote lockers, which is the case of > > concern. I'm not yet sure if this is the final form of the fix, or > > instead we should expand the Multi (in the cases where there is a multi) > > and verify whether any lockers are transactions other than the current > > one. As is, nothing seems to break, but I think that's probably just > > chance and should not be relied upon. > > After playing with this, I think the reason this seems to work without > fail is that all callers of HeapTupleSatisfiesUpdate are already > prepared to deal with the case where HeapTupleBeingUpdated is returned > but there is no actual transaction that would block the operation. > So I think the proposed patch is okay, barring a few more comments. Are you sure? the various wait/nowait cases don't seem to handle that correctly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > On 2013-12-16 17:43:37 -0300, Alvaro Herrera wrote: > > Alvaro Herrera wrote: > > > > > This POC patch changes the two places in HeapTupleSatisfiesUpdate that > > > need to be touched for this to work. This is probably too simplistic, > > > in that I make the involved cases return HeapTupleBeingUpdated without > > > checking that there actually are remote lockers, which is the case of > > > concern. I'm not yet sure if this is the final form of the fix, or > > > instead we should expand the Multi (in the cases where there is a multi) > > > and verify whether any lockers are transactions other than the current > > > one. As is, nothing seems to break, but I think that's probably just > > > chance and should not be relied upon. > > > > After playing with this, I think the reason this seems to work without > > fail is that all callers of HeapTupleSatisfiesUpdate are already > > prepared to deal with the case where HeapTupleBeingUpdated is returned > > but there is no actual transaction that would block the operation. > > So I think the proposed patch is okay, barring a few more comments. > > Are you sure? the various wait/nowait cases don't seem to handle that > correctly. Well, it would help if those cases weren't dead code. Neither heap_update nor heap_delete are ever called in the "no wait" case at all. Only heap_lock_tuple is, and I can't see any misbehavior there either, even with HeapTupleBeingUpdated returned when there's a non-local locker, or when there's a MultiXact as xmax, regardless of its status. Don't get me wrong --- it's not like this case is all that difficult to handle. All that's required is something like this in HeapTupleSatisfiesUpdate: if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple))) { ... if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) /* not deleter */ { if (tuple->t_infomask & HEAP_XMAX_IS_MULTI) { int nmembers; bool remote; int i; MultiXactMember *members; nmembers = GetMultiXactIdMembers(HeapTupleHeaderGetRawXmax(tuple), &members, false); remote = false; for (i = 0; i < nmembers;i++) { if (!TransactionIdIsCurrentTransactionId(members[i].xid)) { remote = true; break; } } if (nmembers > 0) pfree(members); if (remote) return HeapTupleBeingUpdated; else return HeapTupleMayBeUpdated; } else if (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tuple))) return HeapTupleBeingUpdated; return HeapTupleMayBeUpdated; } } The simpler code just does away with the GetMultiXactIdMembers() and returns HeapTupleBeingUpdated always. In absence of a test case that misbehaves with that, it's hard to see that it is a good idea to go all this effort there. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera wrote: > Well, it would help if those cases weren't dead code. Neither > heap_update nor heap_delete are ever called in the "no wait" case at > all. Only heap_lock_tuple is, and I can't see any misbehavior there > either, even with HeapTupleBeingUpdated returned when there's a > non-local locker, or when there's a MultiXact as xmax, regardless of its > status. I spent some more time trying to generate a test case that would show a problem with the changed return values here, and was unable to. I intend to apply this patch soon. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
Hi, On 2013-12-17 18:27:26 -0300, Alvaro Herrera wrote: > > Well, it would help if those cases weren't dead code. Neither > > heap_update nor heap_delete are ever called in the "no wait" case at > > all. Yea, that sucks, maybe we ought to change that in HEAD. But there very well might be external callers that use it, I don't think we can just break the API in a stable release. > > Only heap_lock_tuple is, and I can't see any misbehavior there > > either, even with HeapTupleBeingUpdated returned when there's a > > non-local locker, or when there's a MultiXact as xmax, regardless of its > > status. > > I spent some more time trying to generate a test case that would show a > problem with the changed return values here, and was unable to. > > I intend to apply this patch soon. I have to say, it makes me really uncomfortable to take such shortcuts. In other branches we are doing liveliness checks with MultiXactIdIsRunning() et al. Why isn't that necessary here? And how likely is that this won't cause breakage somewhere down the line because somebody doesn't know of that subtlety? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > I have to say, it makes me really uncomfortable to take such > shortcuts. In other branches we are doing liveliness checks with > MultiXactIdIsRunning() et al. Why isn't that necessary here? And how > likely is that this won't cause breakage somewhere down the line because > somebody doesn't know of that subtlety? I came up with the attached last night, which should do the right thing in both cases. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
Alvaro Herrera wrote: > Andres Freund wrote: > > > I have to say, it makes me really uncomfortable to take such > > shortcuts. In other branches we are doing liveliness checks with > > MultiXactIdIsRunning() et al. Why isn't that necessary here? And how > > likely is that this won't cause breakage somewhere down the line because > > somebody doesn't know of that subtlety? > > I came up with the attached last night, which should do the right thing > in both cases. That one had a silly bug, which I fixed and pushed now. Thanks for the feedback, -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services