Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2019-Jun-14, Alvaro Herrera wrote:
>
>> I think there are worse problems here. I tried the attached isolation
>> spec. Note that the only difference in the two permutations is that s0
>> finishes earlier in one than the other; yet the first one works fine and
>> the second one hangs until killed by the 180s timeout. (s3 isn't
>> released for a reason I'm not sure I understand.)
>
> Actually, those behaviors both seem correct to me now that I look
> closer. So this was a false alarm. In the code before de87a084c0, the
> first permutation deadlocks, and the second permutation hangs. The only
> behavior change is that the first one no longer deadlocks, which is the
> desired change.
>
> I'm still trying to form a case to exercise the case of skip_tuple_lock
> having the wrong lifetime.

Hm… I think it was an oversight on my part not to give skip_tuple_lock the
same lifetime as have_tuple_lock or first_time (and to initialize it to
false at the same point). Even if it doesn't break anything in an obvious
way right now, a backward jump to the l3 label will leave skip_tuple_lock
uninitialized, which is very dangerous for any future code that relies
on its value.
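
To make the lifetime concern concrete, here is a minimal standalone sketch.
It is not the heapam.c code: lock_tuple_sketch(), its parameters and the
counters are made up for illustration, and only the flag names and the l3
label come from this thread. It shows the flags declared at function scope
and initialized once, so their values stay well defined across every
backward jump to l3.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the retry loop discussed above.  Returns how
 * many times the (pretend) tuple lock was acquired.
 */
int
lock_tuple_sketch(bool already_locked, int retries_needed)
{
    /* Function scope: initialized once, values survive each "goto l3". */
    bool first_time = true;
    bool have_tuple_lock = false;
    bool skip_tuple_lock = false;
    int  acquisitions = 0;
    int  attempts = 0;

l3:
    attempts++;

    if (first_time)
    {
        first_time = false;
        /* Decided on the first pass only; later passes rely on the value. */
        skip_tuple_lock = already_locked;
    }

    if (!skip_tuple_lock && !have_tuple_lock)
    {
        have_tuple_lock = true;     /* pretend we took the tuple lock */
        acquisitions++;
    }

    if (attempts <= retries_needed)
        goto l3;                    /* backward jump; flags keep their values */

    return acquisitions;
}
```

If skip_tuple_lock were instead declared without an initializer inside a
block after the label, nothing would guarantee its value on a pass that
does not assign it, which is the hazard I mean.
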
> The fact that both permutations behave differently, even though the
> only difference is where s0 commits relative to the s3_share step, is an
> artifact of our unusual tuple locking implementation.

Cheers,
Oleksii