Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Feb 16, 2016 at 2:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Not to be neglected also is that (I believe) this gives the right answer,
>> whereas isolationtester's existing query is currently completely broken by
>> parallel queries, and it doesn't understand non-conflicting lock modes
>> either. (It did, at least partially, before commit 38f8bdcac4982215;
>> I am not sure that taking out the mode checks was a good idea. But
>> putting them back would make the query slower yet.)
> The reason I took that out is because it breaks the deadlock-soft
> test. It's possible to have a situation where no granted lock
> conflicts with an awaited lock. If that happens, the old query
> wrongly concluded that the waiting process was not in fact waiting.
> (Consider A hold AccessShareLock, B awaits AccessExclusiveLock, C now
> requests AccessShareLock and *waits*.)
Ah, well, with this patch the deadlock-soft test still passes.
> As for the patch itself, I'm having trouble grokking what it's trying
> to do. I think it might be worth having a comment defining precisely
> what we mean by "A blocks B". I would define "A blocks B" in general
> as either A holds a lock which conflicts with one sought by B
> (hard-blocked) or A awaits a lock which conflicts with one sought by B
> and precedes it in the wait queue (soft-blocked).
Yes, that is exactly what I implemented ... and it's something you can't
find out from pg_locks. I'm not sure how that view could be made to
expose wait-queue ordering.
> For parallel queries, there's a further relevant distinction when we
> say "A blocks B". We might mean either that (1) process B cannot
> resume execution until the lock conflict is resolved or (2) the group
> leader for process B cannot complete the current parallel operation
> until the lock conflict is resolved.
The definition I used in this patch is "some member of A's lock group
blocks some member of B's lock group", because that corresponds directly
to whether A is preventing B's query from completing, which is what
isolationtester wants to know --- and, I would argue, it's generally what
any client would want to know. 99.9% of clients would just as soon not be
aware of parallel workers lurking underneath the pg_backend_pid() values
that they see for their sessions.
regards, tom lane