Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.
Дата
Msg-id CA+TgmoZqOpSJk1H9U4JxbQuX5Ywzb0=ZVRnzZzBQj3RCJuzbLA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Tue, Feb 16, 2016 at 2:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> I wonder if we shouldn't just expose a 'which pid is process X waiting
>> for' API, implemented serverside. That's generally really useful, and
>> looks like it's actually going to be less complicated than that
>> query... And it's surely going to be faster.
>
> Attached is a draft patch for a new function that reports the set of PIDs
> directly blocking a given PID (that is, holding or awaiting conflicting
> locks on the lockable object it's waiting for).
>
> I replaced isolationtester's pg_locks query with this, and found that
> it's about 9X faster in a normal build, and 3X faster with
> CLOBBER_CACHE_ALWAYS turned on.  That would give us some nice headroom
> for the isolation tests with CLOBBER_CACHE_ALWAYS animals.  (Note that
> in view of
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhor&dt=2016-02-14%2007%3A38%3A37
> we still need to do *something* about the speed of the new deadlock-hard
> test; this patch could avoid the need to dumb down or slow down that test
> even further.)
>
> Not to be neglected also is that (I believe) this gives the right answer,
> whereas isolationtester's existing query is currently completely broken by
> parallel queries, and it doesn't understand non-conflicting lock modes
> either.  (It did, at least partially, before commit 38f8bdcac4982215;
> I am not sure that taking out the mode checks was a good idea.  But
> putting them back would make the query slower yet.)

The reason I took that out is because it breaks the deadlock-soft
test.  It's possible to have a situation where no granted lock
conflicts with an awaited lock.  If that happens, the old query
wrongly concluded that the waiting process was not in fact waiting.
(Consider A hold AccessShareLock, B awaits AccessExclusiveLock, C now
requests AccessShareLock and *waits*.)

As for the patch itself, I'm having trouble grokking what it's trying
to do.  I think it might be worth having a comment defining precisely
what we mean by "A blocks B".  I would define "A blocks B" in general
as either A holds a lock which conflicts with one sought by B
(hard-blocked) or A awaits a lock which conflicts with one sought by B
and precedes it in the wait queue (soft-blocked).  I have wondered
before if we shouldn't modify pg_locks to expose the wait-queue
ordering; without that, you can't reliably determine in general
whether A soft-blocks B, which means every view anyone has ever
written over pg_locks that purports to say who blocks who is
necessarily buggy.

For parallel queries, there's a further relevant distinction when we
say "A blocks B".  We might mean either that (1) process B cannot
resume execution until the lock conflict is resolved or (2) the group
leader for process B cannot complete the current parallel operation
until the lock conflict is resolved.  If you're trying to figure out
why one particular member of a parallel group is stuck, you want to
answer question #1.  If you're trying to figure out what all the
things that need to get out of the way to finish the query, you want
to answer question #2.  I think this function is aiming to answer
question #2, not question #1, but I'm less clear on the reason behind
that definitional choice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Artur Zakirov
Дата:
Сообщение: Re: Fuzzy substring searching with the pg_trgm extension
Следующее
От: Teodor Sigaev
Дата:
Сообщение: Re: WIP: Access method extendability