Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl
Дата
Msg-id 22842.1455474783@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl  (Robert Haas <robertmhaas@gmail.com>)
Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> First, the overall concept here is that processes can either be a
> member of a lock group or a member of no lock group.

Check.

> Second, I can review the data structure changes and the associated
> invariants.

The data structure additions seemed relatively straightforward, though
I did wonder why you added groupLeader to PROCLOCK.  Why is that not
redundant with tag.myProc->lockGroupLeader?  If it isn't redundant,
it's inadequately documented.  If it is, seems better to lose it.

Also, isn't the comment on this PGPROC field incorrect:
   PGPROC     *lockGroupLeader;    /* lock group leader, if I'm a follower */

ie shouldn't you s/follower/member/ ?

> ... the lock manager lock that protects the fields in a
> given PGPROC is chosen by taking pgprocno modulo the number of lock
> manager partitions.

pgprocno of the specific PGPROC, or of the group leader?  If it's
the former, I'm pretty suspicious that there are deadlock and/or
linked-list-corruption hazards here.  If it's the latter, at least
the comments around this are misleading.

> Each PROCLOCK now has a new groupLeader flag.  While a PGPROC's
> lockGroupLeader may be NULL if that process is not involved in group
> locking, a PROCLOCK's groupLeader always has a non-NULL value; it
> points back to the PGPROC that acquired the lock.

This is not what the comment on it says; and your prose explanation
here implies that it should actually be equal to tag.myProc, or else
you are using some definition of "acquired the lock" that I don't
follow at all.  There could be lots of PGPROCs that have acquired
a lock in one sense or another; which one do you mean?

> With respect to pg_locks - and for that matter also pg_stat_activity -
> I think you are right that improvement is needed.

This is really the core of my concern at the moment.  I think that
isolationtester is probably outright broken for any situation where the
queries-under-test are being parallel executed, and so will be any other
client that's trying to identify who blocks whom from pg_locks.

> The simplest thing we could do to make that easier is, in
> pg_stat_activity, have parallel workers advertise the PID that
> launched them in a new field; and in pg_locks, have members of a lock
> group advertise the leader's PID in a new field.

That would be simple for us, but it would break every existing client-side
query that tries to identify blockers/blockees; and not only would those
queries need work but they would become very substantially more complex
and slower (probably at least 4-way joins not 2-way).  We already know
that isolationtester's query has performance problems in the buildfarm.
I think more thought is needed here, and that this area should be
considered a release blocker.  It's certainly a blocker for any thought
of turning parallel query on by default.

> I hope this helps and please let me know what more you think I should do.

At the least you need to make another pass over the comments and README
addition.  I highlighted above a few critical inaccuracies, and it's not
hard to find a lot of less-critical typos in the comments in that commit.
Transposing some of what you've written here into the README would likely
be a good thing too.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: proposal: enhancing slow query log, and autoexplain log about waiting on lock before query exec time
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl