Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

From Robert Haas
Subject Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.
Msg-id CA+TgmoZ8tJf2sktXNmJvR7gX4VWrqPctD=GkOKFuCMwGzgi1QA@mail.gmail.com
In response to Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Feb 22, 2016 at 7:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No, you don't.  I've spent a good deal of time thinking about that problem.
>> [ much snipped ]
>> Unless I'm missing something, though, this is a fairly obscure
>> problem.  Early release of catalog locks is desirable, and locks on
>> scanned tables should be the same as (or weaker than) those already held
>> by the master.  Other cases are rare.  I think.  It would be good to
>> know if you think otherwise.
>
> After further thought, what I think about this is that it's safe so long
> as parallel workers are strictly read-only.  Given that, early lock
> release after user table access is okay for the same reasons it's okay
> after catalog accesses.  However, this is one of the big problems that
> we'd have to have a solution for before we ever consider allowing
> read-write parallelism.

Actually, I don't quite see what read-only vs. read-write queries have
to do with this particular issue.  We retain relation locks on target
relations until commit, regardless of whether those locks are
AccessShareLock, RowShareLock, or RowExclusiveLock.  As far as I
understand it, this isn't because anything would fail horribly if we
released those locks at end of query, but rather because we think that
releasing those locks early might result in unpleasant surprises for
client applications.  I'm actually not really convinced that's true: I
will grant that it might be surprising to run the same query twice in
the same transaction and get different tuple descriptors, but it might
also be surprising to get different rows, which READ COMMITTED allows
anyway.  And I've met a few users who were pretty surprised to find
out that they couldn't do DDL on table A and the blocking session
mentioned table A nowhere in the currently-executing query.

The main issues with allowing read-write parallelism that I know of
off-hand are:

* Updates or deletes might create new combo CIDs.  In order to handle
that, we'd need to store the combo CID mapping in some sort of
DSM-based data structure which could expand as new combo CIDs were
generated.

* Relation extension locks, and a few other types of heavyweight
locks, are used for mutual exclusion of operations that would need to
be serialized even among cooperating backends.  So group locking would
need to be enhanced to handle those cases differently, or some other
solution would need to be found.  (I've done some more detailed
analysis of possible solutions, most of which has been posted
to -hackers in various emails at one time or another; I'll refrain
from diving into all the details in this missive.)
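
To make the first item above concrete, here is a minimal sketch of a combo CID mapping that grows as new (cmin, cmax) pairs show up and that several cooperating workers can share. It is a toy in-process model, not PostgreSQL code: the class name SharedComboCidMap, its methods, and the use of Python threads and a threading.Lock in place of backends and a DSM-resident structure are all assumptions made for illustration.

```python
import threading


class SharedComboCidMap:
    """Toy stand-in for a growable, shared (cmin, cmax) -> combo CID table."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for whatever guards the shared area
        self._pairs = []               # combo CID -> (cmin, cmax)
        self._ids = {}                 # (cmin, cmax) -> combo CID

    def get_combo_cid(self, cmin, cmax):
        """Return the combo CID for this pair, allocating a new one if needed."""
        with self._lock:
            key = (cmin, cmax)
            combo = self._ids.get(key)
            if combo is None:
                combo = len(self._pairs)
                self._pairs.append(key)  # the structure expands on demand
                self._ids[key] = combo
            return combo

    def resolve(self, combo):
        """Map a combo CID back to its (cmin, cmax) pair."""
        with self._lock:
            return self._pairs[combo]


def run_worker(shared, pairs, out):
    """Each simulated worker asks for combo CIDs for the pairs it generates."""
    out.extend(shared.get_combo_cid(cmin, cmax) for cmin, cmax in pairs)


shared_map = SharedComboCidMap()
pairs = [(1, 3), (2, 5), (1, 3)]
results_a, results_b = [], []
threads = [
    threading.Thread(target=run_worker, args=(shared_map, pairs, results_a)),
    threading.Thread(target=run_worker, args=(shared_map, pairs[::-1], results_b)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever worker asks first allocates the ID; every other worker then sees the same ID for the same pair, which is the property a shared mapping would have to preserve.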

But those are separate from the question of whether parallel workers
need to transfer any heavyweight locks they accumulate on non-scanned
tables back to the leader.
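
As a rough illustration of the relation extension item in the list above, the following sketch shows one way a conflict check could treat group members as non-conflicting for ordinary relation locks while still serializing them on locks used purely for mutual exclusion. It is only a sketch of the first option mentioned there, not the actual lock manager: the names Holder, conflicts, RELATION, RELATION_EXTEND and ALWAYS_EXCLUSIVE are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lock-tag kinds, named for illustration only.
RELATION = "relation"
RELATION_EXTEND = "relation_extend"

# Kinds that must stay mutually exclusive even inside one lock group.
ALWAYS_EXCLUSIVE = {RELATION_EXTEND}


@dataclass(frozen=True)
class Holder:
    pid: int           # the backend itself
    group_leader: int  # pid of the lock group leader (own pid if not in a group)


def conflicts(holder, requester, kind, modes_conflict):
    """Decide whether requester must wait for a lock that holder has."""
    if not modes_conflict:
        return False   # compatible lock modes never conflict
    if holder.pid == requester.pid:
        return False   # a backend does not conflict with itself
    same_group = holder.group_leader == requester.group_leader
    if same_group and kind not in ALWAYS_EXCLUSIVE:
        return False   # group members share ordinary locks
    return True        # different groups, or a kind that is always exclusive


leader = Holder(pid=100, group_leader=100)
worker = Holder(pid=101, group_leader=100)
outsider = Holder(pid=200, group_leader=200)
```

With these definitions, a worker never waits on its leader for an ordinary relation lock, but it does wait when both want to extend the same relation.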

> So what distresses me about the current situation is that this is a large
> stumbling block that I don't see documented anywhere.  It'd be good if you
> transcribed some of this conversation into README.parallel.
>
> (BTW, I don't believe your statement that transferring locks back to the
> master would be deadlock-prone.  If the lock system treats locks held by
> a lock group as effectively all belonging to one entity, then granting a
> lock identical to one already held by another group member should never
> fail.  I concur that it might be expensive performance-wise, though it
> hardly seems like this would be a dominant factor compared to all the
> other costs of setting up and tearing down parallel workers.)

I don't mean that the heavyweight lock acquisition itself would fail;
I agree with your analysis on that.  I mean that you'd have to design
the protocol for the leader and the worker to communicate very
carefully in order for it not to get stuck.  Right now, before any
workers are running, the leader initializes the DSM with all the data
the workers will need, and after that data flows strictly from workers
to leader.  So the workers could send control messages
indicating heavyweight locks that they held to the leader, and that
would be fine.  Then the leader would need to read those messages and
do something with them, after which it would need to tell the workers
that they could now exit.  You'd need to make sure there was no
situation in which that handshake could get stuck, for example
because the leader was waiting for a tuple from the worker while the
worker was waiting for a lock-acquisition-confirmation from the
leader.  That particular thing is probably not an issue but hopefully
it illustrates the sort of hazard I'm concerned about.
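
The handshake described above can be sketched with two queues, one per direction. This is a toy model under stated assumptions, not the real parallel-query machinery: Python threads and queue.Queue stand in for backends and shared-memory message queues, and the message names ("tuple", "locks", "exit_ok") and the lock description are made up. The worker reports its locks only after its last tuple, so the leader is never waiting for a tuple while the worker waits for the confirmation; the timeouts turn a stuck handshake into an exception instead of a hang.

```python
import queue
import threading

TIMEOUT = 2.0  # seconds; a stuck handshake raises queue.Empty instead of hanging


def worker(to_leader, from_leader, tuples, held_locks, log):
    for t in tuples:
        to_leader.put(("tuple", t))
    # Send the lock report only after the final tuple.
    to_leader.put(("locks", held_locks))
    # Wait for the leader's confirmation before exiting.
    if from_leader.get(timeout=TIMEOUT) == "exit_ok":
        log.append("worker exited")


def leader(from_worker, to_worker, rows, adopted_locks):
    while True:
        kind, payload = from_worker.get(timeout=TIMEOUT)
        if kind == "tuple":
            rows.append(payload)
        elif kind == "locks":
            adopted_locks.extend(payload)  # stands in for taking over the locks
            to_worker.put("exit_ok")       # tell the worker it may exit
            return


to_leader, to_worker = queue.Queue(), queue.Queue()
rows, adopted, log = [], [], []
w = threading.Thread(
    target=worker,
    args=(to_leader, to_worker, [1, 2, 3],
          [("some_table", "RowExclusiveLock")], log),
)
w.start()
leader(to_leader, to_worker, rows, adopted)
w.join()
```

If the worker instead sent its lock report in the middle of the tuple stream and blocked on the reply while the leader was blocked elsewhere, both sides could end up waiting on each other; keeping the ordering fixed is what avoids that in this sketch.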

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


