Re: parallel mode and parallel contexts
From | Robert Haas |
---|---|
Subject | Re: parallel mode and parallel contexts |
Date | |
Msg-id | CA+TgmoYL4mXLZe6b9x68+z8SgHFuDiS0frwqDvAP_T8nD0HGTQ@mail.gmail.com |
In response to | Re: parallel mode and parallel contexts (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: parallel mode and parallel contexts (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Tue, Apr 28, 2015 at 2:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Apr 26, 2015 at 2:03 PM, Euler Taveira <euler@timbira.com.br> wrote:
>> On 19-03-2015 15:13, Robert Haas wrote:
>>> On Wed, Mar 18, 2015 at 5:36 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>>>> Reading the README first, the rest later. So you can comment on my
>>>> comments, while I actually look at the code. Parallelism, yay!
>>>
>> I'm also looking at this piece of code and found some low-hanging fruit.
>
> Thanks.

0001 and 0003 look good, but 0002 is actually unrelated to this patch. So, I think it makes sense to split up this patch in two. There's no real debate, AFAICS, about anything in the patch other than the heavyweight locking stuff. So I'd like to go ahead and commit the rest. That's attached here as parallel-mode-v10.patch.

The remaining question here is what to do about the heavyweight locking. There seem to be a couple of possible approaches to that, and perhaps with this specific issue separated out we can focus in on that aspect of the problem. Let me start by restating my goals for that portion of the patch:

1. Avoid undetected deadlocks. For example, suppose A and A1 are cooperating parallel processes, and B is an unrelated process. If A1 waits for a lock held by B, and B waits for a lock held by A, and A is waiting for A1 to finish executing before winding up parallelism, the deadlock detector currently won't detect that. Either the deadlock detector needs to know that A waits for A1, or it needs to view the lock held by A as also held by A1, so that it can detect a cycle A1 -> B -> A1.

2. Avoid unprincipled deadlocks. For example, suppose A holds AccessExclusiveLock on a relation R and attempts a parallel sequential scan on R. It launches a worker A1. If A1 attempts to lock R and blocks, and A then waits for A1, we have a deadlock. Detecting this deadlock (as per goal 1) is better than not detecting it, but it's not good enough.
Either A shouldn't have attempted a parallel sequential scan in the first place, or it should run to completion without deadlocking with its own workers.

3. Allow parallel DDL. For example, parallel VACUUM FULL or parallel CLUSTER. We don't have code for these things today, but the locking architecture we choose should not preclude doing them later.

4. Avoid restricting parallelism based on the history of the transaction. If BEGIN; COPY foo FROM ...; SELECT ... FROM foo can use parallelism, then BEGIN; TRUNCATE foo; COPY foo FROM ...; SELECT ... FROM foo should also be able to use parallelism.

5. Impose as few restrictions as possible on what can be done in parallel mode. For example, the current version of this patch (attached as parallel-mode-locking.patch) allows parallel workers to do system cache lookups, where a relation lock is taken and released, but they cannot take and retain any heavyweight locks; so, for example, a user-defined function that runs an SQL query against some unrelated table and returns the results isn't parallel-safe right now. This is not ideal.

That's all I've got for goals. How do we achieve those goals? So far we've hit on three approaches to deadlock detection that I think are at least somewhat plausible. None of them are perfect:

D-1. Teach the deadlock detector about implicit edges between cooperating parallel processes. Upthread, I wrote about some problems with this approach:
http://www.postgresql.org/message-id/CA+TgmoZ0tOPvfuz5BmEUqmdQ9xYQHzkG4jjJXduYDgQJXmPFbQ@mail.gmail.com

D-2. Copy locks from the master to every worker, and forbid workers from acquiring any new, long-lived locks. (Locks that are taken and released quickly, like a system catalog lock for a syscache lookup, or a relation extension lock, won't cause deadlock.)
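[Editorial note: the goal-1 scenario, and why an extra edge makes it detectable, can be illustrated with a toy waits-for graph. This is a hedged sketch in C, not PostgreSQL's actual deadlock detector in deadlock.c, which walks shared lock tables rather than an adjacency matrix.]

```c
#include <stdbool.h>

#define NPROC 3
enum { A = 0, A1 = 1, B = 2 };  /* leader, its worker, unrelated backend */

/* waits_for[i][j]: process i waits for a lock held by process j */
static bool waits_for[NPROC][NPROC];

static bool visit(int node, bool on_stack[], bool visited[])
{
    if (on_stack[node])
        return true;            /* back edge: a deadlock cycle */
    if (visited[node])
        return false;
    visited[node] = on_stack[node] = true;
    for (int j = 0; j < NPROC; j++)
        if (waits_for[node][j] && visit(j, on_stack, visited))
            return true;
    on_stack[node] = false;
    return false;
}

/* Depth-first search for any cycle in the waits-for graph. */
static bool has_deadlock(void)
{
    for (int start = 0; start < NPROC; start++)
    {
        bool on_stack[NPROC] = {false};
        bool visited[NPROC] = {false};
        if (visit(start, on_stack, visited))
            return true;
    }
    return false;
}
```

With only the visible edges A1 -> B and B -> A there is no cycle, so the deadlock goes undetected; adding the implicit edge A -> A1 (as in D-1), or copying A's lock to A1 so the second edge becomes B -> A1 (as in D-2), closes a cycle the unmodified search can see.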
Copying locks ensures that whenever there is a dangerous structure like A1 -> B -> A in the waits-for graph, there will also be a cycle A1 -> B -> A1, so the unmodified deadlock detector will detect the deadlock. Forbidding workers from acquiring any new, long-lived locks prevents locks taken after the copying step from causing similar problems. The restriction against new, long-lived locks is annoying. Also, a transaction holding many locks (like pg_dump) and using many workers will use an enormous number of lock table slots.

D-3. Teach the deadlock detector to consider all locks held by any member of the locking group as held by the leader. This guarantees that all deadlocks between the parallel group and other processes in the system will be detected. When a process in the group requests a lock that conflicts with one already held by another group member, but not with any held by a process outside the group, grant it anyway, and without waiting behind pending conflicting lock requests. At the end of parallelism, transfer any locks held by workers and not released back to the leader.

In a world where lock requests within a group can't conflict, there's no such thing as a deadlock within a parallel group, so we don't need to detect deadlocks of that type, and there will never be any unprincipled deadlocks within a group because there will never be any deadlocks within a group at all.

The biggest problem with this approach is figuring out how to make it safe. I originally proposed this approach on the "group locking" thread and it kind of went down in flames. If two parallel backends are trying to extend the same relation at the same time, or lock the same tuple or page at the same time, those requests have got to conflict. In those cases, we are using locking to enforce real mutual exclusion. However, that doesn't seem like a serious problem.
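[Editorial note: schematically, D-3's modified conflict rule might look like the sketch below. The LOCKTAG_* names come from PostgreSQL's lock.h, but the function and its signature are invented here purely for illustration; the real conflict resolution in lock.c is far more involved.]

```c
#include <stdbool.h>

/* Subset of PostgreSQL's lock tag types (see src/include/storage/lock.h). */
typedef enum
{
    LOCKTAG_RELATION,
    LOCKTAG_RELATION_EXTEND,
    LOCKTAG_PAGE,
    LOCKTAG_TUPLE,
    LOCKTAG_OBJECT
} LockTagType;

/*
 * Hypothetical conflict test under approach D-3: lock modes that would
 * normally conflict do not conflict between members of the same lock
 * group, except for lock types that enforce true mutual exclusion
 * (relation extension, page, and tuple locks).
 */
static bool
group_aware_conflict(int holder_group, int waiter_group,
                     LockTagType tag, bool modes_conflict)
{
    if (!modes_conflict)
        return false;           /* modes compatible: never a conflict */
    if (holder_group == waiter_group &&
        tag != LOCKTAG_RELATION_EXTEND &&
        tag != LOCKTAG_PAGE &&
        tag != LOCKTAG_TUPLE)
        return false;           /* same group: relation/object locks pass */
    return true;
}
```

The design choice this encodes is exactly the one argued for in the surrounding text: only locks used for logical, MVCC-style protection are exempted within a group, while locks used for physical mutual exclusion keep their normal semantics.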
First, locks of those types will not be held at the start of parallelism; starting parallelism while you hold a lock of one of those types would be ridiculous. Second, locks of those types are never long-lived: you take them, do what you need to do, and then release them. Finally, I don't believe you ever take any other heavyweight locks while holding them; I *think* they're always the last heavyweight lock you take. So I think we could allow these locks to conflict normally without introducing any deadlock risk.

The only locks that we need to consider as mutually non-conflicting are relation and database object locks. Those are a different kettle of fish: when we change a relation's relfilenode, for example, we must keep the relation locked for the duration of the transaction because of MVCC concerns. The new pg_class tuple isn't visible to anyone else yet, and any further modifications to the relation must be done using the new relfilenode. But none of that matters for a parallel worker, which shares the same snapshot.

There are still cases that do matter; for example, if a parallel backend could REINDEX a relation being concurrently scanned by another parallel backend in the same group, that would cause problems, because REINDEX uses backend-local state that wouldn't be shared. But these cases can also arise without parallelism, due to cursors, and there is an existing mechanism (CheckTableNotInUse) to prevent them. So I think that's OK, too. If not, we can fix other problems as we find them; I don't see any major architectural problems here.

The other big problem with this approach is figuring out how to implement it.
It doesn't work to have the worker processes manipulate the leader's lock-related data structures as if they were the parent, both because the locking code relies on the shared data structures matching its backend-private data structures, and because if we do block waiting for a lock, there could be more than one waiting process in the lock group at a time, and we need to know which specific processes are waiting so we can wake them up at the correct time. Instead, I think the way to do this is to add an additional field to each PROCLOCK storing a pointer to the leader's PGPROC; FindLockCycleRecurse() can check whether two PGPROCs have the same leader pointer instead of whether the PGPROCs themselves are the same. It can also refuse to follow a waits-for edge if that edge is of one of the types we do want to conflict within a locking group (e.g. relation extension).

---

I'm OK with any of these approaches if we can agree on how to solve the problems they respectively present. Currently, I've got an implementation of D-2; I started with an implementation of D-3, which was flawed, but perhaps the idea is salvageable, per the discussion above. Andres has consistently advocated for D-1, but see the link above for where I'm stuck in terms of implementing that. There may be another approach we haven't come up with yet, too, for all I know, and that's fine too if it accomplishes the goals listed above.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
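[Editorial note: the PROCLOCK bookkeeping proposed in the message above can be pictured roughly as follows. The types are heavily simplified stand-ins; the groupLeader field and the same_lock_group helper are hypothetical, representing the extra pointer a FindLockCycleRecurse()-style search would compare instead of comparing the PGPROCs themselves.]

```c
#include <stdbool.h>

/* Minimal stand-in for a backend's shared process entry. */
typedef struct PGPROC
{
    int pid;
} PGPROC;

/* Simplified stand-in for a PROCLOCK entry with the proposed extra field. */
typedef struct PROCLOCK
{
    PGPROC *myProc;       /* backend holding or awaiting the lock */
    PGPROC *groupLeader;  /* leader's PGPROC; equals myProc outside a group */
} PROCLOCK;

/*
 * The cycle search would treat two lock holders as the same "process"
 * whenever their leader pointers match, so edges inside a parallel
 * group never form a reportable deadlock cycle.
 */
static bool
same_lock_group(const PROCLOCK *a, const PROCLOCK *b)
{
    return a->groupLeader == b->groupLeader;
}
```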
Attachments