Idea for fixing parallel pg_dump's lock acquisition problem

From
Tom Lane
Date:
We just had another complaint (bug #15767) about parallel dump's
inability to cope with concurrent lock requests.  The problem is
well described by the comments for lockTableForWorker():

 * Acquire lock on a table to be dumped by a worker process.
 *
 * The master process is already holding an ACCESS SHARE lock.  Ordinarily
 * it's no problem for a worker to get one too, but if anything else besides
 * pg_dump is running, there's a possible deadlock:
 *
 * 1) Master dumps the schema and locks all tables in ACCESS SHARE mode.
 * 2) Another process requests an ACCESS EXCLUSIVE lock (which is not granted
 *      because the master holds a conflicting ACCESS SHARE lock).
 * 3) A worker process also requests an ACCESS SHARE lock to read the table.
 *      The worker is enqueued behind the ACCESS EXCLUSIVE lock request.
 * 4) Now we have a deadlock, since the master is effectively waiting for
 *      the worker.  The server cannot detect that, however.
 *
 * To prevent an infinite wait, prior to touching a table in a worker, request
 * a lock in ACCESS SHARE mode but with NOWAIT.  If we don't get the lock,
 * then we know that somebody else has requested an ACCESS EXCLUSIVE lock and
 * so we have a deadlock.  We must fail the backup in that case.

Failing the whole backup is, of course, not a nice outcome.
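For illustration, the queue behaviour behind the four-step scenario can be modeled with a toy lock table (plain Python, purely illustrative and not PostgreSQL code; the conflict rules are reduced to just the two modes involved):

```python
# Toy model of a per-table lock queue.  ACCESS SHARE (AS) conflicts with
# ACCESS EXCLUSIVE (AE), and a new request queues FIFO behind any
# conflicting pending request, as the real lock manager does.

CONFLICTS = {("AS", "AE"), ("AE", "AS"), ("AE", "AE")}

class LockQueue:
    def __init__(self):
        self.granted = []   # (session, mode) pairs holding the lock
        self.waiting = []   # FIFO queue of (session, mode) requests

    def request(self, session, mode):
        # A conflict with any granted lock, or with any earlier waiter,
        # puts us at the back of the queue.
        blockers = self.granted + self.waiting
        if any((mode, m) in CONFLICTS for _, m in blockers):
            self.waiting.append((session, mode))
            return "waits"
        self.granted.append((session, mode))
        return "granted"

t = LockQueue()
r1 = t.request("master", "AS")   # 1) granted
r2 = t.request("other",  "AE")   # 2) waits behind master's AS lock
r3 = t.request("worker", "AS")   # 3) waits, queued behind the AE request
print(r1, r2, r3)
# 4) The master now waits (outside the server) for the worker's dump job,
#    closing a cycle master -> worker -> other -> master.  The server's
#    deadlock detector cannot see the master -> worker edge, because it
#    exists only in pg_dump's process coordination, not in any lock table.
```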

While thinking about that, it occurred to me that we could close the
gap if the server somehow understood that the master was waiting for
the worker.  And it's actually not that hard to make that happen:
we could use advisory locks.  Consider a dance like the following:

1. Master has A.S. lock on table t, and dispatches a dump job for t
to worker.

2. Worker chooses some key k, does pg_advisory_lock(k), and sends k
back to master.

3. Worker attempts to get A.S. lock on t.  This might block, if some
other session has a pending lock request on t.

4. Upon receipt of message from worker, master also does
pg_advisory_lock(k).  This blocks, but now the server can see that a
deadlock exists, and after deadlock_timeout elapses it will fix the
deadlock by letting the worker bypass the other pending lock request.

5. Once worker has the lock on table t, it does pg_advisory_unlock(k)
to release the master.

6. Master also does pg_advisory_unlock(k), and goes on about its business.
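The happy-path ordering of the six steps can be sketched with threads standing in for the two pg_dump processes (a threading.Lock plays the role of pg_advisory_lock(k), the table lock is assumed to be granted immediately, and the real payoff, server-side deadlock resolution in step 4, of course only happens inside PostgreSQL itself):

```python
import threading

events = []
advisory_k = threading.Lock()     # stands in for the advisory lock on key k
k_chosen = threading.Event()      # "worker has sent k back to the master"

def worker():
    advisory_k.acquire()          # 2) pg_advisory_lock(k)
    events.append("worker: holds k")
    k_chosen.set()                #    ...and reports k to the master
    events.append("worker: got AS lock on t")   # 3) granted at once here
    advisory_k.release()          # 5) pg_advisory_unlock(k)

def master():                     # 1) master already holds AS lock on t
    k_chosen.wait()               #    learns k from the worker
    advisory_k.acquire()          # 4) pg_advisory_lock(k): blocks until 5
    events.append("master: unblocked, worker has t")
    advisory_k.release()          # 6) pg_advisory_unlock(k)

m = threading.Thread(target=master)
w = threading.Thread(target=worker)
m.start(); w.start(); w.join(); m.join()
print(events)
```

Because the worker holds the advisory lock continuously from step 2 through step 5, the master's step-4 acquire can only succeed once the worker has the table lock, which is the whole point of the handshake.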


I've tested that the server side of this works, for either order of steps
3 and 4.  It seems like mostly just a small matter of programming to teach
pg_dump to do this, although there are some questions to resolve, mainly
how we choose the advisory lock keys.  If there are user applications
running that also use advisory locks, there could be unwanted
interference.  One easy improvement is to use pg_try_advisory_lock(k) in
step 2, and just choose a different k if the lock's in use.  Perhaps,
since we don't expect that the locks would be held long, that's
sufficient --- but I suspect that users might wish for some pg_dump
options to restrict the set of keys it could use.
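That retry idea might look like the following sketch, where try_advisory_lock stands in for issuing SELECT pg_try_advisory_lock(k) on the worker's connection, and the candidate key range is a hypothetical pg_dump option:

```python
def choose_advisory_key(try_advisory_lock, candidates):
    """Return the first key we manage to lock, or None if all are taken."""
    for k in candidates:
        if try_advisory_lock(k):
            return k
    return None

# Toy stand-in for the server side: keys 100 and 101 are already held
# by some unrelated application using advisory locks.
held_by_others = {100, 101}
locked_by_us = set()

def fake_try_lock(k):
    if k in held_by_others or k in locked_by_us:
        return False
    locked_by_us.add(k)
    return True

key = choose_advisory_key(fake_try_lock, range(100, 110))
print(key)   # first key not already in use
```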

Another point is that the whole dance is unnecessary in the normal
case, so maybe we should only do this if an initial attempt to get
the lock on table t fails.  However, LOCK TABLE NOWAIT throws an
error if it can't get the lock, so this would require using a
subtransaction or starting a whole new transaction in the worker,
so maybe that's more trouble than it's worth.  Some performance
testing might be called for.  There's also the point that the code
path would go almost entirely untested if it's not exercised always.
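A sketch of that fast-path-first variant, using a savepoint as the subtransaction: here execute() is a stand-in for issuing SQL on the worker's connection, LockNotAvailable stands in for SQLSTATE 55P03, and the dance callback plus all names are hypothetical, not pg_dump code:

```python
class LockNotAvailable(Exception):
    """Stand-in for SQLSTATE 55P03 (lock_not_available) from NOWAIT."""

def lock_table_for_worker(execute, table, dance):
    execute("SAVEPOINT dump_lock")
    try:
        execute(f"LOCK TABLE {table} IN ACCESS SHARE MODE NOWAIT")
        execute("RELEASE SAVEPOINT dump_lock")
        return "fast"
    except LockNotAvailable:
        # NOWAIT threw an error; rolling back to the savepoint recovers
        # without aborting the worker's whole transaction, and only then
        # do we pay for the advisory-lock dance.
        execute("ROLLBACK TO SAVEPOINT dump_lock")
        dance(table)
        return "slow"

# Toy driver: pretend the table lock is contended, forcing the slow path.
log = []
def execute(sql):
    log.append(sql)
    if "NOWAIT" in sql:
        raise LockNotAvailable
path = lock_table_for_worker(execute, "t", lambda t: log.append(f"dance({t})"))
print(path)
```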

Lastly, we'd probably want both the master and worker processes to
run with small values of deadlock_timeout, so as to reduce the
time wasted before the server breaks the deadlock.  Are there any
downsides to that?  (If so, again maybe it's not worth the trouble,
since the typical case is that no wait is needed.)

Thoughts?  I'm not volunteering to write this right now, but maybe
somebody else will take up the project.

            regards, tom lane



Re: Idea for fixing parallel pg_dump's lock acquisition problem

From
Robert Haas
Date:
On Wed, Apr 17, 2019 at 11:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While thinking about that, it occurred to me that we could close the
> gap if the server somehow understood that the master was waiting for
> the worker.  And it's actually not that hard to make that happen:
> we could use advisory locks.  Consider a dance like the following:
>
> 1. Master has A.S. lock on table t, and dispatches a dump job for t
> to worker.
>
> 2. Worker chooses some key k, does pg_advisory_lock(k), and sends k
> back to master.
>
> 3. Worker attempts to get A.S. lock on t.  This might block, if some
> other session has a pending lock request on t.
>
> 4. Upon receipt of message from worker, master also does
> pg_advisory_lock(k).  This blocks, but now the server can see that a
> deadlock exists, and after deadlock_timeout elapses it will fix the
> deadlock by letting the worker bypass the other pending lock request.
>
> 5. Once worker has the lock on table t, it does pg_advisory_unlock(k)
> to release the master.
>
> 6. Master also does pg_advisory_unlock(k), and goes on about its business.

Neat idea.

> I've tested that the server side of this works, for either order of steps
> 3 and 4.  It seems like mostly just a small matter of programming to teach
> pg_dump to do this, although there are some questions to resolve, mainly
> how we choose the advisory lock keys.  If there are user applications
> running that also use advisory locks, there could be unwanted
> interference.  One easy improvement is to use pg_try_advisory_lock(k) in
> step 2, and just choose a different k if the lock's in use.  Perhaps,
> since we don't expect that the locks would be held long, that's
> sufficient --- but I suspect that users might wish for some pg_dump
> options to restrict the set of keys it could use.

This seems like a pretty significant wart.  I think we probably need a
better solution, but I'm not sure what it is.  I guess we could define
a new lock space that is specifically intended for this kind of
inter-process coordination, where it's expected that the key is a PID.

> Another point is that the whole dance is unnecessary in the normal
> case, so maybe we should only do this if an initial attempt to get
> the lock on table t fails.  However, LOCK TABLE NOWAIT throws an
> error if it can't get the lock, so this would require using a
> subtransaction or starting a whole new transaction in the worker,
> so maybe that's more trouble than it's worth.  Some performance
> testing might be called for.  There's also the point that the code
> path would go almost entirely untested if it's not exercised always.

Seems like it might make sense just to do it always.

> Lastly, we'd probably want both the master and worker processes to
> run with small values of deadlock_timeout, so as to reduce the
> time wasted before the server breaks the deadlock.  Are there any
> downsides to that?  (If so, again maybe it's not worth the trouble,
> since the typical case is that no wait is needed.)

I think we shouldn't do this part.  It's true that reducing
deadlock_timeout prevents time from being wasted if a deadlock occurs,
but that problem is not confined to this case; that's what
deadlock_timeout does in general.  I can't see why we should
substitute our judgement regarding the proper value of
deadlock_timeout for that of the DBA in this one case.  I'm a little
fuzzy-headed at the moment but it seems to me that no deadlock will
occur unless the worker fails to get the lock, which should be rare,
and even then I wonder if we couldn't somehow jigger things so that
the special case in ProcSleep ("Determine where to add myself in the
wait queue.") rescues us.  Even if not, a 1 second delay in a rare
case doesn't seem like a huge problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Idea for fixing parallel pg_dump's lock acquisition problem

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Apr 17, 2019 at 11:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... If there are user applications
>> running that also use advisory locks, there could be unwanted
>> interference.  One easy improvement is to use pg_try_advisory_lock(k) in
>> step 2, and just choose a different k if the lock's in use.  Perhaps,
>> since we don't expect that the locks would be held long, that's
>> sufficient --- but I suspect that users might wish for some pg_dump
>> options to restrict the set of keys it could use.

> This seems like a pretty significant wart.  I think we probably need a
> better solution, but I'm not sure what it is.  I guess we could define
> a new lock space that is specifically intended for this kind of
> inter-process coordination, where it's expected that the key is a PID.

My thought was that we'd like this to work without requiring any new
server-side facilities, so that pg_dump could use it against any server
version that supports parallel dump.  If we're willing to restrict
the fix to server >= v13, or whenever this gets done, then yes we could
(probably) arrange things to avoid the hazard.  I'm not quite sure how
it'd work though.  We can't just invent a process-local key space,
because both the master and worker need to be able to lock the same
key.

            regards, tom lane



Re: Idea for fixing parallel pg_dump's lock acquisition problem

From
Julien Rouhaud
Date:
On Fri, Apr 19, 2019 at 7:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Wed, Apr 17, 2019 at 11:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> ... If there are user applications
> >> running that also use advisory locks, there could be unwanted
> >> interference.  One easy improvement is to use pg_try_advisory_lock(k) in
> >> step 2, and just choose a different k if the lock's in use.  Perhaps,
> >> since we don't expect that the locks would be held long, that's
> >> sufficient --- but I suspect that users might wish for some pg_dump
> >> options to restrict the set of keys it could use.
>
> > This seems like a pretty significant wart.  I think we probably need a
> > better solution, but I'm not sure what it is.  I guess we could define
> > a new lock space that is specifically intended for this kind of
> > inter-process coordination, where it's expected that the key is a PID.
>
> My thought was that we'd like this to work without requiring any new
> server-side facilities, so that pg_dump could use it against any server
> version that supports parallel dump.

Couldn't we use LOCKTAG_USERLOCK for that?  It should be compatible
with all needed server versions, and the odds of collision seem low,
as the extension has been dropped in pg 8.2 and the pgfoundry project
has had no activity since 2006.  I'm not aware of any other extension
using it, and a quick search didn't find anything.



Re: Idea for fixing parallel pg_dump's lock acquisition problem

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Fri, Apr 19, 2019 at 7:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My thought was that we'd like this to work without requiring any new
>> server-side facilities, so that pg_dump could use it against any server
>> version that supports parallel dump.

> Couldn't we use LOCKTAG_USERLOCK for that?

Um, no, because there's no way for pg_dump to get at it in existing
server releases.  The only available feature that supports mid-transaction
unlock is the advisory-lock stuff.

If we have to add new code, we could perfectly well add another
LockTagType to go with it.  But that doesn't really solve the problem.
Whatever SQL API we provide would have to be available to everybody
(since pg_dump doesn't necessarily run as superuser), and as soon as
somebody says "hey that's a neat feature, I think I'll use it in my
app" we're back to square one.  It's not very apparent how we could
have a lock tag that's available to pg_dump processes and nobody else.

I had some vague ideas about making it depend on the processes sharing
a snapshot; but it's not obvious how to get from there to a suitable
locktag, and in any case that certainly wouldn't be pre-existing
server functionality.

            regards, tom lane