Обсуждение: reporting reason for certain locks

Поиск
Список
Период
Сортировка

reporting reason for certain locks

От
Alvaro Herrera
Дата:
Hi,

When we lock on a Xid or VirtualXid, there's no way to obtain clear
information on the reason for locking.  Consider the following example:

CREATE TABLE foo (a int);

Session 1:
BEGIN;
SELECT 1;
-- we now have a snapshot

Session 2:
CREATE INDEX CONCURRENTLY foo_a ON foo(a);

This blocks until transaction 1 commits, and it's not obvious to the
user the reason for this.  There's some info in pg_locks but it just
says it's blocked in a VirtualXid.

A much more common ocurrence is tuple locks.  We block in an Xid in that
case; and this has been a frequent question in the mailing lists and
IRC.

I think it would be very nice to be able to report something to the
user; however, I'm not seeing the mechanism.

A simple idea I had was that each backend would have a reserved shared
memory area where they would write what they are about to lock, when
locking an Xid or VXid.  Thus, if they block, someone else can examine
that and make the situation clearer.  The problem with this idea is that
it would require locking a LWLock just before trying each lock on
Xid/VXid, which would be horrible for performance.

... or maybe not, because when we call XactLockTableWait, we've already
established that we've accepted to sleep.

Thoughts?

-- 
Álvaro Herrera <alvherre@alvh.no-ip.org>


Re: reporting reason for certain locks

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> A much more common ocurrence is tuple locks.  We block in an Xid in that
> case; and this has been a frequent question in the mailing lists and
> IRC.

> I think it would be very nice to be able to report something to the
> user; however, I'm not seeing the mechanism.

At least for tuple locks, the information is already visible, because we
have a "real" lock on the target tuple before we try to lock the current
holder's VXID.  So I think this isn't so much a question of needing more
low-level mechanism as one of providing a more useful view --- some kind
of self-join on pg_locks is needed.
        regards, tom lane


Re: reporting reason for certain locks

От
Josh Berkus
Дата:
> ... or maybe not, because when we call XactLockTableWait, we've already
> established that we've accepted to sleep.
> 
> Thoughts?

Other than this being a sincere need, no.


--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: reporting reason for certain locks

От
Robert Haas
Дата:
On Mon, Nov 22, 2010 at 5:55 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Hi,
>
> When we lock on a Xid or VirtualXid, there's no way to obtain clear
> information on the reason for locking.  Consider the following example:
>
> CREATE TABLE foo (a int);
>
> Session 1:
> BEGIN;
> SELECT 1;
> -- we now have a snapshot
>
> Session 2:
> CREATE INDEX CONCURRENTLY foo_a ON foo(a);
>
> This blocks until transaction 1 commits, and it's not obvious to the
> user the reason for this.  There's some info in pg_locks but it just
> says it's blocked in a VirtualXid.
>
> A much more common ocurrence is tuple locks.  We block in an Xid in that
> case; and this has been a frequent question in the mailing lists and
> IRC.
>
> I think it would be very nice to be able to report something to the
> user; however, I'm not seeing the mechanism.
>
> A simple idea I had was that each backend would have a reserved shared
> memory area where they would write what they are about to lock, when
> locking an Xid or VXid.  Thus, if they block, someone else can examine
> that and make the situation clearer.  The problem with this idea is that
> it would require locking a LWLock just before trying each lock on
> Xid/VXid, which would be horrible for performance.
>
> ... or maybe not, because when we call XactLockTableWait, we've already
> established that we've accepted to sleep.
>
> Thoughts?

How about publishing additional details to pg_stat_activity via
pgstat_report_waiting()?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: reporting reason for certain locks

От
Alvaro Herrera
Дата:
Excerpts from Robert Haas's message of mar nov 23 00:08:54 -0300 2010:

> How about publishing additional details to pg_stat_activity via
> pgstat_report_waiting()?

I'm not sure what you mean here.  Are you suggesting we should create a
new function with that name to report the reason for the lock?

If so, that's great -- but we'd be getting a bit ahead of ourselves.
Because while we do have the locking details available some of the time,
it's not easy to figure out what they are; and the rest of the time, it
just isn't available.

So I'm proposing a mechanism for this information to be available in the
first place.  When this is done we can talk about some nice user
interface for it.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: reporting reason for certain locks

От
Alvaro Herrera
Дата:
Excerpts from Tom Lane's message of lun nov 22 20:51:09 -0300 2010:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > A much more common ocurrence is tuple locks.  We block in an Xid in that
> > case; and this has been a frequent question in the mailing lists and
> > IRC.
> 
> > I think it would be very nice to be able to report something to the
> > user; however, I'm not seeing the mechanism.
> 
> At least for tuple locks, the information is already visible, because we
> have a "real" lock on the target tuple before we try to lock the current
> holder's VXID.  So I think this isn't so much a question of needing more
> low-level mechanism as one of providing a more useful view --- some kind
> of self-join on pg_locks is needed.

Hmm, that's true, but it seems ugly: if we are blocking on a
transactionid, then go back to pg_locks and extract a lock of type
"tuple"; if it's there, you know you're waiting for that; if it's not,
you have to guess that you're waiting on something else (what?).
(Right now, it seems the only other thing that could wait is CREATE
INDEX CONCURRENTLY, but I don't want to bet that we're not going to
create something else in the future.  There's no way to figure out
what's happening from pg_locks, in any case.)

So what I want is something a bit more trustworthy than that.

On the other hand, pg_locks is already rather unwieldy to use.  We
already have a self-join that tells us the details of what's locking
processes: you need to join pg_locks like this:

FROM   pg_catalog.pg_locks l1
JOIN   pg_catalog.pg_locks l2 ON (       (           l1.locktype, l1.database, l1.relation, l1.page,
l1.tuple,l1.virtualxid, l1.transactionid, l1.classid,           l1.objid, l1.objsubid       )   IS NOT DISTINCT FROM
  (           l2.locktype, l2.database, l2.relation, l2.page,           l2.tuple, l2.virtualxid, l2.transactionid,
l2.classid,          l2.objid, l2.objsubid       )   )
 

and throw in a bunch of left joins to see the details of database,
relation, etc.  This works fine for all kinds of locks except xid and
vxid ones.  I don't think it's fair to users to expect that they need to 
deal with that mess *plus* the details of tuple locks.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: reporting reason for certain locks

От
Robert Haas
Дата:
On Thu, Nov 25, 2010 at 9:00 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Robert Haas's message of mar nov 23 00:08:54 -0300 2010:
>
>> How about publishing additional details to pg_stat_activity via
>> pgstat_report_waiting()?
>
> I'm not sure what you mean here.  Are you suggesting we should create a
> new function with that name to report the reason for the lock?

No, what I was suggesting was taking the existing function:

extern void pgstat_report_waiting(bool waiting);

...and instead doing something like this:

extern void pgstat_report_waiting(char *reason);

...and then arrange to pass the reason via the eponymous argument.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: reporting reason for certain locks

От
Alvaro Herrera
Дата:
Excerpts from Robert Haas's message of jue nov 25 11:56:27 -0300 2010:

> No, what I was suggesting was taking the existing function:
> 
> extern void pgstat_report_waiting(bool waiting);
> 
> ...and instead doing something like this:
> 
> extern void pgstat_report_waiting(char *reason);
> 
> ...and then arrange to pass the reason via the eponymous argument.

Ah, I was looking for a SQL-callable function :-)  Hmm, maybe this would
work.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: reporting reason for certain locks

От
Robert Haas
Дата:
On Thu, Nov 25, 2010 at 10:05 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Robert Haas's message of jue nov 25 11:56:27 -0300 2010:
>
>> No, what I was suggesting was taking the existing function:
>>
>> extern void pgstat_report_waiting(bool waiting);
>>
>> ...and instead doing something like this:
>>
>> extern void pgstat_report_waiting(char *reason);
>>
>> ...and then arrange to pass the reason via the eponymous argument.
>
> Ah, I was looking for a SQL-callable function :-)  Hmm, maybe this would
> work.

I was just thinking it might be a good place for it because we're
already doing some nontrivial work at that point anyway.  Although,
lock acquisition (especially of AccessShareLocks) is already a bit of
a hotspot, so we definitely have to be careful what we put in there no
matter how we do it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: reporting reason for certain locks

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> On the other hand, pg_locks is already rather unwieldy to use.  We
> already have a self-join that tells us the details of what's locking
> processes: you need to join pg_locks like this:
> ...
> and throw in a bunch of left joins to see the details of database,
> relation, etc.

Sure.  I'm just suggesting one more left join to see if there's a tuple
lock.

> This works fine for all kinds of locks except xid and
> vxid ones.  I don't think it's fair to users to expect that they need to 
> deal with that mess *plus* the details of tuple locks.

Well, what was in the back of my mind was that we should create a join
of this sort as a stock system view, which would certainly improve
usability across the board.  Getting to consensus on exactly what the
view should contain might be hard though.
        regards, tom lane


Re: reporting reason for certain locks

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> No, what I was suggesting was taking the existing function:
> extern void pgstat_report_waiting(bool waiting);
> ...and instead doing something like this:
> extern void pgstat_report_waiting(char *reason);
> ...and then arrange to pass the reason via the eponymous argument.

The question is how many cycles are we willing to expend on preparing a
reason string that (in approximately 99.9% of the calls) will not be
of any use.  It would be much better to avoid doing this and instead
expend the extra work on the inspection side.
        regards, tom lane


Re: reporting reason for certain locks

От
Alvaro Herrera
Дата:
Excerpts from Tom Lane's message of jue nov 25 13:23:42 -0300 2010:
> Robert Haas <robertmhaas@gmail.com> writes:
> > No, what I was suggesting was taking the existing function:
> > extern void pgstat_report_waiting(bool waiting);
> > ...and instead doing something like this:
> > extern void pgstat_report_waiting(char *reason);
> > ...and then arrange to pass the reason via the eponymous argument.
> 
> The question is how many cycles are we willing to expend on preparing a
> reason string that (in approximately 99.9% of the calls) will not be
> of any use.  It would be much better to avoid doing this and instead
> expend the extra work on the inspection side.

I'm all for making this cheap -- and your proposal works for tuple
locks (ugly however it may be).  But it doesn't work for "snapshot"
locks such as the ones CREATE INDEX CONCURRENTLY takes.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: reporting reason for certain locks

От
Robert Haas
Дата:
On Thu, Nov 25, 2010 at 11:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> No, what I was suggesting was taking the existing function:
>> extern void pgstat_report_waiting(bool waiting);
>> ...and instead doing something like this:
>> extern void pgstat_report_waiting(char *reason);
>> ...and then arrange to pass the reason via the eponymous argument.
>
> The question is how many cycles are we willing to expend on preparing a
> reason string that (in approximately 99.9% of the calls) will not be
> of any use.  It would be much better to avoid doing this and instead
> expend the extra work on the inspection side.

I'd much rather have the information be pulled than pushed, if there's
a way to make that work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company