Обсуждение: locking question - why is this not a deadlock?

Поиск
Список
Период
Сортировка

locking question - why is this not a deadlock?

От
peter royal
Дата:
i am using PostgreSQL 7.4.1 (i am aware of my need to upgrade :)

i have a situation where i *believe* postgresql should be detecting a
deadlock, but it is not. in the locks view you see below, all of the
processes waiting on the 'numberfactory' table are blocked. (which
smells like a deadlock to me). i killed pid 6829, and then i
immediately saw the log entries that are at the bottom of this
message regarding deadlocks amongst the remaining locks. pid 6829 was
in that state of locks for close to an hour.

* is this a bug?
* if so, is this a known bug? fixed in the latest 7.4 release or
would i have to migrate to 8.0?

statement used to view locks:

select ps.procpid, ps.query_start, ps.current_query, pc.relname,
pl.mode, pl.granted from pg_locks pl left join pg_class pc on
pl.relation = pc.oid, pg_stat_activity ps where pl.pid = ps.procpid
order by ps.query_start;

(I apologize for the wrapping, its also at http://pace2020.com/
~proyal/locks.html)

procpid |          query_start          |
current_query                        |     relname      |
mode       | granted
---------+-------------------------------
+-------------------------------------------------------------
+------------------+------------------+---------
     6829 | 2005-06-07 08:42:30.303431-04 | UPDATE numberfactory SET
synjob1=78609 WHERE (synautoinc=1) | numberfactory    |
AccessShareLock  | t
     6829 | 2005-06-07 08:42:30.303431-04 | UPDATE numberfactory SET
synjob1=78609 WHERE (synautoinc=1) | numberfactory    |
RowExclusiveLock | t
     6829 | 2005-06-07 08:42:30.303431-04 | UPDATE numberfactory SET
synjob1=78609 WHERE (synautoinc=1) | csr              |
AccessShareLock  | t
     6829 | 2005-06-07 08:42:30.303431-04 | UPDATE numberfactory SET
synjob1=78609 WHERE (synautoinc=1) |                  |
ExclusiveLock    | t
     6829 | 2005-06-07 08:42:30.303431-04 | UPDATE numberfactory SET
synjob1=78609 WHERE (synautoinc=1) |                  |
ShareLock        | f
     6829 | 2005-06-07 08:42:30.303431-04 | UPDATE numberfactory SET
synjob1=78609 WHERE (synautoinc=1) | salesperson      |
AccessShareLock  | t
     6829 | 2005-06-07 08:42:30.303431-04 | UPDATE numberfactory SET
synjob1=78609 WHERE (synautoinc=1) | jobbillingsetup  |
AccessShareLock  | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | job
| AccessShareLock  | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | jobtype
| AccessShareLock  | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | jobbillingsetup
| AccessShareLock  | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | jobstatus
| AccessShareLock  | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | salesperson
| AccessShareLock  | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       |
| ExclusiveLock    | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | numberfactory
| AccessShareLock  | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | numberfactory
| RowExclusiveLock | t
     6856 | 2005-06-07 08:42:36.770765-04 | <IDLE> in
transaction                                       | csr
| AccessShareLock  | t
     6826 | 2005-06-07 08:42:36.787609-04 | <IDLE> in
transaction                                       | salesperson
| AccessShareLock  | t
     6826 | 2005-06-07 08:42:36.787609-04 | <IDLE> in
transaction                                       | shipvia
| AccessShareLock  | t
     6826 | 2005-06-07 08:42:36.787609-04 | <IDLE> in
transaction                                       | pricelist
| AccessShareLock  | t
     6826 | 2005-06-07 08:42:36.787609-04 | <IDLE> in
transaction                                       |
| ExclusiveLock    | t
     7803 | 2005-06-07 08:45:12.838824-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ShareLock        | f
     7803 | 2005-06-07 08:45:12.838824-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
AccessShareLock  | t
     7803 | 2005-06-07 08:45:12.838824-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
RowExclusiveLock | t
     7803 | 2005-06-07 08:45:12.838824-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ExclusiveLock    | t
     6857 | 2005-06-07 08:50:13.78227-04  | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ExclusiveLock    | t
     6857 | 2005-06-07 08:50:13.78227-04  | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
AccessShareLock  | t
     6857 | 2005-06-07 08:50:13.78227-04  | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
RowExclusiveLock | t
     6857 | 2005-06-07 08:50:13.78227-04  | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ShareLock        | f
     5234 | 2005-06-07 08:51:45.638224-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ShareLock        | f
     5234 | 2005-06-07 08:51:45.638224-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
AccessShareLock  | t
     5234 | 2005-06-07 08:51:45.638224-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
RowExclusiveLock | t
     5234 | 2005-06-07 08:51:45.638224-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ExclusiveLock    | t
     4157 | 2005-06-07 08:55:45.251806-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ExclusiveLock    | t
     4157 | 2005-06-07 08:55:45.251806-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
AccessShareLock  | t
     4157 | 2005-06-07 08:55:45.251806-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) | numberfactory    |
RowExclusiveLock | t
     4157 | 2005-06-07 08:55:45.251806-04 | UPDATE numberfactory SET
synjob1=78610 WHERE (synautoinc=1) |                  |
ShareLock        | f
     5774 | 2005-06-07 08:59:44.656757-04 | <IDLE> in
transaction                                       |
| ExclusiveLock    | t
     5774 | 2005-06-07 08:59:44.656757-04 | <IDLE> in
transaction                                       | salesperson
| AccessShareLock  | t
    10584 | 2005-06-07 09:34:29.612307-04 |
<IDLE>
|                  | ExclusiveLock    | t
    10584 | 2005-06-07 09:34:29.612307-04 |
<IDLE>                                                      |
pg_stat_activity | AccessShareLock  | t
    10584 | 2005-06-07 09:34:29.612307-04 |
<IDLE>                                                      |
pg_shadow        | AccessShareLock  | t
    10584 | 2005-06-07 09:34:29.612307-04 |
<IDLE>                                                      |
pg_database      | AccessShareLock  | t
    10584 | 2005-06-07 09:34:29.612307-04 |
<IDLE>                                                      |
pg_locks         | AccessShareLock  | t
    10584 | 2005-06-07 09:34:29.612307-04 |
<IDLE>                                                      |
pg_class         | AccessShareLock  | t
(44 rows)

deadlocks detected after i killed pid 6829:

Jun  7 09:39:09 epace postgres[6857]: [4-1] ERROR:  deadlock detected
Jun  7 09:39:09 epace postgres[6857]: [4-2] DETAIL:  Process 6857
waits for ShareLock on transaction 121712126; blocked by process 5234.
Jun  7 09:39:09 epace postgres[6857]: [4-3] ^IProcess 5234 waits for
ShareLock on transaction 121712121; blocked by process 6857.
Jun  7 09:39:10 epace postgres[6822]: [4-1] ERROR:  deadlock detected
Jun  7 09:39:10 epace postgres[6822]: [4-2] DETAIL:  Process 6822
waits for ShareLock on transaction 121712345; blocked by process 11650.
Jun  7 09:39:10 epace postgres[6822]: [4-3] ^IProcess 11650 waits for
ShareLock on transaction 121712143; blocked by process 6822.
Jun  7 09:39:11 epace postgres[4157]: [5-1] ERROR:  deadlock detected
Jun  7 09:39:11 epace postgres[4157]: [5-2] DETAIL:  Process 4157
waits for ShareLock on transaction 121712346; blocked by process 6826.
Jun  7 09:39:11 epace postgres[4157]: [5-3] ^IProcess 6826 waits for
ShareLock on transaction 121712145; blocked by process 4157.
Jun  7 09:39:12 epace postgres[9723]: [4-1] ERROR:  deadlock detected
Jun  7 09:39:12 epace postgres[9723]: [4-2] DETAIL:  Process 9723
waits for ShareLock on transaction 121712376; blocked by process 11659.
Jun  7 09:39:12 epace postgres[9723]: [4-3] ^IProcess 11659 waits for
ShareLock on transaction 121712371; blocked by process 9723.
Jun  7 09:39:13 epace postgres[6856]: [5-1] ERROR:  deadlock detected
Jun  7 09:39:13 epace postgres[6856]: [5-2] DETAIL:  Process 6856
waits for ShareLock on transaction 121712353; blocked by process 5774.
Jun  7 09:39:13 epace postgres[6856]: [5-3] ^IProcess 5774 waits for
ShareLock on transaction 121712514; blocked by process 6856.

thanks for the assistance.
-pete

--
peter royal -> proyal@pace2020.com


Вложения

Re: locking question - why is this not a deadlock?

От
Alvaro Herrera
Дата:
On Tue, Jun 07, 2005 at 10:26:00AM -0400, peter royal wrote:
> i am using PostgreSQL 7.4.1 (i am aware of my need to upgrade :)
>
> i have a situation where i *believe* postgresql should be detecting a
> deadlock, but it is not. in the locks view you see below, all of the
> processes waiting on the 'numberfactory' table are blocked. (which
> smells like a deadlock to me). i killed pid 6829, and then i
> immediately saw the log entries that are at the bottom of this
> message regarding deadlocks amongst the remaining locks. pid 6829 was
> in that state of locks for close to an hour.
>
> * is this a bug?
> * if so, is this a known bug? fixed in the latest 7.4 release or
> would i have to migrate to 8.0?

I don't think it's a bug, nor a deadlock situation.  The problem is the
"idle in transaction" server process, which holds some lock but isn't
doing anything useful with it.  Probably work would continue if the
transaction was closed.

This view is incomplete anyway, because you left out the xid column from
the pg_locks view, which point out exactly to the locks that are
blocking the other processes.

I believe this problem wouldn't ocurr (or would be resolved in a different
manner) in 8.1.  I'm too lazy to replicate your scenario to check though ...

> (I apologize for the wrapping, its also at http://pace2020.com/
> ~proyal/locks.html)

The <IDLE> tags display funny there, but it's definitely easier to read.

> deadlocks detected after i killed pid 6829:
>
> Jun  7 09:39:09 epace postgres[6857]: [4-1] ERROR:  deadlock detected
> Jun  7 09:39:09 epace postgres[6857]: [4-2] DETAIL:  Process 6857
> waits for ShareLock on transaction 121712126; blocked by process 5234.

Hmm, I guess this could be detected if the deadlock detection code was a
lot more complicated than it already is, but probably it won't matter on
8.1.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"¿Cómo puedes confiar en algo que pagas y que no ves,
y no confiar en algo que te dan y te lo muestran?" (Germán Poo)

Re: locking question - why is this not a deadlock?

От
peter royal
Дата:
On Jun 7, 2005, at 1:15 PM, Alvaro Herrera wrote:
> I don't think it's a bug, nor a deadlock situation.  The problem is
> the
> "idle in transaction" server process, which holds some lock but isn't
> doing anything useful with it.  Probably work would continue if the
> transaction was closed.
>
> This view is incomplete anyway, because you left out the xid column
> from
> the pg_locks view, which point out exactly to the locks that are
> blocking the other processes.
>
> I believe this problem wouldn't ocurr (or would be resolved in a
> different
> manner) in 8.1.  I'm too lazy to replicate your scenario to check
> though ...

ah, thanks for the suggestion on what else to look for (first time
debugging something like this in postgresql). i'll be better prepared
to debug when it occurs again.

the 'idle in transaction' is a (mis) feature of the JDBC driver that
version of my software is using. (something that is fixed in the 8.0
JDBC driver thankfully, I'll likely start using that in the older
version of my sw to help alleviate this)

thanks again!
-pete

--
peter royal -> proyal@pace2020.com


Вложения