Discussion: Stuck in "DELETE waiting"
My application's connections against PostgreSQL 8.1.4 seem to get stuck in deletion operations. Some sample ps output:

postgres 18198 10.5 20.7 1072088 863040 ? S 11:59 14:23 postgres: [...] DELETE waiting
postgres 18204 11.5 20.8 1072692 867708 ? S 11:59 15:43 postgres: [...] DELETE waiting
postgres 18208 14.2 22.3 1071968 928656 ? S 11:59 19:23 postgres: [...] DELETE waiting
postgres 18214  7.5 20.3 1072120 845832 ? S 11:59 10:12 postgres: [...] DELETE waiting
postgres 18216 12.8 23.5 1072000 977688 ? S 12:00 17:26 postgres: [...] DELETE waiting

The processes have these locks:

# select * from pg_locks where pid in (18198, 18204, 18208, 18214, 18216);
   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction |  pid  |       mode       | granted
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+------------------+---------
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989801133 | 18214 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989801133 | 18214 | RowExclusiveLock | t
 transactionid |          |          |      |       |    2989710024 |         |       |          |  2989710024 | 18204 | ExclusiveLock    | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2991168469 | 18198 | AccessShareLock  | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2989804263 | 18216 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990523423 | 18208 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990523423 | 18208 | RowExclusiveLock | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989804263 | 18216 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989804263 | 18216 | RowExclusiveLock | t
 transactionid |          |          |      |       |    2989544980 |         |       |          |  2989710024 | 18204 | ShareLock        | f
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2989710024 | 18204 | ExclusiveLock    | t
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2990523423 | 18208 | ExclusiveLock    | f
 transactionid |          |          |      |       |    2990486433 |         |       |          |  2991168469 | 18198 | ShareLock        | f
 transactionid |          |          |      |       |    2989804263 |         |       |          |  2989804263 | 18216 | ExclusiveLock    | t
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2989801133 | 18214 | ExclusiveLock    | f
 transactionid |          |          |      |       |    2991168469 |         |       |          |  2991168469 | 18198 | ExclusiveLock    | t
 transactionid |          |          |      |       |    2989801133 |         |       |          |  2989801133 | 18214 | ExclusiveLock    | t
 tuple         |  1231506 |  1231625 | 2148 |    27 |               |         |       |          |  2991168469 | 18198 | ExclusiveLock    | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2991168469 | 18198 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2991168469 | 18198 | RowExclusiveLock | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2990523423 | 18208 | AccessShareLock  | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2989801133 | 18214 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989710024 | 18204 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989710024 | 18204 | RowExclusiveLock | t
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2989804263 | 18216 | ExclusiveLock    | f
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2989710024 | 18204 | AccessShareLock  | t
 transactionid |          |          |      |       |    2990523423 |         |       |          |  2990523423 | 18208 | ExclusiveLock    | t
(27 rows)

They all seem to be trying to get exclusive locks on tuples on page 1607; one of those locks was granted to process 18204 but never relinquished. If I kill the application, the postgres processes stick around forever until I restart the postmaster process. My deadlock_timeout setting is at the default.

Alexander.
Alexander Staubo <alex@purefiction.net> writes:
> My application's connections against PostgreSQL 8.1.4 seem to get
> stuck in deletion operations.
> ...
> # select * from pg_locks where pid in (18198, 18204, 18208, 18214,
> 18216);

You really need to show all of pg_locks, not a subset. In this example it appears that 18204 and 18198 are waiting for transactions that are not part of this subset, so it's hard to say what's happening.

It could be that you are getting bitten by this recently identified bug:

http://archives.postgresql.org/pgsql-general/2006-11/msg01004.php
http://archives.postgresql.org/pgsql-committers/2006-11/msg00121.php

but the symptoms known so far for that involve unexpected "deadlock detected" errors, not silent hangups.

			regards, tom lane
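[For readers following along: a full dump of the kind Tom asks for can be produced with something along these lines. This query is a sketch, not from the original thread; the column names match the 8.1-era pg_locks view, and the ORDER BY is merely for readability.]

```sql
-- Dump every lock in the cluster, ungranted (waiting) entries first,
-- so that blocked backends surface at the top of the output.
SELECT locktype, relation, page, tuple, transactionid,
       transaction, pid, mode, granted
  FROM pg_locks
 ORDER BY granted, pid;
```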
On Nov 23, 2006, at 16:27 , Tom Lane wrote:

> Alexander Staubo <alex@purefiction.net> writes:
>> My application's connections against PostgreSQL 8.1.4 seem to get
>> stuck in deletion operations.
>> ...
>> # select * from pg_locks where pid in (18198, 18204, 18208, 18214,
>> 18216);
>
> You really need to show all of pg_locks, not a subset. In this example
> it appears that 18204 and 18198 are waiting for transactions that are
> not part of this subset, so it's hard to say what's happening.

See below. Seems 18204 is waiting for a process that is "idle in transaction"; makes me wish that PostgreSQL could export more information about the age of in-progress transactions. I am turning on logging so I can determine what a future hanging process is doing.

# select * from pg_locks;
   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction |  pid  |       mode       | granted
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+------------------+---------
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989801133 | 18214 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989801133 | 18214 | RowExclusiveLock | t
 transactionid |          |          |      |       |    2991244269 |         |       |          |  2991244269 | 22971 | ExclusiveLock    | t
 transactionid |          |          |      |       |    2990486433 |         |       |          |  2990486433 | 18207 | ExclusiveLock    | t
 relation      |  1231506 |  1231650 |      |       |               |         |       |          |  2989544980 | 18202 | AccessShareLock  | t
 relation      |  1231506 |  1231594 |      |       |               |         |       |          |  2991244269 | 22971 | AccessShareLock  | t
 transactionid |          |          |      |       |    2989710024 |         |       |          |  2989710024 | 18204 | ExclusiveLock    | t
 relation      |  1231506 |  1231864 |      |       |               |         |       |          |  2990486433 | 18207 | AccessShareLock  | t
 relation      |  1231506 |    10342 |      |       |               |         |       |          |  2991277798 | 23406 | AccessShareLock  | t
 relation      |  1231506 |  1231864 |      |       |               |         |       |          |  2991277508 | 22991 | AccessShareLock  | t
 relation      |  1231506 |  1231864 |      |       |               |         |       |          |  2991277508 | 22991 | RowExclusiveLock | t
 transactionid |          |          |      |       |    2991277508 |         |       |          |  2991277508 | 22991 | ExclusiveLock    | t
 relation      |  1231506 |  1231571 |      |       |               |         |       |          |  2991244269 | 22971 | AccessShareLock  | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2991168469 | 18198 | AccessShareLock  | t
 relation      |  1231506 |  1231650 |      |       |               |         |       |          |  2990741550 | 18199 | AccessShareLock  | t
 relation      |  1231506 |  1231650 |      |       |               |         |       |          |  2990486433 | 18207 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989544980 | 18202 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989544980 | 18202 | RowExclusiveLock | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2989804263 | 18216 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990523423 | 18208 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990523423 | 18208 | RowExclusiveLock | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989804263 | 18216 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989804263 | 18216 | RowExclusiveLock | t
 transactionid |          |          |      |       |    2989544980 |         |       |          |  2989710024 | 18204 | ShareLock        | f
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2989710024 | 18204 | ExclusiveLock    | t
 relation      |  1231506 |  1231864 |      |       |               |         |       |          |  2990741550 | 18199 | AccessShareLock  | t
 relation      |  1231506 |  1231864 |      |       |               |         |       |          |  2989544980 | 18202 | AccessShareLock  | t
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2990523423 | 18208 | ExclusiveLock    | f
 transactionid |          |          |      |       |    2990486433 |         |       |          |  2991168469 | 18198 | ShareLock        | f
 transactionid |          |          |      |       |    2989804263 |         |       |          |  2989804263 | 18216 | ExclusiveLock    | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990741550 | 18199 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990741550 | 18199 | RowExclusiveLock | t
 relation      |  1231506 |  1231678 |      |       |               |         |       |          |  2991244269 | 22971 | AccessShareLock  | t
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2989801133 | 18214 | ExclusiveLock    | f
 transactionid |          |          |      |       |    2990741550 |         |       |          |  2990741550 | 18199 | ExclusiveLock    | t
 transactionid |          |          |      |       |    2991168469 |         |       |          |  2991168469 | 18198 | ExclusiveLock    | t
 relation      |  1231506 |  1231864 |      |       |               |         |       |          |  2991244269 | 22971 | AccessShareLock  | t
 relation      |  1231506 |  1231864 |      |       |               |         |       |          |  2991244269 | 22971 | RowShareLock     | t
 relation      |  1231506 |  1231596 |      |       |               |         |       |          |  2991244269 | 22971 | RowExclusiveLock | t
 transactionid |          |          |      |       |    2989801133 |         |       |          |  2989801133 | 18214 | ExclusiveLock    | t
 tuple         |  1231506 |  1231625 | 2148 |    27 |               |         |       |          |  2991168469 | 18198 | ExclusiveLock    | t
 transactionid |          |          |      |       |    2989544980 |         |       |          |  2989544980 | 18202 | ExclusiveLock    | t
 transactionid |          |          |      |       |    2991277798 |         |       |          |  2991277798 | 23406 | ExclusiveLock    | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2991168469 | 18198 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2991168469 | 18198 | RowExclusiveLock | t
 relation      |  1231506 |  1231644 |      |       |               |         |       |          |  2989544980 | 18202 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990486433 | 18207 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2990486433 | 18207 | RowExclusiveLock | t
 relation      |  1231506 |  1231644 |      |       |               |         |       |          |  2990486433 | 18207 | AccessShareLock  | t
 relation      |  1231506 |  1231644 |      |       |               |         |       |          |  2990741550 | 18199 | AccessShareLock  | t
 relation      |  1231506 |  1231580 |      |       |               |         |       |          |  2991244269 | 22971 | AccessShareLock  | t
 relation      |  1231506 |  1231668 |      |       |               |         |       |          |  2991244269 | 22971 | AccessShareLock  | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2990523423 | 18208 | AccessShareLock  | t
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2989801133 | 18214 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989710024 | 18204 | AccessShareLock  | t
 relation      |  1231506 |  1231625 |      |       |               |         |       |          |  2989710024 | 18204 | RowExclusiveLock | t
 tuple         |  1231506 |  1231625 | 1607 |    63 |               |         |       |          |  2989804263 | 18216 | ExclusiveLock    | f
 relation      |  1231506 |  2840720 |      |       |               |         |       |          |  2989710024 | 18204 | AccessShareLock  | t
 transactionid |          |          |      |       |    2990523423 |         |       |          |  2990523423 | 18208 | ExclusiveLock    | t
(59 rows)

Alexander.
Alexander Staubo <alex@purefiction.net> writes:
> See below. Seems 18204 is waiting for a process that is "idle in
> transaction"; makes me wish that PostgreSQL could export more
> information about the age of in-progress transactions. I am turning
> on logging so I can determine what a future hanging process is doing.

Yeah, it looks like PIDs 18202 and 18207 are the bottlenecks, and they are not waiting on locks.

			regards, tom lane
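[The deduction Tom makes here can be reproduced mechanically: a backend waiting on a transactionid lock is blocked by whichever backend holds the granted ExclusiveLock on that same transaction ID. The following self-join is a sketch against the 8.1-era pg_locks view, not from the original thread. Applied to the listing above, it would pair waiter 18204 with holder 18202, and waiter 18198 with holder 18207.]

```sql
-- Pair each backend waiting on a transaction with the backend that
-- owns that transaction; the owner is the blocker.
SELECT w.pid AS waiting_pid,
       h.pid AS blocking_pid,
       w.transactionid
  FROM pg_locks w
  JOIN pg_locks h
    ON h.transactionid = w.transactionid
   AND h.granted
   AND h.mode = 'ExclusiveLock'
 WHERE NOT w.granted
   AND w.locktype = 'transactionid';
```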
Alexander Staubo <alex@purefiction.net> writes:
> On Nov 23, 2006, at 16:27 , Tom Lane wrote:
>
> > Alexander Staubo <alex@purefiction.net> writes:
> >> My application's connections against PostgreSQL 8.1.4 seem to get
> >> stuck in deletion operations.
> >> ...
> >> # select * from pg_locks where pid in (18198, 18204, 18208, 18214,
> >> 18216);
> >
> > You really need to show all of pg_locks, not a subset. In this example
> > it appears that 18204 and 18198 are waiting for transactions that are
> > not part of this subset, so it's hard to say what's happening.
>
> See below. Seems 18204 is waiting for a process that is "idle in
> transaction"; makes me wish that PostgreSQL could export more
> information about the age of in-progress transactions. I am turning
> on logging so I can determine what a future hanging process is doing.

Have a look at the query_start field in pg_stat_activity for the process that is holding locks and causing the backlog. If you've got an old idle transaction that did an update or delete on some rows, perhaps terminate those backends after some grace period.

Clever joins over pg_locks and pg_stat_activity can show just about everything that's generally needed for investigating MVCC problems.

-- 
-------------------------------------------------------------------------------
Jerry Sievers   305 854-3001 (home)     Production Database Administrator
                305 321-1144 (mobile)   WWW E-Commerce Consultant
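[The kind of join Jerry describes might look like the following on 8.1. This is a sketch, not from the original thread; note that pg_stat_activity's pid column was still called procpid in that era, and current_query/query_start are only populated when command-string statistics are enabled.]

```sql
-- Show each lock together with what the owning backend is doing
-- and when its current query started; waiters sort first.
SELECT l.pid, l.locktype, l.mode, l.granted,
       a.current_query, a.query_start
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.procpid = l.pid
 ORDER BY l.granted, a.query_start;
```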
On Nov 24, 2006, at 01:37 , Jerry Sievers wrote:

> Have a look at the query_start field in pg_stat_activity for the
> process holding locks that's causing backlog.

Doesn't this require "stats_command_string = on"? I don't have that enabled on production servers.

Alexander.
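[For reference, the relevant postgresql.conf settings on 8.1 would be roughly the following. This is a sketch; stats_start_collector defaults to on, and enabling the command string adds some per-query reporting overhead.]

```
stats_start_collector = on
stats_command_string  = on    # needed for current_query/query_start
                              # in pg_stat_activity
```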