Обсуждение: lock AccessShareLock on object 0/1260/0 is already held
Hello What is state of this issue? http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php We detect this issue in our servers - we use 9.1.6 on Linux locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid| mode | granted ----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------relation | 5630403 | 1259 | | | | | | | | 20/0 | 14960 | AccessShareLock | t Interesting of this issue - it is blocks DBI login Exception thrown: DBI connect('database=xxx;host=yyy;port=5432','username',...) failed: FATAL: lock AccessShareLock on object 5630403/1259/0 is already held#012 but I am able to login to this database from command line without problems There is unfriendly error message - I cannot get information about statement that exactly raise this fatal state. The most old message that I found is Dec 10 15:01:51 cl-pdwh-d01 postgres[32548]: [6-1] (2012-12-10 15:01:51 CET db_hw7ny3roi257xe886fvc5krqgosp6g23) Dec 10 15:01:51 cl-pdwh-d01 postgres[32548]: [6-2] ERROR: canceling statement due to user request Dec 10 15:01:51 cl-pdwh-d01 postgres[32738]: [6-1] (2012-12-10 15:01:51 CET db_hw7ny3roi257xe886fvc5krqgosp6g23) Dec 10 15:01:51 cl-pdwh-d01 postgres[32738]: [6-2] FATAL: lock AccessShareLock on object 5630403/1259/0 is already held Dec 10 15:01:51 cl-pdwh-d01 postgres[32739]: [6-1] (2012-12-10 15:01:51 CET db_hw7ny3roi257xe886fvc5krqgosp6g23) Dec 10 15:01:51 cl-pdwh-d01 postgres[32739]: [6-2] FATAL: lock AccessShareLock on object 5630403/1259/0 is already held Dec 10 15:02:29 cl-pdwh-d01 postgres[957]: [6-1] (2012-12-10 15:02:29 CET db_hw7ny3roi257xe886fvc5krqgosp6g23) Dec 10 15:02:29 cl-pdwh-d01 postgres[957]: [6-2] FATAL: lock AccessShareLock on object 5630403/1259/0 is already held Regards Pavel Stehule
Pavel Stehule <pavel.stehule@gmail.com> writes: > What is state of this issue? > http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php AFAICS we never identified the cause. It was pretty clear that something was failing to clean up shared-memory lock data structures, but not what that something was. The last productive suggestion was to add process-exit-time logging of whether unreleased locks remain, but if Dave ever did that, he didn't report back what he found. Maybe you could add such logging on your machines. regards, tom lane
2013/1/4 Tom Lane <tgl@sss.pgh.pa.us>: > Pavel Stehule <pavel.stehule@gmail.com> writes: >> What is state of this issue? >> http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php > > AFAICS we never identified the cause. It was pretty clear that > something was failing to clean up shared-memory lock data structures, > but not what that something was. The last productive suggestion was > to add process-exit-time logging of whether unreleased locks remain, > but if Dave ever did that, he didn't report back what he found. probably I am able to find statement, that was canceled as last "success" statement from our application logs. And probably it will be LOCK ... or CREATE TABLE AS SELECT Recheck on session end can help us to drop this leaked locks, but I don't understand how it can help with finding with finding the cause? > > Maybe you could add such logging on your machines. > > regards, tom lane
Hello 2013/1/4 Pavel Stehule <pavel.stehule@gmail.com>: > 2013/1/4 Tom Lane <tgl@sss.pgh.pa.us>: >> Pavel Stehule <pavel.stehule@gmail.com> writes: >>> What is state of this issue? >>> http://archives.postgresql.org/pgsql-hackers/2011-09/msg00423.php >> >> AFAICS we never identified the cause. It was pretty clear that >> something was failing to clean up shared-memory lock data structures, >> but not what that something was. The last productive suggestion was >> to add process-exit-time logging of whether unreleased locks remain, >> but if Dave ever did that, he didn't report back what he found. > > > probably I am able to find statement, that was canceled as last > "success" statement from our application logs. And probably it will be > LOCK ... or CREATE TABLE AS SELECT > > Recheck on session end can help us to drop this leaked locks, but I > don't understand how it can help with finding with finding the cause? I am sending a patch that removes a described issue we use this patch on our productions servers with some others diagnostics + * GOODDATA + * + * we would to diagnose how much often is proces startup find + * orphaned PGPROCLOCKS. Don't use ereport, because process is + * not fully prepared in this moment. + */ + { + int i; + bool fix_orphaned_locks = false; + + for (i = 0; i < NUM_LOCK_PARTITIONS; i++) + { + if (!SHMQueueEmpty(&(MyProc->myProcLocks[i]))) + { + fprintf(stderr, "ProcKill: found orphaned PGPROCLOCK proc:%p, pid:%d\n", MyProc, MyProcPid); + fix_orphaned_locks = true; + break; + } + } + + if (fix_orphaned_locks) + { + LockReleaseAll(DEFAULT_LOCKMETHOD, true); + /* Release transaction-level advisory locks */ + LockReleaseAll(USER_LOCKMETHOD, false); + fix_orphaned_locks = false; + + for (i = 0; i < NUM_LOCK_PARTITIONS; i++) + { + if (!SHMQueueEmpty(&(MyProc->myProcLocks[i]))) + { + fprintf(stderr, "ProcKill: found orphaned PGPROCLOCK proc:%p, pid:%d again\n", MyProc, MyProcPid); + fix_orphaned_locks = true; + break; + } + } + + if (fix_orphaned_locks) + fprintf(stderr, "ProcKill: orphaned PGPROCLOCK proc:%p, pid:%d was removed\n", MyProc, MyProcPid); + else + fprintf(stderr, "ProcKill: orphaned PGPROCLOCK proc:%p, pid:%d was not removed\n", MyProc, MyProcPid); + } + } + but it dosn't detect and removed any orphaned locks, so this code is not necessary. I can't to explain why moving initialization from proces to global helps, but it just helps. 9.1 will be suppported next four years so maybe this patch can be usefull for somebody or maybe can be merged Regards Pavel > > > > >> >> Maybe you could add such logging on your machines. >> >> regards, tom lane