Discussion: Coping with nLocks overflow
We have recently seen one definite and one probable report of overflow
of the nLocks field of a backend's local lock table:
http://archives.postgresql.org/pgsql-bugs/2008-09/msg00021.php

While it's still unclear exactly why 8.3 seems more prone to this than
earlier releases, the general shape of the problem seems clear enough.
We have many code paths that intentionally take a lock on some object
and leave it locked until end of transaction. Repeat such a command on
the same object enough times within one transaction, and voila, overflow.
What's news, perhaps, is that we've reached a performance level where
this can actually happen within transactions of lengths that people
might try to run.

I'm considering that a simple solution to this might be to widen nLocks
from int to int64. This would definitely fix it on machines that have
working int64 arithmetic, and if there are any left that do not, they're
probably not fast enough to encounter the overflow in real-world usage
anyway. For machines that aren't native 64-bit it would add a couple of
cycles to lock acquisition/release, but that's hardly likely to be
measurable compared to all the other work done in LockAcquire/LockRelease.

Comments, objections?

regards, tom lane
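The widening proposal above can be illustrated with a minimal sketch. This is not the PostgreSQL source: `DemoLocalLock`, `demo_lock_acquire`, `demo_lock_release` and `demo_exceeds_int32` are hypothetical names invented for the illustration, and the struct carries only the counter. It assumes a platform with a working `int64_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a backend-local lock table entry. */
typedef struct DemoLocalLock
{
    int64_t nLocks;     /* hold count, widened from int as proposed */
} DemoLocalLock;

/* Take the lock once more within the current transaction. */
static void
demo_lock_acquire(DemoLocalLock *lock)
{
    lock->nLocks++;
}

/* Give back one hold; releasing a lock that is not held is a caller bug. */
static void
demo_lock_release(DemoLocalLock *lock)
{
    assert(lock->nLocks > 0);
    lock->nLocks--;
}

/* True once the count has passed what a 32-bit signed counter could hold. */
static bool
demo_exceeds_int32(const DemoLocalLock *lock)
{
    return lock->nLocks > INT32_MAX;
}
```

With a 64-bit counter, a transaction that re-takes the same lock more than INT32_MAX times keeps an exact count instead of overflowing, at the cost of wider arithmetic on machines that are not native 64-bit.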
Tom Lane <tgl@sss.pgh.pa.us> writes:

> We have recently seen one definite and one probable report of overflow
> of the nLocks field of a backend's local lock table:
> http://archives.postgresql.org/pgsql-bugs/2008-09/msg00021.php
> ...
> Comments, objections?

In that case the problem could have been postponed by making nLocks
unsigned. Not much point in that, I guess.

Alternatively, perhaps we could indicate when taking a lock that we
intend to hold the lock until the end of the transaction. In that case
we don't need the usage counter at all and could just mark it with a
special value which we never increment or decrement, and just wait until
we release all locks at the end of transaction?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
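A rough sketch of the "special value" idea suggested above, for illustration only. Nothing here is taken from the PostgreSQL source: `PinLock`, `PIN_HELD_TO_XACT_END`, the `hold_to_xact_end` parameter and the three functions are hypothetical names, and the choice of -1 as the marker is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker: the lock is held until end of transaction. */
#define PIN_HELD_TO_XACT_END (-1)

/* Hypothetical, simplified local lock entry. */
typedef struct PinLock
{
    int nLocks;         /* hold count, or PIN_HELD_TO_XACT_END */
} PinLock;

/* Acquire; the caller states whether it intends to hold to transaction end. */
static void
pin_acquire(PinLock *lock, bool hold_to_xact_end)
{
    if (lock->nLocks == PIN_HELD_TO_XACT_END)
        return;                         /* already pinned: never counted */
    if (hold_to_xact_end)
        lock->nLocks = PIN_HELD_TO_XACT_END;
    else
        lock->nLocks++;                 /* ordinary counted hold */
}

/* Release one hold; returns true if the lock is no longer held at all. */
static bool
pin_release(PinLock *lock)
{
    assert(lock->nLocks != 0);          /* releasing an unheld lock is a bug */
    if (lock->nLocks == PIN_HELD_TO_XACT_END)
        return false;                   /* pinned: stays until xact end */
    lock->nLocks--;
    return lock->nLocks == 0;
}

/* End of transaction: drop everything, pinned or counted. */
static void
pin_release_all(PinLock *lock)
{
    lock->nLocks = 0;
}
```

In this sketch only the pinned path is immune to overflow; holds taken without the flag are still counted in a plain int.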
Gregory Stark <stark@enterprisedb.com> writes:

> Alternatively perhaps we could indicate when taking a lock that we intend to
> hold the lock until the end of the transaction. In that case we don't need the
> usage counter at all and could just mark it with a special value which we
> never increment or decrement just wait until we release all locks at the end
> of transaction?

I considered that, and also considered installing an overflow flag (the
idea being that once nLocks overflows we'd just insist on holding the
lock till transaction end). But the point isn't clear ... I mean, I think
no one but me even believes anymore in the concept of keeping the code
base minimally safe for INT64_IS_BUSTED machines ;-). Given the risk of
creating a bug or masking future lock-acquisition bugs, I thought that
adding complexity here wasn't warranted.

regards, tom lane
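For comparison, the overflow-flag alternative mentioned above (and rejected as not worth the complexity) might look roughly like the following. This is a hedged sketch, not code from PostgreSQL: `SatLock`, its `overflowed` field and the `sat_*` functions are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, simplified local lock entry with an overflow flag. */
typedef struct SatLock
{
    int  nLocks;        /* hold count, exact while !overflowed */
    bool overflowed;    /* set once the count can no longer be tracked */
} SatLock;

/* Acquire; rather than wrapping past INT_MAX, set the flag and stop counting. */
static void
sat_acquire(SatLock *lock)
{
    if (lock->overflowed)
        return;
    if (lock->nLocks == INT_MAX)
    {
        lock->overflowed = true;
        return;
    }
    lock->nLocks++;
}

/* Release one hold; returns true if the lock is no longer held at all. */
static bool
sat_release(SatLock *lock)
{
    if (lock->overflowed)
        return false;                   /* exact count lost: hold to xact end */
    assert(lock->nLocks > 0);
    lock->nLocks--;
    return lock->nLocks == 0;
}

/* End of transaction: release everything and clear the flag. */
static void
sat_xact_end(SatLock *lock)
{
    lock->nLocks = 0;
    lock->overflowed = false;
}
```

The extra branch in every acquire and release, and the fact that a flagged lock silently ignores mismatched releases, show the kind of added complexity and bug-masking risk raised in the message above.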