Обсуждение: Very long " in transaction" query
Hi,
Recently, in our Production server, we found a "single query" being held up
in "<IDLE> in transaction" for more than 19 hours using the following query:
select date_trunc('second', current_timestamp - query_start) as runtime,
datname as database_name, current_query from pg_stat_activity where
current_query != '<IDLE>' order by 1 desc
but we're clueless which was the root cause of this issue and still hunting.
As we know, query output doesn't show up the actual query/statement.
We then ran the 3rd query available from PostgreSQL Wiki - Lock Monitoring
http://wiki.postgresql.org/wiki/Lock_Monitoring From query result output, I
could infer only the following but still not able to find out the real root
cause:
1) 2 tables are involved (table1 and table2)
2) Mostly table1's indexes are appearing in the output.
Pasted below result output containing only "<IDLE> in transaction". For
security reasons, I've masked/renamed table names and index names in
"relname" column. Though all index names are renamed to as
"table1_xxxxx_indx", all are different index names and not the same index.
Yes, we do have more than 30 indexes in table1.
Can somebody help me out what is going wrong/causing these "<IDLE> in
transaction"?
datname | relname | transactionid | mode |
granted | usename | substr | query_start
| age | procpid
---------+-------------------------------+---------------+-----------------+
---------+---------+--------------------------------+-----------------------
--------+------------------+---------
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table2 | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_pk | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_idx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx_indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx__indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1 | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | | 668748028 | ExclusiveLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx__indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
prodndb | table1_xxxxx__indx | | AccessShareLock |
t | dbuser | <IDLE> in transaction | 2012-05-01
14:56:53.617912-04 | 19:20:59.644261 | 14740
But no rows were returned for the 2nd query available in PostgreSQL Wiki -
Lock Monitoring.
We're running PostgreSQL v8.2.22 and pgpool v3.1.1 (only connection pooling
feature is used).
Regards,
Gnanam
On Thu, 3 May 2012 15:48:35 +0530, Gnanakumar wrote: > Hi, > > Recently, in our Production server, we found a "single query" being > held up > in "<IDLE> in transaction" for more than 19 hours using the following > query: [...] Usually these are application problems, not database ones. E.g. a piece of code not cathing an error correctly or breaking out of a loop or something like that and leaving the transaction open (not issuing COMMIT or ROLLBACK). Seens this quite often in production and it was always tracebla back to an application error. To find out what the session has last done before it started idling you would need to turn on statement logging. Then at least you might be able to trace down in which part of you application this was caused. Jan -- professional: http://www.oscar-consult.de private: http://neslonek.homeunix.org/drupal/
Gnanakumar wrote:
> Recently, in our Production server, we found a "single query" being
held up
> in "<IDLE> in transaction" for more than 19 hours using the following
query:
> select date_trunc('second', current_timestamp - query_start) as
runtime,
> datname as database_name, current_query from pg_stat_activity where
> current_query != '<IDLE>' order by 1 desc
>
> but we're clueless which was the root cause of this issue and still
hunting.
> As we know, query output doesn't show up the actual query/statement.
You won't be able to find the cause in PostgreSQL.
The cause is a database session that started a transaction, did some
work and never closed the transaction.
PostgreSQL can help you find out who the offending client is:
SELECT application_name, client_addr, client_hostname, client_port
FROM pg_stat_activity
WHERE procpid = 14740;
(Replace 14740 of the process ID of the "idle in transaction" backend).
Look on the client machine and find the process that holds TCP port
"client_port" open (on Linux you can use "lsof" for that).
Then you have found the culprit!
Yours,
Laurenz Albe
> SELECT application_name, client_addr, client_hostname, client_port > FROM pg_stat_activity > WHERE procpid = 14740; > (Replace 14740 of the process ID of the "idle in transaction" backend). > Look on the client machine and find the process that holds TCP port > "client_port" open (on Linux you can use "lsof" for that). Ours is a web-based application and all calls are made "only" from where the web server is running. No external calls are allowed in my case. Hence, the "only" client machine in my case is the web server.
Gnanakumar wrote: >> SELECT application_name, client_addr, client_hostname, client_port >> FROM pg_stat_activity >> WHERE procpid = 14740; >> (Replace 14740 of the process ID of the "idle in transaction" backend). >> Look on the client machine and find the process that holds TCP port >> "client_port" open (on Linux you can use "lsof" for that). > Ours is a web-based application and all calls are made "only" from where the > web server is running. No external calls are allowed in my case. Hence, > the "only" client machine in my case is the web server. Then you need to identify the web server process/thread that has client_port open and fix the problem there. If you get that process to close the transaction or terminate, your problem is solved. You can also terminate a database session from the server with the pg_terminate_backend function. The interesting thing to know would be where in your web application a transaction got left open, so that it does not happen again. Yours, Laurenz Albe
> Then you need to identify the web server process/thread that has > client_port open and fix the problem there. If you get that process > to close the transaction or terminate, your problem is solved. > You can also terminate a database session from the server with > the pg_terminate_backend function. Thanks for letting me know the existence of "pg_terminate_backend(pid int)" function. Unfortunately, this is available only from v8.4 onwards. We're running v8.2.22. > The interesting thing to know would be where in your web application > a transaction got left open, so that it does not happen again. Sure, we too are hunting out for the same.
On Fri, May 04, 2012 at 03:22:47PM +0530, Gnanakumar wrote: > > SELECT application_name, client_addr, client_hostname, client_port > > FROM pg_stat_activity > > WHERE procpid = 14740; > > > (Replace 14740 of the process ID of the "idle in transaction" backend). > > > Look on the client machine and find the process that holds TCP port > > "client_port" open (on Linux you can use "lsof" for that). > > Ours is a web-based application and all calls are made "only" from where the > web server is running. No external calls are allowed in my case. Hence, > the "only" client machine in my case is the web server. > You may also want to consider setting a statement_timeout to prevent this until you can find the problem with the application. Regards, Ken
"ktm@rice.edu" <ktm@rice.edu> wrote: > You may also want to consider setting a statement_timeout to > prevent this until you can find the problem with the application. How would a statement timeout help close a transaction on an idle connection? It's idle because no statements are being run. -Kevin
On Fri, May 04, 2012 at 08:57:08AM -0500, Kevin Grittner wrote: > "ktm@rice.edu" <ktm@rice.edu> wrote: > > > You may also want to consider setting a statement_timeout to > > prevent this until you can find the problem with the application. > > How would a statement timeout help close a transaction on an idle > connection? It's idle because no statements are being run. > > -Kevin Yes, you are correct. I misunderstood. Too bad we do not have a transaction_timeout. Regards, Ken
Hi Gnanam,
I assume your not ending your transactions with commit or rollback. When
using a connection pooler your connection is reused for next request. So
if one request does something that initiates an implizit BEGIN (like an
UPDATE/INSERT) without a COMMIT or ROLLBACK this connection remains in
IDLE IN TRANSACTION mode. Please ensure that you invoke a COMMIT after
each successful request or ROLLBACK after failed or control these inside
your application. I'm not familiar with pgpool but check if you can
configure your pooler for this behaviour
Thomas
Am 03.05.2012 12:18, schrieb Gnanakumar:
> Hi,
>
> Recently, in our Production server, we found a "single query" being held up
> in "<IDLE> in transaction" for more than 19 hours using the following query:
> select date_trunc('second', current_timestamp - query_start) as runtime,
> datname as database_name, current_query from pg_stat_activity where
> current_query != '<IDLE>' order by 1 desc
>
> but we're clueless which was the root cause of this issue and still hunting.
> As we know, query output doesn't show up the actual query/statement.
<snip>