Discussion: Incorrect response code after XA recovery
Hi, I would like to consult with you about a problematic response returned by PostgreSQL after transaction recovery run by Narayana (JBossTS). I work on tests for Narayana and I hit an issue with PostgreSQL. The db returns the incorrect code XAException.XA_HEURHAZ when the TM does recovery after a crash of the JBoss EAP app server. The exception is the following: Caused by: org.postgresql.util.PSQLException: ERROR: prepared transaction with identifier "131072_AAAAAAAAAAAAAP//fwAAAd7TXOBR8jj5AAAAKDE=_AAAAAAAAAAAAAP//fwAAAd7TXOBR8jj5AAAALQAAAAAAAAAA" does not exist It's run on PostgreSQL 9.2 but the older versions seem to be affected as well. The problem occurs when the TM runs on JTS transactions. The idea of the test: The test enlists two resources in a transaction. Prepare is called on the PostgreSQL resource. The app server crashes before prepare is called on the second transaction participant. After restart of the app server the TM tries to recover the transaction. As the failure occurs during the prepare phase, rollback is expected. The OTS specification requires both bottom up and top down recovery to be triggered by the recovering resource. This causes two rollback calls to be made against the DB. The DB receives the rollback call and does the rollback. Then, the second time, it returns the exceptional code. As the DB has already rolled back the transaction and forgotten about it, the DB returns an error that no such transaction exists. But this seems to be against the OTS specification. There are some more details in the following bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=988724 Do you have some experience with such behaviour? Can I suppose this is a problem of PostgreSQL? Or is there already some bug for this issue in the Postgres bug tracking system? Thank you Ondra
Ondrej Chaloupka <ochaloup@redhat.com> writes: > The OTS specification requires both bottom up and top down recovery to be triggered by the recovering resource. This causes two rollback calls to be made against the DB. The DB receives the rollback call and does the rollback. Then, the second time, it returns the exceptional code. As the DB has already rolled back the transaction and forgotten about it, the DB returns an error that no such transaction exists. But this seems to be against the OTS specification. It's not likely that we would consider changing the behavior of ROLLBACK PREPARED. The alternatives we would have are (1) silently accept a ROLLBACK against a non-existent transaction ID, or (2) remember every rolled-back ID forever. Neither seems sane in the least. It seems to me that this is something client-side code, probably the XA manager, would need to deal with. The XA manager already has to track uncommitted 2-phase transactions, and would furthermore have the best idea of when it would be safe to forget about a rolled-back ID. Right offhand it appears to me that that Red Hat bug is filed against the correct component, and you need to push them harder to fix their bug/shortcoming rather than claim it's our problem. regards, tom lane
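A minimal sketch of the client-side handling suggested above, assuming a hypothetical in-memory helper (`RollbackTracker` and `RollbackAction` are invented for illustration and are not part of Narayana or the JDBC driver). The manager remembers which branch identifiers it has already rolled back, so a second recovery pass for the same branch never reaches the database. A real transaction manager would keep this state in its durable log and decide for itself when an entry can be forgotten.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, for illustration only.
public class RollbackTracker {
    // Branch identifiers that have been (or are being) rolled back.
    private final Set<String> rolledBack = ConcurrentHashMap.newKeySet();

    /** Stand-in for the real call, e.g. XAResource.rollback(xid). */
    public interface RollbackAction {
        void run() throws Exception;
    }

    /**
     * Runs the rollback only the first time a branch id is seen.
     * Returns true if the action was executed, false if it was skipped.
     */
    public boolean rollbackOnce(String branchId, RollbackAction action) throws Exception {
        if (!rolledBack.add(branchId)) {
            return false; // a rollback for this branch was already issued
        }
        try {
            action.run();
            return true;
        } catch (Exception e) {
            rolledBack.remove(branchId); // rollback failed, allow a later retry
            throw e;
        }
    }

    /** To be called once no further recovery pass can refer to the branch. */
    public void forget(String branchId) {
        rolledBack.remove(branchId);
    }
}
```

With this in place, the bottom-up and top-down recovery passes can both call `rollbackOnce` with the same identifier, and only the first call issues ROLLBACK PREPARED.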
Tom Jenkinson <tom.jenkinson@redhat.com> writes: > A little bit of information in the linked bugzilla report is that the > exception being returned has an XA error code of XAER_RMERR "An error > occurred in rolling back the transaction branch. The resource manager is > free to forget about the branch when returning this error so long as all > accessing threads of control have been notified of the branch's state." > That does not sound right to me, wouldn't XAER_NOTA "The specified XID > is not known by the resource manager" be more accurate? No idea, but in any case that's outside Postgres' purview. It's barely possible that the Postgres JDBC driver has something to do with that, but it sounds more like the XA manager's turf. regards, tom lane
Hi Tom, A little bit of information in the linked bugzilla report is that the exception being returned has an XA error code of XAER_RMERR "An error occurred in rolling back the transaction branch. The resource manager is free to forget about the branch when returning this error so long as all accessing threads of control have been notified of the branch’s state." That does not sound right to me, wouldn't XAER_NOTA "The specified XID is not known by the resource manager" be more accurate? Thanks, Tom On 29/07/13 14:50, Tom Lane wrote: > Ondrej Chaloupka <ochaloup@redhat.com> writes: >> The OTS specification requires both bottom up and top down recovery to be triggered by the recovering resource. This causes two rollback calls to be made against the DB. The DB receives the rollback call and does the rollback. Then, the second time, it returns the exceptional code. As the DB has already rolled back the transaction and forgotten about it, the DB returns an error that no such transaction exists. But this seems to be against the OTS specification. > It's not likely that we would consider changing the behavior of ROLLBACK > PREPARED. The alternatives we would have are (1) silently accept a > ROLLBACK against a non-existent transaction ID, or (2) remember every > rolled-back ID forever. Neither seems sane in the least. > > It seems to me that this is something client-side code, probably the XA > manager, would need to deal with. The XA manager already has to track > uncommitted 2-phase transactions, and would furthermore have the best > idea of when it would be safe to forget about a rolled-back ID. > > Right offhand it appears to me that that Red Hat bug is filed against > the correct component, and you need to push them harder to fix their > bug/shortcoming rather than claim it's our problem. > > regards, tom lane
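To illustrate why the distinction between the two codes matters to a recovering transaction manager, here is a rough sketch of how the error code from `XAResource.rollback` might be interpreted. The class `RollbackOutcome` and its policy are assumptions made for this example, not Narayana's actual logic: it treats XAER_NOTA during a recovery rollback as "nothing left to roll back", XAER_RMFAIL as a reason to retry later, and everything else, including XAER_RMERR, as a failure that needs attention.

```java
import javax.transaction.xa.XAException;

// Hypothetical classification helper, for illustration only.
public class RollbackOutcome {
    public enum Outcome { DONE, RETRY, FAILED }

    /** Interprets the error code of an XAException thrown by a recovery rollback. */
    public static Outcome classify(XAException e) {
        switch (e.errorCode) {
            case XAException.XAER_NOTA:
                // The resource manager does not know the Xid. Assumed policy:
                // the branch is already gone, so the rollback counts as complete.
                return Outcome.DONE;
            case XAException.XAER_RMFAIL:
                // Resource manager unavailable: try again on the next recovery pass.
                return Outcome.RETRY;
            default:
                // XAER_RMERR and other codes give no assurance about the branch state.
                return Outcome.FAILED;
        }
    }
}
```

Under such a policy, a repeated rollback that surfaces as XAER_NOTA completes recovery quietly, while the same situation reported as XAER_RMERR is indistinguishable from a real failure.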
Hi Tom, On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: > Tom Jenkinson <tom.jenkinson@redhat.com> writes: >> A little bit of information in the linked bugzilla report is that the >> exception being returned has an XA error code of XAER_RMERR "An error >> occurred in rolling back the transaction branch. The resource manager is >> free to forget about the branch when returning this error so long as all >> accessing threads of control have been notified of the branch’s state." > >> That does not sound right to me, wouldn't XAER_NOTA "The specified XID >> is not known by the resource manager" be more accurate? > > No idea, but in any case that's outside Postgres' purview. It's barely > possible that the Postgres JDBC driver has something to do with that, > but it sounds more like the XA manager's turf. I am not sure what you mean here as I don't know the structure of how the Postgres project is packaged; all I know is that the Postgres JDBC driver component appears to be returning an XAException with the message "Error rolling back prepared transaction" and an errorCode of XAException.XAER_RMERR rather than XAER_NOTA. Is there a different component within your bug tracking system we should be using to raise this against the JDBC driver instead? Thanks, Tom
On Jul 29, 2013, at 16:57, Tom Jenkinson <tom.jenkinson@redhat.com> wrote: > Hi Tom, > > On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: >> Tom Jenkinson <tom.jenkinson@redhat.com> writes: >>> A little bit of information in the linked bugzilla report is that the >>> exception being returned has an XA error code of XAER_RMERR "An error >>> occurred in rolling back the transaction branch. The resource manager is >>> free to forget about the branch when returning this error so long as all >>> accessing threads of control have been notified of the branch’s state." >> >>> That does not sound right to me, wouldn't XAER_NOTA "The specified XID >>> is not known by the resource manager" be more accurate? >> >> No idea, but in any case that's outside Postgres' purview. It's barely >> possible that the Postgres JDBC driver has something to do with that, >> but it sounds more like the XA manager's turf. > > I am not sure what you mean here as I don't know the structure of how the Postgres project is packaged; all I know is that the Postgres JDBC driver component appears to be returning an XAException with the message "Error rolling back prepared transaction" and an errorCode of XAException.XAER_RMERR rather than XAER_NOTA. Looking at the error codes, it appears that it isn't even the Postgres JDBC driver returning that error, but the XA manager you're using, which is not a part of Postgres (nor is the JDBC driver, for that matter - that's a separate project). The errors you're quoting are from the XA manager and are about XA manager stuff. For all we know, the actual error appears to be occurring in the XA manager and not in Postgres. It's possible that the XA manager error is a result of an error that Postgres returned, but since the XA manager prints its own error message and not the original one, you'll need to uncover those error messages before we can help you with them. For all we know at this point, the error is with your XA manager, not with Postgres.
If you want to be sure, grep the source of the JDBC driver for those error codes; I doubt you'll find them in there. Google was kind enough to point me here: http://jdbc.postgresql.org/development/git.html Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
Hi Alban, I stripped down the code to a raw XA example using the latest postgres driver available in maven central. It demonstrates that regardless of what the codebase might suggest, it is certainly the case that postgres is returning XAER_RMERR in the scenario where the resource manager no longer knows about the Xid. The code is available here: https://github.com/tomjenkinson/xa-recovery/commit/944d45e86a91eacb9489843acfbf6a80f1b4b820 I hope that this helps, Tom On Mon 29 Jul 2013 18:52:31 BST, Alban Hertroys wrote: > On Jul 29, 2013, at 16:57, Tom Jenkinson <tom.jenkinson@redhat.com> wrote: > >> Hi Tom, >> >> On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: >>> Tom Jenkinson <tom.jenkinson@redhat.com> writes: >>>> A little bit of information in the linked bugzilla report is that the >>>> exception being returned has an XA error code of XAER_RMERR "An error >>>> occurred in rolling back the transaction branch. The resource manager is >>>> free to forget about the branch when returning this error so long as all >>>> accessing threads of control have been notified of the branch’s state." >>> >>>> That does not sound right to me, wouldn't XAER_NOTA "The specified XID >>>> is not known by the resource manager" be more accurate? >>> >>> No idea, but in any case that's outside Postgres' purview. It's barely >>> possible that the Postgres JDBC driver has something to do with that, >>> but it sounds more like the XA manager's turf. >> >> I am not sure what you mean here as I don't know the structure of how the Postgres project is packaged; all I know is that the Postgres JDBC driver component appears to be returning an XAException with the message "Error rolling back prepared transaction" and an errorCode of XAException.XAER_RMERR rather than XAER_NOTA.
> > > Looking at the error codes, it appears that it isn't even the Postgres JDBC driver returning that error, but the XA manager you're using, which is not a part of Postgres (nor is the JDBC driver, for that matter - that's a separate project). > > The errors you're quoting are from the XA manager and are about XA manager stuff. For all we know, the actual error appears to be occurring in the XA manager and not in Postgres. It's possible that the XA manager error is a result of an error that Postgres returned, but since the XA manager prints its own error message and not the original one, you'll need to uncover those error messages before we can help you with them. > > For all we know at this point, the error is with your XA manager, not with Postgres. > > If you want to be sure, grep the source of the JDBC driver for those error codes; I doubt you'll find them in there. > Google was kind enough to point me here: http://jdbc.postgresql.org/development/git.html > > Alban Hertroys > -- > If you can't see the forest for the trees, > cut the trees and you'll find there is no forest. >
Tom Jenkinson wrote: > Hi Alban, > > I stripped down the code to a raw XA example using the latest > postgres driver available in maven central. It demonstrates that > regardless of what the codebase might suggest, it is certainly the > case that postgres is returning XAER_RMERR in the scenario where the > resource manager no longer knows about the Xid. > > The code is available here: > https://github.com/tomjenkinson/xa-recovery/commit/944d45e86a91eacb9489843acfbf6a80f1b4b820 Those error codes do certainly appear in the PGXAConnection.java source in the pgjdbc git. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
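Since the codes originate in the driver's XA layer, the mapping asked for earlier in the thread could in principle be done there. The following is only a sketch of that idea and not the actual PGXAConnection code: `XaErrorMapping` is a hypothetical class, and it assumes the server reports the "prepared transaction ... does not exist" error with SQLSTATE 42704 (undefined_object), which should be checked against the server in use.

```java
import java.sql.SQLException;
import javax.transaction.xa.XAException;

// Hypothetical mapping helper, for illustration only.
public class XaErrorMapping {
    // Assumption: SQLSTATE reported when ROLLBACK PREPARED names an unknown transaction.
    static final String UNDEFINED_OBJECT = "42704";

    /** Wraps a failed ROLLBACK PREPARED in an XAException with a more specific code. */
    public static XAException forRollbackFailure(SQLException cause) {
        int code = UNDEFINED_OBJECT.equals(cause.getSQLState())
                ? XAException.XAER_NOTA   // the Xid is not known to the resource manager
                : XAException.XAER_RMERR; // any other failure while rolling back
        XAException xa = new XAException(code);
        xa.initCause(cause); // keep the original server error visible to the caller
        return xa;
    }
}
```

Keeping the original SQLException as the cause also addresses the earlier complaint that the underlying server error message was hidden behind the XA error.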