There is a bit strange error after crashrecovery on prepare halt test on PostgreSQL database. After the rollback attempt the PostgreSQL returns incorrect code XAException.XA_HEURHAZ and throws exception. It happens currently on all versions of Postgresql dbs that we are testing and are installed in lab. This happens just for transaction manager running on JTS. Tom remarks following: "It turns out that stack trace is coming out of postgres so it may be an issue you need to discuss with them? I can tell you they are returning XAException.XA_HEURHAZ [1] when asked to rollback. When we ask them for Xids they are returning the xid, we then call rollback and it gets that exception and prints that stack trace." [1] WARN [com.arjuna.ats.jtax] (RequestProcessor-9) ARJUNA024023: XAResourceRecord.rollback caused an XA error: XAException.XAER_RMERR from resource XAResourceWrapperImpl@5062a5[xaResource=org.jboss.jca.adapters.jdbc.xa.XAManagedConnection@13ea022 pad=false overrideRmValue=null productName=PostgreSQL productVersion=9.0.8 jndiName=java:jboss/xa-datasources/VerificationDS] in transaction < 131072, 29, 36, 0000000000-1-1127001-34-4592-3281-1456-70004049, 292929292929292929292828156292930-5-16121-3110158522292929697829292929292929 >: org.postgresql.xa.PGXAException: Error rolling back prepared transaction at org.postgresql.xa.PGXAConnection.rollback(PGXAConnection.java:416) at org.jboss.jca.adapters.jdbc.xa.XAManagedConnection.rollback(XAManagedConnection.java:342) at org.jboss.jca.core.tx.jbossts.XAResourceWrapperImpl.rollback(XAResourceWrapperImpl.java:174) at com.arjuna.ats.internal.jta.resources.jts.orbspecific.XAResourceRecord.rollback(XAResourceRecord.java:350) [jbossjts-jacorb-4.17.4.Final-redhat-2.jar:4.17.4.Final-redhat-2] at com.arjuna.ArjunaOTS.OTSAbstractRecordPOA._invoke(OTSAbstractRecordPOA.java:230) [jbossjts-jacorb-4.17.4.Final-redhat-2.jar:4.17.4.Final-redhat-2] at org.jacorb.poa.RequestProcessor.invokeOperation(RequestProcessor.java:306) [jacorb-2.3.2-redhat-4.jar:2.3.2-redhat-4] at org.jacorb.poa.RequestProcessor.process(RequestProcessor.java:626) [jacorb-2.3.2-redhat-4.jar:2.3.2-redhat-4] at org.jacorb.poa.RequestProcessor.run(RequestProcessor.java:769) [jacorb-2.3.2-redhat-4.jar:2.3.2-redhat-4] Caused by: org.postgresql.util.PSQLException: ERROR: prepared transaction with identifier "131072_AAAAAAAAAAAAAP//fwAAAd7TXOBR8jj5AAAAKDE=_AAAAAAAAAAAAAP//fwAAAd7TXOBR8jj5AAAALQAAAAAAAAAA" does not exist at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:331) at org.postgresql.xa.PGXAConnection.rollback(PGXAConnection.java:406) ... 7 more
I assume that the error is on the side of the AS and not in the PostgreSQL. There are two rollback calls for the same TX. The stack trace of the first rollback call: Thread [Periodic Recovery] (Suspended (breakpoint at line 350 in XAResourceRecord)) XARecoveryResourceImple(XAResourceRecord).rollback() line: 350 XARecoveryResourceImple(XAResourceRecord).doRecovery(boolean) line: 1335 XARecoveryResourceImple(XAResourceRecord).recover() line: 1158 XARecoveryResourceImple.recover() line: 88 XARecoveryModule(XARecoveryModule).xaRecoverySecondPass(XAResource) line: 661 XARecoveryModule(XARecoveryModule).bottomUpRecovery() line: 419 XARecoveryModule(XARecoveryModule).periodicWorkSecondPass() line: 194 PeriodicRecovery.doWorkInternal() line: 789 PeriodicRecovery.run() line: 371 The stack trace of the second rollback call which causes the error is printed to the log: Daemon Thread [RequestProcessor-10] (Suspended (breakpoint at line 350 in XAResourceRecord)) XARecoveryResourceImple(XAResourceRecord).rollback() line: 350 XARecoveryResourceImple(OTSAbstractRecordPOA)._invoke(String, InputStream, ResponseHandler) line: 230 RequestProcessor.invokeOperation() line: 306 RequestProcessor.process() line: 626 RequestProcessor.run() line: 769 Small part the log from PostgreSQL. LOG: execute <unnamed>: SET extra_float_digits = 3 LOG: execute <unnamed>: SHOW TRANSACTION ISOLATION LEVEL LOG: execute <unnamed>: SELECT 1 LOG: execute <unnamed>: SET extra_float_digits = 3 LOG: execute <unnamed>: SHOW TRANSACTION ISOLATION LEVEL LOG: execute <unnamed>: SELECT 1 LOG: execute <unnamed>: SELECT gid FROM pg_prepared_xacts LOG: execute <unnamed>: SELECT gid FROM pg_prepared_xacts LOG: execute <unnamed>: ROLLBACK PREPARED '131072_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAAKDE=_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAALQAAAAAAAAAA' LOG: execute <unnamed>: SELECT 1 LOG: execute <unnamed>: SELECT 1 LOG: execute <unnamed>: SELECT gid FROM pg_prepared_xacts LOG: execute <unnamed>: SELECT gid FROM pg_prepared_xacts LOG: execute <unnamed>: ROLLBACK PREPARED '131072_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAAKDE=_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAALQAAAAAAAAAA' ERROR: prepared transaction with identifier "131072_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAAKDE=_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAALQAAAAAAAAAA" does not exist STATEMENT: ROLLBACK PREPARED '131072_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAAKDE=_AAAAAAAAAAAAAP//fwAAAdN2921R8nmwAAAALQAAAAAAAAAA'
Hi Ondra, I looked at the code and it is expected that a resource can have rollback called on it twice with JTS due to the way the OTS specification works where it requires both bottom up and top down recovery to be triggered by the recovering resource. The initial value returned by replay_completion is used by the resource to try to recover itself, but the call triggers the transaction to call recovery on everything in the intentions list which will be at least the resource as well but could be others. This means that all resources must be able to have rollback called on them multiple times. It will not cause data integrity problems so I would mark this as expected behavior. Thanks, Tom
Hi Tom, does it mean that PostgreSQL should be able to handle it? Or is it responsibility of JCA or resource adapter? Do you have an idea? Thank you Ondra
Hi Ondra, It means PostGres should be able to handle it. Thanks, Tom
It was confirmed that this is problem of the PostgreSQL jdbc driver (there was a discussion on pgsql-bugs mailing list starting 2013-07-29 under subject 'Incorrect response code after XA recovery'). The pull request was put to fix this: https://github.com/pgjdbc/pgjdbc/pull/76
Hi Ondra, An issue has also been created in the bug tracking system for pgjdbc using GitHub. It is issue #78. https://github.com/pgjdbc/pgjdbc/issues/78 I have sent updates to the pgjdbc and bugs mailing list to inform the community the background to the testcase provided and the pull request. Regards, Jeremy
(In reply to Jeremy Whiting from comment #6) > Hi Ondra, > An issue has also been created in the bug tracking system for pgjdbc using > GitHub. It is issue #78. > > https://github.com/pgjdbc/pgjdbc/issues/78 > > I have sent updates to the pgjdbc and bugs mailing list to inform the > community the background to the testcase provided and the pull request. > > Regards, > Jeremy Hi Jeremy, thank you for fulfilling info. Ondra
Hi Tom, I would like ask you how to process with this bugzilla. I think that it could be set to CLOSED/NOT A BUG (or something similar) as it was not an issue of app server. Information about this would be nice to have in release notes of EAP. I've added some doc info. Would you be so kind to check my text? Thank you Ondra
Hi Ondra, in terms of your text, would it make sense to reference the specific issue in the pgsql issue tracking system? Also, can you confirm it actually was fixed as your text implies it was but I don't think Jeremey's pull request was merged yet?
Hi Tom, I think that it makes sense to refer to the issue. I've changed the doc text. Ouch, my mistake. I haven't checked the status of the issue properly. But for the current state is important to inform users about reason of the exception messages in the log. I'll check the driver behaviour when it will be a built with the fix available. I've changed the doc text in this way. Ondra
Thanks Ondra, looks good to me
(In reply to Ondrej Chaloupka from comment #10) > Hi Tom, > > I think that it makes sense to refer to the issue. I've changed the doc text. > > Ouch, my mistake. I haven't checked the status of the issue properly. > But for the current state is important to inform users about reason of the > exception messages in the log. > I'll check the driver behaviour when it will be a built with the fix > available. I've changed the doc text in this way. > > Ondra Hi Ondra, My patch to the driver was merged into Git master branch of the postgresql jdbc driver. Customers can either use a driver built from the latest master branch or wait for the next release. The current one is '9.2-1002' so anything after that. Regards, Jeremy
Hi, To clarify, this should be a docs issue I think? There is no fix to the Transaction Manager required. Maybe @Lucas can be the assignee? Tom
Or maybe this could be set to closed/not fixed. As the doc test was correctly updated (http://documentation-devel.engineering.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/6.1/html/6.1.1_Release_Notes/ar01s07s03.html) and there is currently no other work from our side?
I wonder, who can close the BZ? Is it ondra as the reporter?
I'm not sure with the bz workflow but I hope that I can do it. In case that this would not be right, please, set another correct status. Thanks. Summary: the problem is not in the Narayana TM but on the side of the PostgreSQL resource adapter. The adapter was changed and the fix should be available in some future release of RAR. Check the comments above.