|Summary:||[RFE] EAP6-17 Inconsistency for recovery when db connection fails running with CMR resource|
|Product:||[JBoss] JBoss Enterprise Application Platform 6||Reporter:||Ondrej Chaloupka <ochaloup>|
|Component:||Transaction Manager||Assignee:||Michael <mmusgrov>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:||Ondrej Chaloupka <ochaloup>|
|Severity:||high||Docs Contact:||Russell Dickenson <rdickens>|
|Version:||6.3.0||CC:||hhovsepy, kkhan, mmusgrov, smumford|
|Target Release:||EAP 6.3.0|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2014-06-28 15:40:06 UTC||Type:||Bug|
|Bug Depends On:||1085877|
Description Ondrej Chaloupka 2014-03-24 20:47:22 UTC
Created attachment 878167 [details] server.log

This seems (as far as I can see) to be a similar issue to bz1080035. The CMR resource fails to commit and returns an outcome that lets the transaction continue. The test XA resource is committed. Then the db connection is restored, and recovery does not push the CMR resource to commit.

Scenario: prepareHalt
Steps:
a. enlist jdbc cmr resource
b. enlist test xa resource
c. prepare jdbc cmr resource
d. kill connection to database
e. prepare test xa resource (no db connection)
f. commit jdbc cmr resource -> fails as connection is down
f-a. jdbc cmr resource throws org.jboss.jca.core.spi.transaction.local.LocalXAException and the transaction continues with TwoPhaseOutcome.FINISH_ERROR
g. commit test xa resource
h. restore the connection to database
i. start recovery

On recovery, doCommit is called on the CMR resource, but it does not seem to cause any change in the database. The database contains the same data as at the start of the transaction, as if rollback had been called. The server log contains an error like:
ERROR [com.arjuna.ats.arjuna] (Periodic Recovery) Update was not successful, expected: 4 actual:1'
Comment 2 Ondrej Chaloupka 2014-03-25 08:21:46 UTC
Created attachment 878338 [details] server.log #prepareHaltBefore

I'm hitting the same trouble when running a similar scenario but with a different order of enlistment of the resources:
1) stop db connection
2) prepare test XA resource
3) prepare db cmr resource

The prepare then fails with an exception like:
ERROR [com.arjuna.ats.arjuna] (EJB default - 6) Could not commit the preparedConnection: org.jboss.jca.core.spi.transaction.local.LocalXAException: IJ001156: Could not commit local transaction

No error 2PC outcome is produced and the transaction carries on. So the test XA resource is committed, but after recovery the CMR resource is left in the state it would be in after a rollback.
Comment 3 Michael 2014-04-01 14:33:15 UTC
The current implementation does not correctly handle the XAException thrown from the commit. You can track my fix in the following external tracker: https://issues.jboss.org/browse/JBTM-2132, which I believe will resolve this test failure. Do note, however, that if the resource returns [XAER_RMFAIL] ("An error occurred that makes the resource manager unavailable.") we carry on committing the remaining resources, since we have no way of knowing what the resource actually did. In the particular scenario you describe I believe that the resource will return XA_RBROLLBACK which, once I have fixed JBTM-2132, will roll back the remaining resources.
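The rule described in this comment can be sketched as a small decision function. This is only an illustration of the comment's wording, not the Narayana implementation: the class CommitOutcomeSketch, the enum NextAction and the method afterFailedCommit are hypothetical names. Only javax.transaction.xa.XAException and its error-code constants are real API.

```java
import javax.transaction.xa.XAException;

// Hypothetical sketch of the rule in comment 3; not Narayana code.
class CommitOutcomeSketch {

    /** What the coordinator does with the resources it has not yet completed. */
    enum NextAction { COMMIT_REMAINING, ROLLBACK_REMAINING }

    /** Maps the error code of a failed commit() to the follow-up action. */
    static NextAction afterFailedCommit(XAException e) {
        switch (e.errorCode) {
            case XAException.XA_RBROLLBACK:
                // The resource reports that it rolled back, so the
                // remaining resources are rolled back as well.
                return NextAction.ROLLBACK_REMAINING;
            case XAException.XAER_RMFAIL:
                // The resource manager is unavailable and its real outcome
                // is unknown, so committing carries on.
                return NextAction.COMMIT_REMAINING;
            default:
                // Assumption: other error codes are not covered by the comment.
                throw new IllegalArgumentException(
                        "error code not covered by this sketch: " + e.errorCode);
        }
    }

    public static void main(String[] args) {
        System.out.println(afterFailedCommit(new XAException(XAException.XA_RBROLLBACK)));
        System.out.println(afterFailedCommit(new XAException(XAException.XAER_RMFAIL)));
    }
}
```

The default branch throws on purpose: the comment only discusses these two codes, so the sketch does not guess at the handling of any other.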
Comment 4 Ondrej Chaloupka 2014-04-02 04:47:13 UTC
Hi Mike, I see. If I understand correctly, when CMR is used the org.jboss.jca.core.tx.jbossts.LocalXAResourceImpl always comes into play, and so XA_RBROLLBACK is returned. Regarding the first scenario, described in comment #c0: I'm just curious whether CMR works in the same way as XA resources. I mean, if XAER_RMFAIL is returned and the transaction continues committing, will periodic recovery ensure that the CMR resource is committed, or will there be an entry in the transaction log with some heuristic status? Thanks
Comment 5 Michael 2014-04-07 18:59:02 UTC
After updating the code to handle the LocalXAException I am now seeing the correct behaviour. Steps f) and g) show the new, correct behaviour:

Steps:
a. enlist jdbc cmr resource
b. enlist test xa resource
c. prepare jdbc cmr resource
d. kill connection to database
e. prepare test xa resource (no db connection)
f. commit jdbc cmr resource -> fails as connection is down
f-a. jdbc cmr resource returns HEURISTIC_ROLLBACK and not FINISH_ERROR (as it did before my fix)
g. the TM now correctly rolls back the test xa resource (instead of erroneously committing it as it did before) and the log is removed (i.e. recovery does not need to do anything with the completed transaction)

This is the correct behaviour. Note that the test JPAProxyCMRCrashRecoveryTestCase#prepareHalt for reproducing the bug needs updating, since it contains a final step where it checks the log and finds the text:
"failed with exception XAException.XA_RBROLLBACK: org.jboss.jca.core.spi.transaction.local.LocalXAException: IJ001156: Could not commit local transaction"
The test says this is unexpected. However, this line is printed as a result of step f), where commit is called on the CMR resource and fails as expected because the connection is down, i.e. this log line is expected.
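The corrected flow in steps f) and g) can be sketched as follows. This is a simplified model under stated assumptions, not the actual fix from JBTM-2132: Participant, Recording and secondPhase are hypothetical names, and the failing CMR commit is modelled as an XAException carrying XA_RBROLLBACK, the code shown in the log line quoted above.

```java
import java.util.Collections;
import java.util.List;
import javax.transaction.xa.XAException;

// Hypothetical sketch; not Narayana or IronJacamar code.
class CmrCommitOrderSketch {

    /** Minimal stand-in for a prepared transaction participant. */
    interface Participant {
        void commit() throws XAException;
        void rollback();
    }

    /** Test double that records the last thing the coordinator did to it. */
    static class Recording implements Participant {
        final boolean connectionDown;
        String state = "PREPARED";

        Recording(boolean connectionDown) {
            this.connectionDown = connectionDown;
        }

        @Override
        public void commit() throws XAException {
            if (connectionDown) {
                state = "COMMIT_FAILED";
                throw new XAException(XAException.XA_RBROLLBACK);
            }
            state = "COMMITTED";
        }

        @Override
        public void rollback() {
            state = "ROLLED_BACK";
        }
    }

    /**
     * Second phase: the CMR participant is committed first. Returns true if
     * everything committed, false if the CMR commit reported a rollback and
     * the remaining participants were rolled back to match.
     */
    static boolean secondPhase(Participant cmr, List<Participant> others) {
        try {
            cmr.commit();
        } catch (XAException e) {
            if (e.errorCode != XAException.XA_RBROLLBACK) {
                // Assumption: other error codes are outside this sketch.
                throw new IllegalStateException("uncovered error code " + e.errorCode, e);
            }
            for (Participant p : others) {
                p.rollback();
            }
            return false;
        }
        for (Participant p : others) {
            try {
                p.commit();
            } catch (XAException e) {
                // A real TM would leave this to recovery; the sketch just fails.
                throw new IllegalStateException("commit failed after CMR committed", e);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Recording cmr = new Recording(true);      // db connection is down
        Recording testXa = new Recording(false);
        List<Participant> others = Collections.singletonList(testXa);
        boolean committed = secondPhase(cmr, others);
        System.out.println(committed + " " + cmr.state + " " + testXa.state);
    }
}
```

In this model the test XA resource ends up rolled back whenever the CMR commit fails, which matches the outcome described in step g). Before the fix the equivalent of the rollback loop was a commit, which is what left the two resources inconsistent.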
Comment 6 JBoss JIRA Server 2014-04-09 15:03:35 UTC
Tom Jenkinson <firstname.lastname@example.org> updated the status of jira JBTM-2132 to Closed
Comment 7 Ondrej Chaloupka 2014-04-23 13:45:07 UTC
Verified for EAP 6.3.0.ER2.
Comment 8 Scott Mumford 2014-05-13 23:49:18 UTC
Michael, could you please provide a draft release note in the Doc Text field above? I'm unable to clearly discern from this or the linked tickets what the problem was, what caused it and how it was fixed. Unfortunately time is of the essence here if this issue is to make it into the 6.3.0 Beta Release Notes document.
Comment 9 Michael 2015-02-02 12:58:58 UTC
(In reply to Scott Mumford from comment #8)
> Michael, could you please provide a draft release note in the Doc Text field
> above, as I'm unable to clearly discern what the problem was, what caused it
> and how it was fixed from this or linked tickets.
>
> Unfortunately time is of the essence here if this issue is to make it into
> the 6.3.0 Beta Release Notes document.

Hi Scott, I am assuming this is historical and you no longer need a response.