Bug 1080457
Summary: | [RFE] EAP6-17 Inconsistency for recovery being run after the CMR resource is already committed | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | Ondrej Chaloupka <ochaloup> | ||||||
Component: | Transaction Manager | Assignee: | Michael <mmusgrov> | ||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Ondrej Chaloupka <ochaloup> | ||||||
Severity: | high | Docs Contact: | Russell Dickenson <rdickens> | ||||||
Priority: | unspecified | ||||||||
Version: | 6.3.0 | CC: | hhovsepy, kkhan, mmusgrov, smumford | ||||||
Target Milestone: | ER2 | ||||||||
Target Release: | EAP 6.3.0 | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: |
Previous versions of JBoss EAP 6 contained a bug in the implementation of the Commit Markable Resource (CMR) recovery module that could cause changes to be rolled back, rather than recovered.
The expected behavior is for the CMR recovery module to move the record into a different part of the recovery store so it is ignored by other recovery modules, as only the CMR module is aware of the resource changes. If the connection to the database failed then the datasource did not get added to the collection `queriedResourceManagers` and the record did not get moved. As a result, a different recovery module would attempt to recover the transaction and the recovery would not occur as expected.
In this release of the product the code has been modified to ensure that the datasource is added as required, even if the connection fails.
|
Story Points: | --- | ||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2014-06-28 15:42:07 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | 1085877 | ||||||||
Bug Blocks: | 1051640 | ||||||||
Attachments: |
|
Created attachment 878462: ds.properties file for Oracle

This can be reproduced with a test case from the crash recovery test suite:

    git clone -b lrco git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-transactions.git
    cd eap-tests-transactions/jbossts
    export JBOSS_HOME=path/to/jboss-eap
    mvn clean verify -Dtest=JPAProxyCMRCrashRecoveryTestCase#commitHaltRecoveryProxyHalted -Dno.cleanup.at.teardown -Djbossts.noJTS -Dds.properties=path/to/ds.properties

JBoss EAP 6.3.0.DR5 can be downloaded from:
http://download.devel.redhat.com/devel/candidates/JBEAP/JBEAP-6.3.0.DR5/jboss-eap-6.3.0.DR5.zip

I've just found that the order of operations is slightly different. The difference is in the order of the prepare calls: the CMR resource is the first resource to be prepared. The correct order looks like this:
1) enlist test xa resource
2) enlist cmr db resource
3) prepare cmr db resource
4) prepare test xa resource
5) commit cmr db resource
... (then see #c0)

For clarity: the proxy is a simple Java socket proxy program that only transfers data from input to output and lets us kill the connection, that is, simulate connection failures. The term "stop proxy" means that the connection to the database goes down.

This is a bug in the implementation. The CMR recovery module should move the record into a different part of the recovery store so that it is ignored by the other recovery modules. If the connection to the database fails, the datasource does not get added to the collection queriedResourceManagers, so in "Stage 2" of the algorithm the record does not get moved. The result of this bug is that one of the next recovery modules tries to recover the transaction instead of the CMR recovery module, which is the only one that knows how to determine what the CMR resource did. The fix is to move the line where we update the queriedResourceManagers list into the finally block, so the datasource gets added even if the connection fails.
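The fix described above can be illustrated with a minimal Java sketch. This is not the actual Narayana code: the class `CmrRecoverySketch`, the interface `CommitMarkerQuery`, and the methods `stageOne` and `shouldMoveRecord` are hypothetical names chosen for illustration. Only the collection name `queriedResourceManagers` and the idea of updating it in a `finally` block come from this report.

```java
import java.sql.SQLException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CmrRecoverySketch {

    /** Hypothetical stand-in for "connect to the datasource and read its commit markers". */
    interface CommitMarkerQuery {
        List<String> fetchCommittedIds() throws Exception;
    }

    // Name taken from the report; the element type is an assumption.
    final Set<String> queriedResourceManagers = new HashSet<>();
    final Set<String> committedIds = new HashSet<>();

    /** "Stage 1" (assumed shape): ask every CMR datasource which transactions it committed. */
    void stageOne(Map<String, CommitMarkerQuery> dataSources) {
        for (Map.Entry<String, CommitMarkerQuery> entry : dataSources.entrySet()) {
            try {
                committedIds.addAll(entry.getValue().fetchCommittedIds());
            } catch (Exception e) {
                // The connection failed, so the outcome for this datasource is unknown for now.
            } finally {
                // Before the fix, the equivalent update sat inside the try block and was
                // skipped on failure. In the finally block it always runs.
                queriedResourceManagers.add(entry.getKey());
            }
        }
    }

    /** "Stage 2" (assumed shape): a record is moved only if its datasource was registered. */
    boolean shouldMoveRecord(String dataSourceName) {
        return queriedResourceManagers.contains(dataSourceName);
    }

    public static void main(String[] args) {
        Map<String, CommitMarkerQuery> dataSources = new LinkedHashMap<>();
        dataSources.put("reachableDs", () -> Arrays.asList("tx-1"));
        dataSources.put("unreachableDs", () -> {
            throw new SQLException("connection refused");
        });

        CmrRecoverySketch sketch = new CmrRecoverySketch();
        sketch.stageOne(dataSources);
        System.out.println("move record for unreachableDs: "
                + sketch.shouldMoveRecord("unreachableDs"));
    }
}
```

With the update in the `finally` block, the datasource behind the stopped proxy is still registered, so the record is moved and the other recovery modules leave it alone.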
The problem was that our code for detecting "orphaned" transactions was processing the transaction log and ignoring the "commit marker" on the embedded record. The linked external bug tracker (JBTM-2132) fixes that oversight.

Tom Jenkinson <tom.jenkinson> updated the status of jira JBTM-2132 to Closed

Verified for EAP 6.3.0.ER2.

I've added a draft release note based on the information found in this ticket.
Michael, can you please review the draft and amend as required? I found parsing the data challenging.

(In reply to Scott Mumford from comment #8)
> I've added a draft release note based on the information found in this
> ticket.
> Michael, can you please review the draft and amend as required? I found
> parsing the data challenging.

Hi Scott
I am assuming this is historical and you no longer need a response.
Created attachment 878461: server.log

It seems that recovery with a CMR resource as part of the transaction could cause data inconsistency.

The test looks like this:
1) enlist test xa resource
2) enlist cmr db resource
3) prepare test xa resource
4) prepare cmr db resource
5) commit cmr db resource
6) crash app server
7) start server with recovery being stopped (byteman waiting on signal)
8) stop proxy
9) do recovery of test xa resource

The test XA resource is rolled back instead of being committed, even though the CMR resource was already committed.
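The expected outcome of this scenario can be sketched as a small decision rule. This is an illustration only, not the transaction manager's code: the class `CmrOutcomeSketch`, the enum `Decision`, and the method `decide` are hypothetical. The `RETRY_LATER` branch is an assumption about what "not rolling back" means while the database is unreachable. The report itself only states that the XA resource must not be rolled back once the CMR resource has committed.

```java
import java.util.Optional;

class CmrOutcomeSketch {

    enum Decision { COMMIT, ROLLBACK, RETRY_LATER }

    /**
     * Decide what to do with the remaining XA resource during recovery.
     *
     * @param cmrCommitted whether the CMR resource committed; empty when the
     *                     database could not be queried (for example, the proxy is stopped)
     */
    static Decision decide(Optional<Boolean> cmrCommitted) {
        if (!cmrCommitted.isPresent()) {
            // Assumption: with the outcome unknown, leave the transaction in doubt
            // and try again on a later recovery pass instead of rolling back.
            return Decision.RETRY_LATER;
        }
        // The CMR resource determines the outcome of the whole transaction.
        return cmrCommitted.get() ? Decision.COMMIT : Decision.ROLLBACK;
    }

    public static void main(String[] args) {
        // Step 8 of the test stops the proxy, so the CMR state cannot be read.
        System.out.println(decide(Optional.empty()));
        // Once the database is reachable again, the committed CMR resource is visible.
        System.out.println(decide(Optional.of(true)));
    }
}
```

In the reported failure, the behavior corresponded to choosing a rollback in the first case, which contradicts the commit already performed on the CMR resource.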