Bug 1080035

Summary: Inconsistency for recovery when db connection fails for Oracle database when running on JTS
Product: [JBoss] JBoss Enterprise Application Platform 6
Reporter: Ondrej Chaloupka <ochaloup>
Component: Transaction Manager
Assignee: Gytis Trikleris <gtrikler>
Status: CLOSED CURRENTRELEASE
QA Contact: Hayk Hovsepyan <hhovsepy>
Severity: high
Priority: unspecified
Version: 6.3.0
CC: hhovsepy, kkhan, mmusgrov, ochaloup, tom.jenkinson
Target Milestone: DR12
Target Release: EAP 6.4.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Bug Depends On: 1143956
Attachments:
Description                                                      Flags
server.log                                                       none
JPAProxyCrashRecoveryTestCase_prepareHalt_jts_server.log.html    none
server.log with -Dno.recovery.scan                               none
server.db2-10.log                                                none
server.log for oracle as html page                               none
server.log for oracle as html page                               none

Description Ondrej Chaloupka 2014-03-24 14:46:28 UTC
Created attachment 878088 [details]
server.log

I'm hitting an inconsistency in crash recovery when JTS is used. With an Oracle database, when the connection fails before the commit is done, recovery finishes with a rollback of the JDBC resource although the test XA resource was in fact already committed.

Scenario: prepareHalt
Steps:
a. enlist the JDBC XA resource
b. enlist the test XA resource
c. prepare the JDBC XA resource
d. kill the connection to the database
e. prepare the test XA resource (no DB connection)
f. commit the JDBC XA resource -> fails as the connection is down
f-a. the JDBC XA resource returns XAException.XAER_RMFAIL
g. commit the test XA resource
h. restore the connection to the database
i. start recovery
As mentioned in bz1077216, three rounds of recovery are needed to go through all the Xids being recovered. During recovery, however, the DB resource is rolled back instead of being committed, as the test XA resource was after the connection went down.
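
The expected outcome of the steps above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `Participant` and `commitAll` are hypothetical stand-ins and not the Narayana or XAResource API, and the retry policy shown (keep a failed participant pending and commit it again later) is the consistent behavior the report expects, not the transaction manager's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.transaction.xa.XAException;

public class PrepareHaltSketch {

    /** Hypothetical stand-in for an XA participant (not the real XAResource API). */
    public static class Participant {
        public final String name;
        public boolean connected = true;
        public String state = "ACTIVE";

        public Participant(String name) {
            this.name = name;
        }

        public void prepare() {
            state = "PREPARED";
        }

        public void commit() throws XAException {
            if (!connected) {
                // The error code the report says Oracle returns once the connection is down.
                throw new XAException(XAException.XAER_RMFAIL);
            }
            state = "COMMITTED";
        }
    }

    /** Tries to commit every participant and returns those whose commit must be retried. */
    public static List<Participant> commitAll(List<Participant> participants) {
        List<Participant> pending = new ArrayList<>();
        for (Participant p : participants) {
            try {
                p.commit();
            } catch (XAException e) {
                // Assumption: the commit decision has already been made, so a
                // failed participant is kept for a later retry, never rolled back.
                pending.add(p);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        Participant jdbc = new Participant("jdbc");
        Participant test = new Participant("test");

        jdbc.prepare();                                  // step c
        jdbc.connected = false;                          // step d
        test.prepare();                                  // step e
        List<Participant> pending =
                commitAll(Arrays.asList(jdbc, test));    // steps f and g
        System.out.println("after commit:   jdbc=" + jdbc.state + " test=" + test.state);

        jdbc.connected = true;                           // step h
        pending = commitAll(pending);                    // step i (recovery)
        System.out.println("after recovery: jdbc=" + jdbc.state + " test=" + test.state);
    }
}
```

In this sketch both participants end up committed after recovery, which is the consistent result; the behavior reported here is that the JDBC resource ends up rolled back instead.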

This has to be run with an Oracle database, because the other databases that were tested (e.g. PostgreSQL) return XAException.XAER_RMERR, which causes the transaction to be aborted and the test XA resource to be rolled back, so the subsequent recovery process behaves correctly.

Comment 5 Ondrej Chaloupka 2014-07-01 15:13:27 UTC
Created attachment 913790 [details]
JPAProxyCrashRecoveryTestCase_prepareHalt_jts_server.log.html

Hi Mike,

I've checked this issue and it seems to me that there is a problem somewhere. I'm not sure whether the problem is in the TM or in the JDBC driver, but the fact is that when the scenario is run on Oracle, the result appears to leave the data inconsistent.

During recovery the TM tries to commit the XA resource, but a finish error is still returned [1], and in the end rollback is called.

The same testcase works fine with JTA.

I'm not sure about this issue, but I think it should be part of the release notes.

Ondra
P.S. I'm adding the server log as an attachment, this time in HTML format, as I discovered the vim feature for exporting syntax highlighting to HTML.


[1]
TRACE [com.arjuna.ats.jts] (Periodic Recovery) ExtendedResourceRecord: Successfully stringed to object, next try to narrow
TRACE [com.arjuna.ats.jts] (Periodic Recovery) ExtendedResourceRecord: Failed to narrow to ArjunaSubtranAwareResource
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) BasicAction.doCommit for 0:ffff7f000001:6e82ae10:53b29ee8:3a received TwoPhaseOutcome.FINISH_ERROR from class com.arjuna.ats.internal.jts.resources.ExtendedResourceRecord
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) RecordList::insert(RecordList: empty) : appending /StateManager/AbstractRecord/ExtendedResourceRecord for 0:ffff7f000001:6e82ae10:53b29ee8:42
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) BasicAction::doCommit() result for action-id (0:ffff7f000001:6e82ae10:53b29ee8:3a) on record id: (0:ffff7f000001:6e82ae10:53b29ee8:42) is (TwoPhaseOutcome.FINISH_ERROR) node id: (1)

Comment 8 Ondrej Chaloupka 2014-08-13 12:37:48 UTC
Created attachment 926404 [details]
server.log with -Dno.recovery.scan

Comment 9 Ondrej Chaloupka 2014-09-02 13:24:37 UTC
Created attachment 933761 [details]
server.db2-10.log

I tried this with a DB2 database (DB2 9.7 and 10) and it suffers from this issue as well. The only difference is that the XA exception returned after the connection goes down is XAException.XA_RETRY (Oracle throws XAException.XAER_RMFAIL).
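
For reference, the three error codes reported in this thread can be compared with a small Java sketch. The constants come from `javax.transaction.xa.XAException`; the `Action` enum and the `onCommitFailure` method are hypothetical names introduced only for illustration, and the mapping is a simplified reading of the XA semantics, not the logic Narayana actually uses.

```java
import javax.transaction.xa.XAException;

public class XaCommitErrorSketch {

    /** Hypothetical, simplified reactions to a failed commit call. */
    public enum Action { RETRY_COMMIT_IN_RECOVERY, REPORT_FAILURE }

    /** Maps an XAException error code seen on commit to a simplified action. */
    public static Action onCommitFailure(int errorCode) {
        switch (errorCode) {
            case XAException.XAER_RMFAIL: // resource manager unavailable (Oracle in this report)
            case XAException.XA_RETRY:    // commit may be retried later (DB2 in this report)
                // Assumption: the branch stays prepared, so the commit is repeated in recovery.
                return Action.RETRY_COMMIT_IN_RECOVERY;
            default:
                // Includes XAER_RMERR, which PostgreSQL returned in this report.
                return Action.REPORT_FAILURE;
        }
    }

    public static void main(String[] args) {
        System.out.println("XAER_RMFAIL -> " + onCommitFailure(XAException.XAER_RMFAIL));
        System.out.println("XA_RETRY    -> " + onCommitFailure(XAException.XA_RETRY));
        System.out.println("XAER_RMERR  -> " + onCommitFailure(XAException.XAER_RMERR));
    }
}
```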

Comment 11 Ondrej Chaloupka 2014-09-05 07:28:24 UTC
Created attachment 934693 [details]
server.log for oracle as html page

Hi Gytis,

I wanted to add some more information on this issue, so I annotated the server log with some comments. The comments are introduced by >>>>>.

I hope that helps you understand the flow of our test case.

Thanks 
Ondra

Comment 12 Ondrej Chaloupka 2014-09-05 07:43:28 UTC
Created attachment 934695 [details]
server.log for oracle as html page

Comment 13 Gytis Trikleris 2014-09-05 09:31:15 UTC
Thanks Ondra. Did you commit it to any branch for me to use?

Comment 14 Ondrej Chaloupka 2014-09-05 10:01:17 UTC
Hi Gytis,

if you mean the way to debug it, then that's already available in 6.3.0.
If you mean the enhanced report, then it's just notes added manually to the log and processed by vi.
If you mean something else, then probably not.

:)
Ondra

Comment 15 Gytis Trikleris 2014-09-05 10:04:42 UTC
I mean the enhanced report :) thanks for it, it makes it easier to read the log.

Comment 16 Gytis Trikleris 2014-09-17 14:38:04 UTC
I've merged the fix for this to https://github.com/jbosstm/narayana/tree/4.17. Now we need to wait for JBossTS 4.17.23.Final to be released.

Comment 17 JBoss JIRA Server 2014-09-18 08:15:52 UTC
Mark Little <mark.little> updated the status of jira JBTM-2255 to Reopened

Comment 18 Gytis Trikleris 2014-09-18 08:37:14 UTC
The fix mentioned in comment 16 was reverted, since it isn't the best way to solve the problem, as explained in the JIRA.

Comment 19 tom.jenkinson 2014-09-26 11:10:36 UTC
Re-fixed in upstream

Comment 22 tom.jenkinson 2014-12-05 12:37:39 UTC
Went into 4.17.23

Comment 23 Hayk Hovsepyan 2014-12-05 14:35:16 UTC
Verified on revision EAP 6.4.0.DR12

Comment 24 JBoss JIRA Server 2014-12-11 15:54:01 UTC
Tom Jenkinson <tom.jenkinson> updated the status of jira JBTM-2255 to Closed