Bug 882243
| Summary: | Failover doesn't work properly with XA | | |
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Gordon Sim <gsim> |
| Component: | qpid-java | Assignee: | Weston M. Price <wprice> |
| Status: | CLOSED ERRATA | QA Contact: | Valiantsina Hubeika <vhubeika> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 2.0 | CC: | cdewolf, esammons, gsim, iboverma, jross, lzhaldyb, mcressma, ppecka, tross, vhubeika |
| Target Milestone: | 2.3 | Keywords: | Patch |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | qpid-java-0.18-7 | Doc Type: | Bug Fix |
| Doc Text: |
Cause:
Messages sent under an XA transaction are replayed on failover.
Consequence:
Transaction atomicity is lost.
Fix:
Such messages are no longer replayed.
Result:
Transaction atomicity guarantees are honoured.
|
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-03-06 18:53:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 917988 | | |
Description
Gordon Sim
2012-11-30 13:41:29 UTC
Weston, please assess.

Currently reviewing. This is an area where, at the very least, we need more testing to reproduce the problem consistently. However, I agree with Gordon's assessment: most likely something in the JMS client is not being handled correctly. Note that one blocker on this is that Gordon is on vacation, and he is the 'owner', or at least the expert, on the DTX code.

My environment:

Broker OS:
Linux carthage 3.6.7-4.fc16.x86_64 #1 SMP Tue Nov 20 20:33:31 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Broker Build:
[wmprice@carthage ~]$ qpid-install/sbin/qpidd -v
qpidd (qpidc) version 0.18
built from 0.18-mrg branch in internal git repo on mrg1

Store Build:
[wmprice@carthage qpid-store]$ svn info
Path: .
URL: http://anonsvn.jboss.org/repos/rhmessaging/store/branches/qpid-0.18
Repository Root: http://anonsvn.jboss.org/repos/rhmessaging
Repository UUID: 06e15bec-b515-0410-bef0-cc27a458cf48
Revision: 4530
Node Kind: directory
Schedule: normal
Last Changed Author: mcressman
Last Changed Rev: 4527
Last Changed Date: 2013-01-02 15:15:03 -0500 (Wed, 02 Jan 2013)

Qpid JMS/JCA Build: 0.18-mrg branch from our internal git repository
JEE Server: EAP 5.1

In my setup, I am running two brokers on the same OS instance with different ports. Each broker has its own data directory, and the brokers do not share a store etc. The app server is running on a separate OS (OSX), independent of the broker hosts. I am using the 0.18 version of the JCA adapter, deploying the examples and running within EAP.

Currently, when running in a cluster with XA, I am unable to reproduce this issue. However, this isn't saying much, as no DTX* type information is printed to the logs, which is pretty confusing because within the debugger I can see the XA transaction complete successfully. The client does fail over properly, but the messages sent to the previous node are not replayed. Again, I don't really trust this, as I can't see any XA/DTX information in the logs at all, so I am a bit miffed at this point.
At any rate, I have a repeatable environment that is automated to set up and run this scenario when Gordon returns.

Adjusted the log settings; DTX info now shows up correctly and the issue becomes apparent right away.

Actually, I am only seeing the following type of info in the logs:
2013-01-16 15:18:58 [Broker] debug preparing: {Xid: format=131075; global-id=1--3f57fe9c:f13b:50f70b08:63; branch-id=-3f57fe9c:f13b:50f70b08:65; }
2013-01-16 15:19:04 [Broker] debug committing: {Xid: format=131075; global-id=1--3f57fe9c:f13b:50f70b08:63; branch-id=-3f57fe9c:f13b:50f70b08:65; }
I am not seeing any type of DtxSelect/DtxBegin/DtxEnd etc. I am not sure if something has changed within the Broker logging or if my settings are wrong. I am using:
--log-enable trace+:Dtx --log-enable trace+:Protocol
I have tried various options to no avail.
At any rate, I have also noticed that this issue seems to only occur when multiple XA resources are used within the same XA transaction. I am reviewing this further.
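
For readers less familiar with the scenario above, here is a minimal sketch of what "multiple XA resources within the same XA transaction" means: one global transaction id with a separate branch id per resource, matching the shape of the Xid in the broker log. It uses only the JDK's `javax.transaction.xa` interfaces. `SimpleXid`, `RecordingResource` and `commitAcross` are hypothetical names made up for illustration; this is not the Qpid JCA adapter or the EAP transaction manager, and error handling and recovery are deliberately left out.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;
import java.util.ArrayList;
import java.util.List;

public class TwoResourceXaSketch {

    /** Minimal Xid: one global id shared by all branches, one branch id per resource. */
    static final class SimpleXid implements Xid {
        private final byte[] globalId;
        private final byte[] branchId;

        SimpleXid(byte[] globalId, byte[] branchId) {
            this.globalId = globalId.clone();
            this.branchId = branchId.clone();
        }

        // 131075 is the format id that appears in the broker log lines above.
        @Override public int getFormatId() { return 131075; }
        @Override public byte[] getGlobalTransactionId() { return globalId.clone(); }
        @Override public byte[] getBranchQualifier() { return branchId.clone(); }
    }

    /** Hypothetical in-memory resource that only records the XA calls it receives. */
    static final class RecordingResource implements XAResource {
        final List<String> calls = new ArrayList<>();

        @Override public void start(Xid xid, int flags) { calls.add("start"); }
        @Override public void end(Xid xid, int flags) { calls.add("end"); }
        @Override public int prepare(Xid xid) { calls.add("prepare"); return XA_OK; }
        @Override public void commit(Xid xid, boolean onePhase) { calls.add("commit"); }
        @Override public void rollback(Xid xid) { calls.add("rollback"); }
        @Override public void forget(Xid xid) { }
        @Override public Xid[] recover(int flag) { return new Xid[0]; }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
        @Override public int getTransactionTimeout() { return 0; }
        @Override public boolean setTransactionTimeout(int seconds) { return false; }
    }

    /**
     * Drives every resource through the same global transaction using two-phase commit.
     * Simplified: prepare votes are ignored and nothing is rolled back on error,
     * which a real transaction manager would have to handle.
     */
    static boolean commitAcross(byte[] globalId, XAResource... resources) {
        try {
            Xid[] branches = new Xid[resources.length];
            for (int i = 0; i < resources.length; i++) {
                branches[i] = new SimpleXid(globalId, new byte[] {(byte) i});
                resources[i].start(branches[i], XAResource.TMNOFLAGS);
                // Application work against resource i (send, receive, ...) would go here.
                resources[i].end(branches[i], XAResource.TMSUCCESS);
            }
            for (int i = 0; i < resources.length; i++) {
                resources[i].prepare(branches[i]);        // phase 1
            }
            for (int i = 0; i < resources.length; i++) {
                resources[i].commit(branches[i], false);  // phase 2
            }
            return true;
        } catch (XAException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        RecordingResource first = new RecordingResource();
        RecordingResource second = new RecordingResource();
        System.out.println(commitAcross(new byte[] {1}, first, second));
        System.out.println(first.calls);
        System.out.println(second.calls);
    }
}
```

The "preparing" and "committing" debug lines in the broker log correspond to the two phases in `commitAcross`, as seen from one of the enlisted resources.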
Thanks to Rajith we have a patch. I applied and tested the fix both on trunk and on our internal 0.18 branch. One minor modification was required to build against 0.18, so I am submitting a modified version of Rajith's patch in case we need it. I will simply attach it to the BZ. All tests (unit, system and XA/HA failover with JCA) look good.

Created attachment 682763
Patch for XA/HA failover
Patch for XA/HA failover issue for the 0.18-mrg internal branch.
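
The attached patch itself is not reproduced on this page. As a rough illustration of the problem summarized in the Doc Text (messages sent under an XA transaction being replayed on failover), the toy model below shows how a replay can break atomicity. Everything in it is an assumption made for illustration: the `Broker` class, the `visibleAfterRollback` helper, and in particular the premise that a replayed message reaches the new broker outside the original transaction branch. It does not use or describe the real Qpid client code.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplayOnFailoverSketch {

    /** Hypothetical broker: a message is either pending in a transaction or visible. */
    static final class Broker {
        final List<String> pending = new ArrayList<>();
        final List<String> visible = new ArrayList<>();

        void rollback() { pending.clear(); }
    }

    /** Returns what consumers can see on the second broker after the transaction rolls back. */
    static List<String> visibleAfterRollback(boolean replayXaMessages) {
        Broker first = new Broker();
        Broker second = new Broker();
        List<String> unconfirmed = new ArrayList<>();

        // The client sends "m1" under an XA transaction; the first broker holds it as pending.
        first.pending.add("m1");
        unconfirmed.add("m1");

        // The first broker fails and the client fails over to the second broker.
        if (replayXaMessages) {
            // Assumed model: the replay happens outside the original transaction
            // branch, so the second broker makes the message visible immediately.
            second.visible.addAll(unconfirmed);
        }
        unconfirmed.clear();

        // The transaction manager then rolls the transaction back everywhere.
        first.rollback();
        second.rollback();
        return second.visible;
    }

    public static void main(String[] args) {
        System.out.println("with replay:    " + visibleAfterRollback(true));
        System.out.println("without replay: " + visibleAfterRollback(false));
    }
}
```

In this model the rolled-back message stays visible only when XA messages are replayed, which is the loss of atomicity the Doc Text describes.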
VERIFIED

qpid-java-client-0.18-6.el6.noarch
qpid-java-common-0.18-6.el6.noarch
qpid-java-example-0.18-6.el6.noarch
qpid-jca-0.18-7.el6.noarch
qpid-jca-xarecovery-0.18-7.el6.noarch
qpid-jca-zip-0.18-7.el6.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0561.html