Bug 519476 - Invalid accept data sent by Java client after failover.
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-java
Version: 1.1.6
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: 1.3
Target Release: ---
Assigned To: Rajith Attapattu
QA Contact: Jiri Kolar
Depends On:
Blocks:
Reported: 2009-08-26 14:36 EDT by Alan Conway
Modified: 2010-10-14 12:01 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, the Java client sent invalid accept data after a failover. This was caused by a race condition where data from an old disconnected connection was incorrectly sent to a new failed-over connection. With this update, the Java client no longer sends invalid data after a failover.
Last Closed: 2010-10-14 12:01:19 EDT


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2010:0773 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3 | 2010-10-14 11:56:44 EDT

Description Alan Conway 2009-08-26 14:36:13 EDT
A program intended to reproduce bug 516501 turned up a new bug, possibly a client-side bug in Java failover. There appears to be a race condition in which ack data from an old, disconnected connection is incorrectly sent on the new, failed-over connection. The symptom is an error of the form "connfirmed N but sent 0".
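
The race described above can be illustrated with a minimal, self-contained sketch. It is a simplified model, not the qpid-java code and not the fix that shipped: the class `FailoverSessionSketch`, its methods, and the idea of tagging each accept with a connection "generation" are all assumptions made for illustration, and the exception text only loosely follows the symptom quoted above.

```java
/** Hypothetical model of a client session; NOT the qpid-java Session class. */
final class FailoverSessionSketch {

    /** An accept (ack) tagged with the connection generation it was built on. */
    static final class Accept {
        final long generation; // which connection produced this accept
        final int upTo;        // number of commands being confirmed

        Accept(long generation, int upTo) {
            this.generation = generation;
            this.upTo = upTo;
        }
    }

    private long generation = 0;  // bumped on every failover
    private int commandsSent = 0; // commands sent on the current connection

    /** Record one command sent on the current connection. */
    synchronized void send() {
        commandsSent++;
    }

    /** Build an accept covering everything sent so far on this connection. */
    synchronized Accept prepareAccept() {
        return new Accept(generation, commandsSent);
    }

    /** Simulate failover: a new connection starts with fresh counters. */
    synchronized void failover() {
        generation++;
        commandsSent = 0;
    }

    /** Unguarded delivery: a stale accept trips the consistency check. */
    synchronized void deliverUnguarded(Accept a) {
        if (a.upTo > commandsSent) {
            throw new IllegalStateException(
                "confirmed " + a.upTo + " but sent " + commandsSent);
        }
    }

    /**
     * Guarded delivery (assumed mitigation): accepts built on an older
     * connection are dropped instead of being sent on the new one.
     * Returns true if the accept was applied.
     */
    synchronized boolean deliverGuarded(Accept a) {
        if (a.generation != generation) {
            return false; // stale: belongs to a disconnected connection
        }
        deliverUnguarded(a);
        return true;
    }
}
```

In this model, an accept prepared after three sends and delivered after `failover()` fails the unguarded check because the new connection has sent nothing yet, while `deliverGuarded` simply discards it. Whether the real client needed a guard of this kind, or the behavior changed on the broker side, is not established in this report.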

The reproducer code is at https://bugzilla.redhat.com/attachment.cgi?id=357364. Here is the description from bug 516501:

Comment #5 From Rajith Attapattu (rattapat@redhat.com) 2009-08-13 15:28:17 EDT

Created an attachment (id=357364)
Reproducer

The attachment contains a JMS-based reproducer.
Just untar the package and run the scramble_brokers.sh script.

It starts a JMS producer and a JMS consumer that uses ** sync_ack ** in
the background, and then rapidly changes the membership of the 4-node
cluster to force failover.

I tried a 2-node cluster to keep things simple, but the probability of the
error occurring was fairly low. In that case it was also hitting a known
issue in the JMS client's FailoverExchangeMethod.

The script runs the Java clients with the log level set to WARN. You can
easily change that in the script to DEBUG, etc.
You could also get the brokers to log to a file.

Feel free to modify the tests as you see fit.
Please ping me if you make any improvements to the test script so that I can
incorporate those changes into my nightly runs.
Comment 2 Rajith Attapattu 2009-12-14 17:19:36 EST
I am currently unable to reproduce this issue with the latest package set.
I even tried with a broker prior to r794736.
I have done a fair amount of testing and have yet to see this issue.
Comment 3 Jiri Kolar 2010-03-16 06:47:41 EDT
Any progress? Is there any known reproducer?
Comment 4 Rajith Attapattu 2010-03-17 12:19:47 EDT
Not that I know of.
This issue seems to be fixed, but sadly there is no way of verifying it.
Comment 5 Jiri Kolar 2010-03-24 09:26:30 EDT
Tested:
The bug does not appear on -2, nor on 1.2. We (Rajith and I) were no longer able to reproduce it. It was probably fixed on the broker side, but nobody knows when.

Discussed with Rajith and Alan, and both proposed marking it as verified.

Validated on packages:

# rpm -qa | grep -E '(qpid|openais|rhm)' | sort -u

openais-0.80.6-16.el5
openais-debuginfo-0.80.6-16.el5
python-qpid-0.7.917557-4.el5
qpid-cpp-client-0.7.916826-2.el5
qpid-cpp-client-devel-0.7.916826-2.el5
qpid-cpp-client-rdma-0.7.916826-2.el5
qpid-cpp-client-ssl-0.7.916826-2.el5
qpid-cpp-mrg-debuginfo-0.7.916826-2.el5
qpid-cpp-server-0.7.916826-2.el5
qpid-cpp-server-cluster-0.7.916826-2.el5
qpid-cpp-server-devel-0.7.916826-2.el5
qpid-cpp-server-rdma-0.7.916826-2.el5
qpid-cpp-server-ssl-0.7.916826-2.el5
qpid-cpp-server-store-0.7.916826-2.el5
qpid-cpp-server-xml-0.7.916826-2.el5
qpid-dotnet-0.4.738274-2.el5
qpid-java-client-0.7.918215-1.el5
qpid-java-common-0.7.918215-1.el5
qpid-tools-0.7.917557-4.el5


->VERIFIED
Comment 6 Jiri Kolar 2010-04-09 09:41:11 EDT
Tested on RHEL 5.5 i386 / x86_64 and RHEL 4.8 i386 / x86_64.
Comment 7 Martin Prpič 2010-10-10 05:34:17 EDT
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, the Java client sent invalid accept data after a failover. This was caused by a race condition where data from an old disconnected connection was incorrectly sent to a new failed-over connection. With this update, the Java client no longer sends invalid data after a failover.
Comment 9 errata-xmlrpc 2010-10-14 12:01:19 EDT
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html
