Bug 597362 - Sporadic failure of check-long in cluster_tests.py test_failover
Summary: Sporadic failure of check-long in cluster_tests.py test_failover
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1.9
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: 1.3
: ---
Assignee: Alan Conway
QA Contact: Lubos Trilety
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-05-28 18:02 UTC by Alan Conway
Modified: 2010-10-14 16:04 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
C: Bug in code handling cluster update. C: Broker joining a cluster at the same time that the cluster is handling an error failed with "update connection closed prematurely" F: Fixed the update code. R: Brokers can join a cluster successfully even when the cluster is handling an error.
Clone Of:
Environment:
Last Closed: 2010-10-14 16:04:22 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2010:0773 | 0 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3 | 2010-10-14 15:56:44 UTC

Description Alan Conway 2010-05-28 18:02:24 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible: easy

Steps to Reproduce:
1. while make check-long LONG_TESTS=run_long_cluster_tests; do true; done
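
The shell loop above gives no indication of how many passes succeeded before the failure. A minimal Python sketch of the same idea follows; the helper name `run_until_failure` and the `max_runs` parameter are illustrative assumptions, not part of the qpid test suite.

```python
import subprocess


def run_until_failure(cmd, max_runs=None):
    """Run cmd repeatedly until it exits with a nonzero status.

    Returns (runs, result): the number of runs performed and the
    CompletedProcess of the failing run, or None if max_runs passes
    completed without a failure.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        # Capture output so the failing run's log can be inspected afterwards.
        result = subprocess.run(cmd, capture_output=True, text=True)
        runs += 1
        if result.returncode != 0:
            return runs, result
    return runs, None
```

Calling `run_until_failure(["make", "check-long", "LONG_TESTS=run_long_cluster_tests"])` from the test directory would mirror the shell loop, with the returned count showing how many iterations it took to hit the sporadic failure.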
  
Actual results:
The loop eventually halts with an error.

Expected results:
No error

Additional info:

Based on the logs, it appears that a new broker was joining the cluster when an error caused the updater to retract its update. The new broker then exits with the error "update connection closed prematurely".
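
To confirm that a given failure matches this report, the broker logs can be searched for the error text quoted above. The following is a rough sketch only: the log directory, the `*.log` file pattern, and the function name `find_update_failures` are assumptions for illustration, since the report does not say where the test writes its logs.

```python
from pathlib import Path

# Error text quoted in this bug report.
ERROR_TEXT = "update connection closed prematurely"


def find_update_failures(log_dir, pattern="*.log"):
    """Return (file name, line number, line) for each log line containing ERROR_TEXT.

    log_dir and pattern are hypothetical; point them at wherever the
    test run actually keeps its broker logs.
    """
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        with path.open(errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                if ERROR_TEXT in line:
                    hits.append((path.name, lineno, line.rstrip("\n")))
    return hits
```

An empty result suggests the failure has a different cause and may belong in a separate report.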

Comment 1 Alan Conway 2010-05-31 15:34:14 UTC
Fixed on trunk r949767 and ported to mrg_1.3.x branch:

http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=ad3a7e4927edc01a262a0ee3903f55592b41d337

Comment 3 Lubos Trilety 2010-10-05 07:02:29 UTC
Reproduction scenario:
Run the test in an indefinite loop on both RHEL5 machines (i386 and x86_64).

Tested with (version):
qpid-cpp-client-devel-docs-0.7.946106-17.el5
qpid-cpp-mrg-debuginfo-0.7.946106-17.el5
qpid-cpp-server-0.7.946106-17.el5
rh-tests-distribution-MRG-Messaging-qpid_common-1.6-55
qpid-cpp-server-cluster-0.7.946106-17.el5
qpid-cpp-server-store-0.7.946106-17.el5
qpid-java-client-0.7.946106-10.el5
python-qpid-0.7.946106-14.el5
qpid-tools-0.7.946106-11.el5
qpid-cpp-server-xml-0.7.946106-17.el5
qpid-cpp-client-0.7.946106-17.el5
qpid-cpp-client-ssl-0.7.946106-17.el5
qpid-cpp-server-ssl-0.7.946106-17.el5
qpid-cpp-server-devel-0.7.946106-17.el5
qpid-cpp-client-devel-0.7.946106-17.el5
qpid-java-common-0.7.946106-10.el5

Tested on:
RHEL5 i386,x86_64  - passed

>>> VERIFIED

Comment 4 Alan Conway 2010-10-12 19:22:44 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
C: Bug in code handling cluster update. 
C: Broker joining a cluster at the same time that the cluster is handling an error failed with "update connection closed prematurely"
F: Fixed the update code.
R: Brokers can join a cluster successfully even when the cluster is handling an error.

Comment 6 errata-xmlrpc 2010-10-14 16:04:22 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html

