Bug 655078

Summary: Modified cluster_tests causes broker shut down with invalid-argument error.
Product: Red Hat Enterprise MRG
Reporter: Alan Conway <aconway>
Component: qpid-cpp
Assignee: Alan Conway <aconway>
Status: CLOSED CURRENTRELEASE
QA Contact: ppecka <ppecka>
Severity: high
Docs Contact:
Priority: high
Version: 1.3
CC: freznice, gsim, iboverma, tross
Target Milestone: 1.3.2-RC1   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qpid-cpp-mrg-0.7.946106-26
Doc Type: Bug Fix
Bug Blocks: 654872    
Attachments:
Patch to cluster_tests.py to reproduce the problem. (Flags: none)

Description Alan Conway 2010-11-19 14:56:38 UTC
Description of problem:

With cluster_tests.py modified (patch attached), the test below fails consistently with an invalid-argument error. The patch makes the test perform more broker kills and restarts.

This might be the same issue as bug 654872.


Version-Release number of selected component (if applicable): trunk r1036871

How reproducible: easy

Steps to Reproduce:

make check TESTS=run_cluster_tests CLUSTER_TESTS="*test_management -DDURATION=4"

Actual results: broker exits with invalid-argument error

Expected results: no error


Additional info:

Comment 1 Alan Conway 2010-11-19 14:57:20 UTC
Created attachment 461562
Patch to cluster_tests.py to reproduce the problem.

Comment 2 Alan Conway 2010-12-01 21:38:30 UTC
Fixed on trunk by the following 3 revisions:

------------------------------------------------------------------------
r1041181 | aconway | 2010-12-01 16:33:12 -0500 (Wed, 01 Dec 2010) | 8 lines

Modified cluster_tests causes broker shut down with invalid-argument error.

Described in https://bugzilla.redhat.com/show_bug.cgi?id=655078.  The
management agent's deleted-object list was not being replicated to new
members joining the cluster, so management generated fewer deleted
object notifications on the newer member, causing it to fail with an
invalid-argument error. The list is now being replicated correctly.

------------------------------------------------------------------------
r1041180 | aconway | 2010-12-01 16:32:52 -0500 (Wed, 01 Dec 2010) | 6 lines

Add missing call to Message::setTimestamp in ManagementAgent::sendBufferLH.

Without this, messages generated here will not be expired consistently
in a cluster, which may cause a broker to become inconsistent and exit
with an invalid-argument error.

------------------------------------------------------------------------
r1041179 | aconway | 2010-12-01 16:32:43 -0500 (Wed, 01 Dec 2010) | 7 lines

Enable cluster-safe assertions on transition to CATCHUP

Delaying until READY was causing multiple clientConnect management
events to be raised, because broker::Connection::setUserId relies on
sys::isCluster to avoid producing duplicate events with
cluster::Connection::announce.

------------------------------------------------------------------------
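
The r1041181 log entry above attributes the failure to a new member joining without the management agent's deleted-object list, so it emits fewer delete notifications than the existing members. The following is only a toy sketch of that divergence. ToyAgent, join_cluster and run are hypothetical names invented for illustration and are not part of qpid's ManagementAgent API.

```python
# Toy model only: these names are hypothetical, not qpid's ManagementAgent API.

class ToyAgent:
    """Tracks deleted objects that still await a delete notification."""

    def __init__(self, pending_deletes=None):
        # Copy so that members never share one list object.
        self.pending_deletes = list(pending_deletes or [])

    def delete_object(self, name):
        self.pending_deletes.append(name)

    def periodic_processing(self):
        """Emit one notification per pending deleted object, then clear the list."""
        notifications = [f"deleted:{name}" for name in self.pending_deletes]
        self.pending_deletes.clear()
        return notifications


def join_cluster(existing, replicate_deleted):
    """Create a new member, optionally copying the deleted-object list."""
    snapshot = existing.pending_deletes if replicate_deleted else []
    return ToyAgent(snapshot)


def run(replicate_deleted):
    old = ToyAgent()
    old.delete_object("queue-a")       # deleted before the new member joins
    new = join_cluster(old, replicate_deleted)
    for agent in (old, new):           # later deletions reach every member
        agent.delete_object("queue-b")
    return old.periodic_processing(), new.periodic_processing()


old_out, new_out = run(replicate_deleted=False)
print(len(old_out), len(new_out))      # 2 1: the members disagree

old_out, new_out = run(replicate_deleted=True)
print(old_out == new_out)              # True: the members agree
```

In this sketch the members only agree when the list is copied at join time, which is the behaviour the log entry says the fix restores.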

Comment 3 Alan Conway 2010-12-08 20:06:08 UTC
Complete fix to this also requires:

------------------------------------------------------------------------
r1043621 | aconway | 2010-12-08 14:21:05 -0500 (Wed, 08 Dec 2010) | 9 lines

Defer update of management agent to end of update process.

Move updating of the management agent to the very end of the update
process, after all objects used by the update process itself have been
deleted. Before the fix, deletions from the update process itself
(deleting the qpid.cluster-update queue and its binding to the default
exchange) were sporadically appearing as extra delete messages on the
updatee's management agent and causing inconsistency.

------------------------------------------------------------------------
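
The ordering problem described in r1041621's neighbour above (r1043621) can be illustrated with a small toy model. It assumes, as a simplification, that updating the management agent overwrites the updatee's deleted-object list with the updater's, and that deleting the update's own objects is recorded locally on the updatee. The function and variable names are hypothetical and do not come from the qpid cluster code. The real failure was sporadic, whereas this sketch is deterministic.

```python
# Toy model only: hypothetical names, not qpid's cluster update code.

UPDATE_TEMP_OBJECTS = ["qpid.cluster-update queue", "qpid.cluster-update binding"]


def run_update(updater_deleted, agent_update_last):
    """Return the updatee's deleted-object list after a simulated update."""
    updatee_deleted = []

    def update_management_agent():
        # Overwrite the updatee's list with the updater's replicated state.
        updatee_deleted[:] = updater_deleted

    def delete_temp_objects():
        # Deleting the update's own objects is recorded locally on the updatee.
        updatee_deleted.extend(UPDATE_TEMP_OBJECTS)

    steps = [delete_temp_objects, update_management_agent]
    if not agent_update_last:
        steps.reverse()
    for step in steps:
        step()
    return updatee_deleted


updater_state = ["queue-a"]

# Agent updated first: the temporary-object deletions show up as extras.
print(run_update(updater_state, agent_update_last=False))

# Agent updated last: the updatee ends up matching the updater.
print(run_update(updater_state, agent_update_last=True) == updater_state)  # True
```

Running the agent update as the final step means nothing the update process deletes afterwards can leak into the updatee's list.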

Comment 4 Alan Conway 2010-12-08 20:09:37 UTC
And:
------------------------------------------------------------------------
r1041582 | kgiusti | 2010-12-02 16:03:42 -0500 (Thu, 02 Dec 2010) | 2 lines

bugfix in deleted obj import/export api

------------------------------------------------------------------------

Comment 6 ppecka 2011-02-01 10:53:00 UTC
VERIFIED RHEL 5.6 i386 / x86_64:

Packages used:
qpid-cpp-mrg-0.7.946106-27.el5.src.rpm
qpid-tools-0.7.946106-12.el5.src.rpm
openais-0.80.6-28.el5

--> VERIFIED