Bug 501015

Summary:	Management and cluster do not work together.
Product:	Red Hat Enterprise MRG	Reporter:	Alan Conway <aconway>
Component:	qpid-cpp	Assignee:	Alan Conway <aconway>
Status:	CLOSED ERRATA	QA Contact:	Jan Sarenik <jsarenik>
Severity:	urgent	Docs Contact:
Priority:	urgent
Version:	1.1.1	CC:	cctrieloff, freznice, gsim, jsarenik
Target Milestone:	1.3
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	The management component is now capable of working in a cluster.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2010-10-14 15:58:45 UTC	Type:	---
Bug Depends On:	499872, 500174, 557138, 557832
Bug Blocks:	494399

Comment 2 Alan Conway 2010-01-20 13:47:17 UTC

We have a hack in place that suppresses exceptions when the session receives completions for transfers not yet sent (the usual manifestation of the unpredictability). In essence, we have disabled consistency checking for management sessions. This solved the immediate problems, but it would quickly stop working if sessions or connections could be used for management and other things, which becomes more likely with QMFv2, where using management is quite straightforward.

Comment 3 Alan Conway 2010-01-20 14:49:50 UTC

The problem with management updates in a timer is independent of the object ID problem. Created a separate bug for it: Bug 557138.

Comment 4 jrd 2010-01-22 19:21:48 UTC

I put a proposed patch, which fixes both this bug and Bug 557832, in the latter bz.

Comment 5 jrd 2010-03-08 16:47:53 UTC

Alan, I believe this one got resolved, correct?  Please bounce it back to me if not.  Thanks...

Comment 6 Alan Conway 2010-03-08 19:51:56 UTC

We still don't have consistent object IDs in QMFv1 so we still have to disable some management commands as per the description. With QMFv2 we should be OK, but we need to test it.

Comment 7 Jan Sarenik 2010-03-31 11:14:11 UTC

How can I test this, gentlemen?

Comment 8 Alan Conway 2010-03-31 12:34:02 UTC

Run a 4-node cluster with --mgmt-sub-interval=1 to get frequent management updates. Run perftest, qpid-config -b, qpid-queue-stats and sesame in loops. Kill and restart one of the brokers a few times while all of this is running. Let the clients run for an hour and verify that there are no failures.

Comment 9 Gordon Sim 2010-03-31 12:44:47 UTC

Stopping and restarting one or more of the nodes while the test described in comment 8 is running is also useful (tests the join/update protocol).

Comment 10 Jan Sarenik 2010-04-19 14:09:35 UTC

The option in comment 8 should be --mgmt-pub-interval=1, not --mgmt-sub-interval=1.

Comment 13 Jan Sarenik 2010-06-04 12:47:09 UTC

qpid-queue-stats has been running since I started it, and a number of
other AMQP and QMF clients were run on this four-node cluster consisting
of 2 RHEL5 i386 nodes and 2 RHEL5 x86_64 nodes. Every 5 minutes
the broker on a randomly chosen node is restarted via "service qpidd restart".
qpid-queue-stats has run fine the whole time. I am setting this
bug to VERIFIED state.

qpid-cpp-server-cluster-0.7.946106-2.el5
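
The verification loop described above (keep a monitor running, restart the broker on a random node at a fixed interval, and check that the monitor survives) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the script that was actually used: the host names are hypothetical, reaching each node over ssh is an assumption, and the monitor is started without connection arguments because the report does not give them.

```python
import random
import subprocess
import time

# Hypothetical host names; the report does not name the four cluster nodes.
NODES = ["node1", "node2", "node3", "node4"]

# Long-running monitor mentioned in the report. Connection arguments are
# omitted because the report does not give them.
MONITOR_CMD = ["qpid-queue-stats"]


def restart_command(node):
    # Reaching the node over ssh is an assumption; the report only says the
    # broker is restarted via "service qpidd restart".
    return ["ssh", node, "service", "qpidd", "restart"]


def soak(rounds, interval_s, popen=subprocess.Popen, run=subprocess.call,
         sleep=time.sleep, rng=random):
    """Keep the monitor running while restarting one random broker per round.

    Returns True if the monitor was still alive after every restart.
    """
    monitor = popen(MONITOR_CMD)
    try:
        for _ in range(rounds):
            sleep(interval_s)
            run(restart_command(rng.choice(NODES)))
            if monitor.poll() is not None:  # monitor exited: treat as failure
                return False
        return True
    finally:
        # Stop the monitor if it is still running when the loop ends.
        if monitor.poll() is None:
            monitor.terminate()
```

Under these assumptions, calling soak(rounds=12, interval_s=300) would restart one broker every 5 minutes for an hour. The popen, run and sleep parameters exist only so the loop can be exercised without a real cluster.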

Comment 14 Jaromir Hradilek 2010-10-07 15:53:41 UTC

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The management component is now capable of working in a cluster.

Comment 16 errata-xmlrpc 2010-10-14 15:58:45 UTC

An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html