Bug 501015 - Management and cluster do not work together.
Summary: Management and cluster do not work together.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1.1
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: 1.3
Target Release: ---
Assignee: Alan Conway
QA Contact: Jan Sarenik
URL:
Whiteboard:
Depends On: 499872 500174 557138 557832
Blocks: 494399
Reported: 2009-05-15 14:17 UTC by Alan Conway
Modified: 2010-10-14 15:58 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The management component is now capable of working in a cluster.
Clone Of:
Environment:
Last Closed: 2010-10-14 15:58:45 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2010:0773 | 0 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3 | 2010-10-14 15:56:44 UTC

Comment 2 Alan Conway 2010-01-20 13:47:17 UTC
We have a hack in place that suppresses exceptions when the session receives completions for transfers not yet sent (which is the usual manifestation of the unpredictability). I.e. we have in essence disabled consistency checking for management sessions. This solved immediate problems but would quickly stop working if sessions/connections could be used for management and other things (as will be more likely with QMFv2 where using management becomes quite straightforward).
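
The consistency check and its suppression can be pictured with a toy model. This is only an illustrative sketch in Python, not the qpid-cpp implementation: the `Session` class, its `is_management` flag, and the method names are all hypothetical.

```python
class Session:
    """Toy model of a session that tracks which transfers it has sent."""

    def __init__(self, is_management=False):
        # Hypothetical flag: True for sessions used for management traffic.
        self.is_management = is_management
        self.sent = set()        # IDs of transfers sent so far
        self.completed = set()   # IDs the peer has confirmed as complete

    def send(self, transfer_id):
        """Record that a transfer has been sent."""
        self.sent.add(transfer_id)

    def on_completion(self, transfer_id):
        """Handle a completion; return True if it was accepted."""
        if transfer_id not in self.sent:
            if self.is_management:
                # The workaround: ignore the inconsistency instead of raising.
                return False
            raise RuntimeError(
                f"completion for transfer {transfer_id} that was never sent")
        self.completed.add(transfer_id)
        return True
```

In this model the check is switched off per session, so a session that carried both management and ordinary traffic would lose the check for all of its transfers. That is the situation the comment expects to become more likely with QMFv2.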

Comment 3 Alan Conway 2010-01-20 14:49:50 UTC
The problem with management updates in a timer is independent of the object ID problem. Created a separate Bug 557138.

Comment 4 jrd 2010-01-22 19:21:48 UTC
I put a proposed patch which fixes this and Bug 557832 in the latter bz.

Comment 5 jrd 2010-03-08 16:47:53 UTC
Alan, I believe this one got resolved, correct?  Please bounce it back to me if not.  Thanks...

Comment 6 Alan Conway 2010-03-08 19:51:56 UTC
We still don't have consistent object IDs in QMFv1 so we still have to disable some management commands as per the description. With QMFv2 we should be OK, but we need to test it.

Comment 7 Jan Sarenik 2010-03-31 11:14:11 UTC
How can I test this, gentlemen?

Comment 8 Alan Conway 2010-03-31 12:34:02 UTC
Run a 4 node cluster with --mgmt-sub-interval=1 to get frequent management updates. Run perftest, qpid-config -b, qpid-queue-stats and sesame in loops. Kill & restart one of the brokers a few times while this is all running. Let the clients run for an hour & verify no failures.
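
The "run in loops and verify no failures" part of this procedure could be driven by a small script. The following is a minimal sketch in Python and was not part of the original test setup. The helper name `run_in_loop` and its parameters are assumptions, and treating a non-zero exit status as a failure is a simplification.

```python
import subprocess
import time


def run_in_loop(cmd, duration_s, pause_s=1.0):
    """Run cmd repeatedly until duration_s seconds have elapsed.

    Returns (runs, failures), where a failure is a non-zero exit status.
    The command always runs at least once.
    """
    deadline = time.monotonic() + duration_s
    runs = failures = 0
    while True:
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        runs += 1
        if result.returncode != 0:
            failures += 1
        if time.monotonic() >= deadline:
            break
        time.sleep(pause_s)
    return runs, failures
```

For example, `run_in_loop(["qpid-config", "-b"], duration_s=3600)` would loop that one client for an hour. The other clients would need their own invocations, which depend on the environment.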

Comment 9 Gordon Sim 2010-03-31 12:44:47 UTC
Stopping and restarting one or more of the nodes while the test described in comment 8 is running is also useful (tests the join/update protocol).

Comment 10 Jan Sarenik 2010-04-19 14:09:35 UTC
The option in comment 8 should be --mgmt-pub-interval=1.

Comment 13 Jan Sarenik 2010-06-04 12:47:09 UTC
qpid-queue-stats has been running since I started it, and a number of
other AMQP and QMF clients were run on this four-node cluster consisting
of 2 RHEL5 i386 nodes and 2 RHEL5 x86_64 nodes. Every 5 minutes
the broker on a randomly chosen node is restarted via "service qpidd restart".
qpid-queue-stats is still running fine. I am setting this
bug to VERIFIED state.

qpid-cpp-server-cluster-0.7.946106-2.el5
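
The periodic restart used in this verification could be scripted along the following lines. This is a hedged sketch in Python, not the script that was actually used. Reaching each node over `ssh` is an assumption, as are the function names and the node names used when calling them. Only the `service qpidd restart` command and the five-minute interval come from the comment.

```python
import random
import subprocess
import time


def restart_command(node):
    """Build the command that restarts the broker on one node.

    Using ssh to reach the node is an assumption; only
    "service qpidd restart" comes from the bug report.
    """
    return ["ssh", node, "service", "qpidd", "restart"]


def restart_random_broker(nodes, run=subprocess.call, rng=random):
    """Restart the broker on one randomly chosen node.

    Returns (node, exit_status).
    """
    node = rng.choice(nodes)
    status = run(restart_command(node))
    return node, status


def restart_loop(nodes, interval_s=300, rounds=12,
                 run=subprocess.call, sleep=time.sleep):
    """Every interval_s seconds restart one random broker, rounds times.

    Returns the list of (node, exit_status) pairs, oldest first.
    """
    history = []
    for _ in range(rounds):
        sleep(interval_s)
        history.append(restart_random_broker(nodes, run=run))
    return history
```

With the defaults, `restart_loop(["node1", "node2", "node3", "node4"])` would restart one of four hypothetical nodes every five minutes for an hour. The `run` and `sleep` parameters exist so that the loop can be exercised without touching real brokers.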

Comment 14 Jaromir Hradilek 2010-10-07 15:53:41 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The management component is now capable of working in a cluster.

Comment 16 errata-xmlrpc 2010-10-14 15:58:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html

