Description of problem:
Qpidd checks for cluster quorum only when sending to a client. It should register with cman for notification and shut down immediately on loss of quorum. With the current behaviour an idle qpidd could fail to notice a short-lived loss of quorum, which could put it in an invalid state due to missed activity.
How reproducible: easy
Steps to Reproduce:
1. Configure a cluster, start cman and qpidd --cluster-cman
2. Stop cluster nodes until the cluster is inquorate.
3. Start nodes until the cluster is quorate again.
Actual results: qpidd fails to notice the loss of quorum.
Expected results: qpidd shuts down with a "lost quorum" message.
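The difference between the actual and expected behaviour can be illustrated with a small simulation. This is only a sketch, not qpidd's code: `FakeCman`, `PollingBroker` and `NotifiedBroker` are hypothetical names invented for the example, and a plain Python callback stands in for whatever notification mechanism cman provides.

```python
class FakeCman:
    """Hypothetical stand-in for cman: tracks quorum and notifies listeners."""

    def __init__(self):
        self.quorate = True
        self._listeners = []

    def register(self, callback):
        # Listeners are called on every quorum change.
        self._listeners.append(callback)

    def set_quorate(self, quorate):
        self.quorate = quorate
        for callback in self._listeners:
            callback(quorate)


class PollingBroker:
    """Checks quorum only when it sends to a client (reported behaviour)."""

    def __init__(self, cman):
        self.cman = cman
        self.running = True

    def send_to_client(self):
        if not self.cman.quorate:
            self.running = False  # "lost quorum" shutdown


class NotifiedBroker:
    """Registers for quorum changes and shuts down at once (expected behaviour)."""

    def __init__(self, cman):
        self.running = True
        cman.register(self._on_quorum_change)

    def _on_quorum_change(self, quorate):
        if not quorate:
            self.running = False  # "lost quorum" shutdown


def simulate_short_quorum_loss():
    cman = FakeCman()
    polling = PollingBroker(cman)
    notified = NotifiedBroker(cman)
    # Both brokers are idle while quorum is lost and then regained.
    cman.set_quorate(False)
    cman.set_quorate(True)
    # The polling broker only looks at quorum now, after it has recovered.
    polling.send_to_client()
    return polling.running, notified.running
```

In this sketch the polling broker is still running after the short quorum loss, while the notified broker has shut down, which matches the actual and expected results described above.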
*** Bug 510880 has been marked as a duplicate of this bug. ***
Fixed in revision 801740
*** Bug 471290 has been marked as a duplicate of this bug. ***
Note: to stop a cluster node use: sudo cman_tool close force
Typo in previous comment, should be: sudo cman_tool leave force
Back ported for 1.1.7 release: http://git.et.redhat.com/git/qpid.git/?p=qpid.git;a=commitdiff;h=91be684610ffa5361f405bb97b66a5bf58130aa0
Note that this also affects performance. Prior to this fix turning on cman degraded performance significantly (e.g. latencytest). With this fix, enabling cman support has no effect on performance.
Apologies, this is taking longer than I expected. I have spent the last two days setting up a cman cluster and reading the docs. Today I verified one architecture; I just need to reproduce it tomorrow and validate on the other architecture.
Verified on qpidd-cluster-0.5.752581-28.el5 i386 and x86_64
Reproduced on qpidd-cluster-0.5.752581-26.el5
---------------------------------------------------------------------
root@mrg-qe-10:~# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster name="jasanclust" config_version="6">
  <clusternodes>
    <clusternode name="mrg-qe-09.lab.eng.brq.redhat.com" votes="1" nodeid="1">
      <fence><method name="single"><device name="manual" ipaddr="10.34.33.62"/></method></fence>
    </clusternode>
    <clusternode name="mrg-qe-10.lab.eng.brq.redhat.com" votes="1" nodeid="2">
      <fence><method name="single"><device name="manual" ipaddr="10.34.33.63"/></method></fence>
    </clusternode>
    <clusternode name="mrg-qe-11.lab.eng.brq.redhat.com" votes="1" nodeid="3">
      <fence><method name="single"><device name="manual" ipaddr="10.34.33.64"/></method></fence>
    </clusternode>
    <clusternode name="mrg-qe-12.lab.eng.brq.redhat.com" votes="1" nodeid="4">
      <fence><method name="single"><device name="manual" ipaddr="10.34.33.65"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices><fencedevice name="manual" agent="manual"/></fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
root@mrg-qe-10:~# cat /etc/sysconfig/cman
FENCED_START_TIMEOUT=1
FENCED_MEMBER_DELAY=1
FENCE_JOIN="no"
root@mrg-qe-10:~# cat /etc/ais/openais.conf
totem {
    version: 2
    secauth: off
    threads: 0
    rrp_mode: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.34.33.0
        mcastaddr: 226.94.11.1
        mcastport: 5405
    }
}
logging {
    debug: off
    timestamp: on
}
amf {
    mode: disabled
}
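For anyone repeating the verification, the number of nodes that must be stopped to make this cluster inquorate can be worked out from the vote counts. The sketch below assumes the usual majority rule (quorum is more than half of the total votes); that rule is an assumption here, not something stated in this report. The embedded configuration is a trimmed copy of the cluster.conf above with the fence elements removed, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf from the comment above (fence elements omitted).
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster name="jasanclust" config_version="6">
  <clusternodes>
    <clusternode name="mrg-qe-09.lab.eng.brq.redhat.com" votes="1" nodeid="1"/>
    <clusternode name="mrg-qe-10.lab.eng.brq.redhat.com" votes="1" nodeid="2"/>
    <clusternode name="mrg-qe-11.lab.eng.brq.redhat.com" votes="1" nodeid="3"/>
    <clusternode name="mrg-qe-12.lab.eng.brq.redhat.com" votes="1" nodeid="4"/>
  </clusternodes>
</cluster>
"""


def total_votes(conf_text):
    """Sum the votes attribute over all clusternode elements."""
    root = ET.fromstring(conf_text)
    return sum(int(node.get("votes", "1")) for node in root.iter("clusternode"))


def quorum_threshold(votes):
    """Assumed majority rule: strictly more than half of the total votes."""
    return votes // 2 + 1


def nodes_to_stop(votes, votes_per_node=1):
    """Smallest number of equal-vote nodes to stop so the cluster is inquorate."""
    needed = quorum_threshold(votes)
    stopped = 0
    remaining = votes
    while remaining >= needed:
        remaining -= votes_per_node
        stopped += 1
    return stopped


if __name__ == "__main__":
    votes = total_votes(CLUSTER_CONF)
    print(votes, quorum_threshold(votes), nodes_to_stop(votes))
```

Under that assumption, the four single-vote nodes give a quorum threshold of three votes, so stopping two nodes (for example with `sudo cman_tool leave force`) should be enough to reach step 2 of the reproduction.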
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Qpidd now shuts down immediately when cluster quorum is lost (501537)
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
Diffed Contents:
@@ -1 +1,3 @@
-Qpidd now shuts down immediately when cluster quorum is lost (501537)
+Messaging enhancement
+
+Qpidd now shuts down immediately if the cluster quorum is lost.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-1633.html