Red Hat Bugzilla – Bug 234589
rgmanager not working when using a quorum disk
Last modified: 2010-10-22 10:07:45 EDT
Description of problem:
When qdiskd is running with a quorum disk, rgmanager is not able to start services.
Without qdiskd running, rgmanager works fine.
Version-Release number of selected component (if applicable):
RHEL 5: cman-2.0.60-1.el5, rgmanager-2.0.23-1 (x86_64)
How reproducible:
Always, when a quorum disk is used
Steps to Reproduce:
1. configure quorum disk in cluster.conf
2. start cman
3. start qdiskd
4. start rgmanager
Actual results:
No services are running, and both clustat and system-config-cluster hang on
startup. /var/log/messages shows:
Mar 30 14:09:27 pg-ba-001 clurgmgrd: <err> #34: Cannot get status for
Mar 30 14:09:43 pg-ba-001 clurgmgrd: <err> #34: Cannot get status for
I attached my cluster.conf. Registration of quorum succeeds in cman.
Created attachment 151269 [details]
Cluster Configuration File
clurgmgrd appears to be suffering the same fate as ccs_tool in bug #223519,
treating the quorum disk as an actual node. When clurgmgrd first starts, it
attempts to make contact with the quorum disk "node" to determine the status
of the services it is running. This times out, causing an "abort":
 info: State change: Local UP
 info: State change: sys-b UP
 info: State change: /dev/dm-3 UP #Note: Quorum Disk
aight, need responses from 3 guys
VF: Push 2.12453 #1 (X#00020001)
VF: Checking for consensus...
VF: Timed out waiting for 1 responses
VF: Broadcasting ABORT (X#00020002)
I was able to construct a proof of concept by adding code to
rgmanager/src/daemons/main.c:membership_update() that sets cn_member to 0 for
the cml_members element which has a cn_nodeid of 0. Afterwards, the resource
manager appears to function as expected. Additionally, clustat no longer
hangs with a “Timed out waiting for a response from Resource Group Manager”.
I hope that this information assists in leading to a proper patch, as mine was
a rather brute force solution.
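For anyone following along, the idea behind the proof of concept can be sketched roughly as below. This is not the attached patch and not the actual rgmanager code: the struct definitions are simplified, hypothetical stand-ins that reuse only the field names mentioned above (cml_members, cn_nodeid, cn_member), and the helper names mask_quorum_disk() and count_members() are made up for illustration. It assumes, as described above, that the quorum disk shows up as the entry with a cn_nodeid of 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real membership types. */
struct node {
    int cn_nodeid;   /* assumed: 0 identifies the quorum disk entry */
    int cn_member;   /* non-zero when the entry counts as a member */
};

struct member_list {
    size_t cml_count;          /* number of entries in cml_members */
    struct node *cml_members;  /* membership entries */
};

/*
 * Clear the member flag on every entry whose node ID is 0, so that the
 * quorum disk is no longer treated as a node that must answer.
 * Returns the number of entries that were changed.
 */
size_t mask_quorum_disk(struct member_list *list)
{
    size_t i, masked = 0;

    assert(list != NULL);
    for (i = 0; i < list->cml_count; i++) {
        struct node *n = &list->cml_members[i];
        if (n->cn_nodeid == 0 && n->cn_member) {
            n->cn_member = 0;
            masked++;
        }
    }
    return masked;
}

/* Count the entries still flagged as members (i.e. expected to respond). */
size_t count_members(const struct member_list *list)
{
    size_t i, count = 0;

    assert(list != NULL);
    for (i = 0; i < list->cml_count; i++) {
        if (list->cml_members[i].cn_member)
            count++;
    }
    return count;
}
```

With two real nodes plus the quorum disk entry, count_members() drops from 3 to 2 after masking, which corresponds to the view no longer waiting on a response that can never arrive.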
Created attachment 152699 [details]
Hi, this should fix it.
Actually, it sounds like exactly what you did, but in a different location. ;)
Thanks for that!
Will there be an official errata for this problem?
I can't confirm one way or the other at this point, but it looks like it will be
in update 1 for certain.
Fixing Product Name. Cluster Suite was integrated into the Enterprise Linux for
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
Is there any news on whether this fix will be in an upcoming errata or in the
next Update for RHEL5?
Update 1 for RHEL5 :)
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.