Bug 688201

Summary: cman quorum timeout is too short
Product: Red Hat Enterprise Linux 6
Reporter: Lon Hohberger <lhh>
Component: cluster
Assignee: Fabio Massimo Di Nitto <fdinitto>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: high
Docs Contact:
Priority: high
Version: 6.0
CC: ccaulfie, cluster-maint, djansa, jkortus, lhh, rpeterso, teigland
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: cluster-3.0.12-38.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-05-19 12:54:49 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Lon Hohberger 2011-03-16 14:56:56 UTC
Description of problem:

Tim Wilkinson hit an issue where the CMAN quorum timeout default of 20 seconds was causing the initscript to short-circuit prematurely.

His cluster would form and become quorate in approx. 28 seconds.

We worked around this by increasing CMAN_QUORUM_TIMEOUT to 45 seconds. I hardly think it's unreasonable to wait this long for quorum to form in general, and a longer default will prevent people from scratching their heads in the future about why they have to run the initscript a second time to bring up any missing daemons.
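
For illustration, the workaround can be sketched as below. This assumes the initscript picks up an override such as CMAN_QUORUM_TIMEOUT=45 from /etc/sysconfig/cman and falls back to its built-in default only when the variable is unset. The helper name effective_quorum_timeout is hypothetical and is not taken from the initscript itself.

```shell
# Hypothetical helper mimicking "use the override if set, else the default".
# On a real node the override would be a line such as
#   CMAN_QUORUM_TIMEOUT=45
# in /etc/sysconfig/cman (assumed location of the initscript's overrides).
effective_quorum_timeout() {
    # $1: built-in default in seconds
    # CMAN_QUORUM_TIMEOUT: optional override from the environment
    echo "${CMAN_QUORUM_TIMEOUT:-$1}"
}

unset CMAN_QUORUM_TIMEOUT
effective_quorum_timeout 20     # prints 20 (no override)

CMAN_QUORUM_TIMEOUT=45
effective_quorum_timeout 20     # prints 45 (workaround in effect)
```

With the override in place, the 28-second formation time seen here fits inside the timeout.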


Version-Release number of selected component (if applicable): 3.0.12-23.el6


How reproducible: Varies; seems to be hardware dependent.


Steps to Reproduce:
1. Boot cluster
2. Quorum doesn't form in 20 seconds



Actual results:
 * fenced, dlm_controld, gfs_controld do not start up because of a short-circuit in the initscript (for safety)

Expected results:
 * all daemons running

Additional info:

The goal is to simply have the default CMAN_QUORUM_TIMEOUT in /etc/init.d/cman increased from 20 to 40 seconds to account for variances in hardware.
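
To make the effect of the proposed default concrete, here is a small simulation of the short-circuit decision. It does not call any cluster tools; the function daemons_started and its arguments are hypothetical and only model "did quorum form before the timeout expired".

```shell
# Simulated startup check; no real cluster commands are invoked.
# $1: seconds the cluster needs to become quorate (hypothetical input)
# $2: quorum timeout in seconds
daemons_started() {
    form_time=$1
    timeout=$2
    if [ "$form_time" -le "$timeout" ]; then
        echo "yes"   # quorum reached in time; remaining daemons start
    else
        echo "no"    # timeout expired; the initscript short-circuits
    fi
}

daemons_started 28 20   # prints "no":  the reported failure with the old default
daemons_started 28 40   # prints "yes": the same cluster with the proposed default
```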

Comment 5 errata-xmlrpc 2011-05-19 12:54:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0537.html