Bug 218688

Summary: cman_tool reports aisexec not starting erroneously
Product: Red Hat Enterprise Linux 5 Reporter: Kiersten (Kerri) Anderson <kanderso>
Component: cman Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED CURRENTRELEASE QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.0 CC: cluster-maint, sdake
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RC Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-02-08 01:27:31 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Kiersten (Kerri) Anderson 2006-12-06 20:20:06 UTC
Description of problem:
During system startup with the cman init script enabled, cman_tool reports the
following message:
[kanderso@dhcp83-120 tmp]$ cat cman_tool-aisexec.txt 
Starting cluster: 
   Loading modules... DLM (built Dec  4 2006 15:58:12) installed
GFS2 (built Dec  4 2006 15:58:52) installed
done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... failed
/usr/sbin/cman_tool: aisexec daemon didn't start

Once the node is up, it appears that aisexec did in fact start correctly:
cman_tool status returns the current state and the node is participating in the
cluster.

As a result, the startup script does not run to completion and the remaining
cluster services are not started properly.  Running "service cman start" once
the node is up successfully brings up the cluster.
Version-Release number of selected component (if applicable):


How reproducible:
This has happened twice so far on the Xen cluster when running revolver.


Comment 1 Kiersten (Kerri) Anderson 2006-12-06 20:42:16 UTC
I raised the loop count in cman_tool/join.c from 20 to 100 to see whether that
makes the problem go away in the short term.  I have restarted my tests and will
let you know if I see it again.

Comment 2 Christine Caulfield 2006-12-07 09:55:17 UTC
There is actually a small bug in cman_tool join: it doesn't detect whether
aisexec has started correctly or has crashed. With this bug fixed, the loop
counter can go much higher, because aisexec failures will be properly detected.

Checking in join.c;
/cvs/cluster/cluster/cman/cman_tool/join.c,v  <--  join.c
new revision: 1.48; previous revision: 1.47
done
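
For illustration only, the idea of a wait loop that tells "started" apart from
"crashed" can be sketched roughly as below. This is not the code from join.c.
It assumes the daemon runs as a direct child of the caller and does not detach,
and that readiness can be observed as the appearance of a filesystem path.
The names wait_for_daemon, wait_result and ready_path are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical outcomes of waiting for a daemon to come up. */
enum wait_result { DAEMON_READY, DAEMON_DIED, DAEMON_TIMEOUT };

/*
 * Poll up to max_tries times, sleeping interval_ms between attempts.
 * Assumption: "ready" means ready_path exists (for example a socket file);
 * the real readiness check in cman_tool may be entirely different.
 */
enum wait_result wait_for_daemon(pid_t child, const char *ready_path,
                                 int max_tries, long interval_ms)
{
    struct timespec delay = { interval_ms / 1000,
                              (interval_ms % 1000) * 1000000L };

    assert(max_tries > 0);

    for (int i = 0; i < max_tries; i++) {
        int status;
        pid_t r = waitpid(child, &status, WNOHANG);

        /* Child has exited (and is now reaped), or it is not our child. */
        if (r == child || r == -1)
            return DAEMON_DIED;

        /* Child is still running: check whether it reports ready. */
        if (access(ready_path, F_OK) == 0)
            return DAEMON_READY;

        nanosleep(&delay, NULL);
    }
    return DAEMON_TIMEOUT;
}
```

Because a dead child ends the wait on the next poll, max_tries in a loop like
this can be set generously without making a failed start take longer to report.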


Comment 3 Christine Caulfield 2006-12-12 16:52:41 UTC
Checked in to RHEL5 & RHEL50 branches

Checking in join.c;
/cvs/cluster/cluster/cman/cman_tool/join.c,v  <--  join.c
new revision: 1.47.2.1; previous revision: 1.47
done
Checking in join.c;
/cvs/cluster/cluster/cman/cman_tool/join.c,v  <--  join.c
new revision: 1.47.4.1; previous revision: 1.47
done


Comment 4 RHEL Program Management 2007-02-08 01:27:31 UTC
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.


Comment 5 Nate Straz 2007-12-13 17:22:26 UTC
Moving all RHCS ver 5 bugs to RHEL 5 so we can remove RHCS v5, which never existed.