Bug 608993

Summary: rgmanager: fail to recognize ricci service crash.
Product: Red Hat Enterprise Linux 5
Component: rgmanager
Version: 5.5.z
Status: CLOSED NOTABUG
Severity: medium
Priority: low
Reporter: yeylon <yeylon>
Assignee: Lon Hohberger <lhh>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, edamato, srevivo, ykaul
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-08-09 15:59:58 UTC

Description yeylon@redhat.com 2010-06-29 07:08:17 UTC
Description of problem:
If the ricci service crashes, rgmanager fails to notice it and does not alert about the issue.

1. Kill the ricci service:

ps -aux | grep ricci
kill -9 5638
[root@green-vdsb ~]# service ricci status
ricci dead but pid file exists

2. The system does not alert about the issue.

Comment 1 Lon Hohberger 2010-06-29 10:12:44 UTC
rgmanager is entirely configuration-driven; it does not monitor most things.

ricci is an optional service which is not required to run rgmanager nor the rest of the cluster software.

It's possible to make rgmanager monitor system-local services.  For example, if you had a 3-node cluster and you wanted to start and monitor specific services on each node (e.g. ricci on all 3 nodes, plus httpd on node2 and nfsd on node3), you could do the following:

  <rm>
    <!-- One restricted domain per node pins each service to that node -->
    <failoverdomains>
      <failoverdomain name="node1" restricted="1">
        <failoverdomainnode name="node1"/>
      </failoverdomain>
      <failoverdomain name="node2" restricted="1">
        <failoverdomainnode name="node2"/>
      </failoverdomain>
      <failoverdomain name="node3" restricted="1">
        <failoverdomainnode name="node3"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <script name="ricci" file="/etc/init.d/ricci" />
      <script name="httpd" file="/etc/init.d/httpd" />
      <script name="nfs" file="/etc/init.d/nfs" />
    </resources>
    <service name="monitor-node1" domain="node1">
      <script ref="ricci" />
    </service>
    <service name="monitor-node2" domain="node2">
      <script ref="ricci" />
      <script ref="httpd" />
    </service>
    <service name="monitor-node3" domain="node3">
      <script ref="ricci" />
      <script ref="nfs" />
    </service>
  </rm>
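
As a rough illustration of what monitoring through a script resource amounts to: the init script is called with "status" and its exit code decides whether the service counts as healthy. The sketch below mimics that check by hand. The helper names describe_status and check_script are hypothetical, and the code-to-text mapping follows the LSB init-script status conventions rather than anything taken from rgmanager itself.

```shell
#!/bin/sh
# describe_status (hypothetical helper): map an LSB init-script
# "status" exit code to a short description.
describe_status() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead but pid file exists" ;;
        2) echo "dead but lock file exists" ;;
        3) echo "not running" ;;
        *) echo "unknown" ;;
    esac
}

# check_script (hypothetical helper): run "<init script> status" and
# report the result. Any non-zero exit code is what a monitoring layer
# would treat as a failed status check.
check_script() {
    "$1" status >/dev/null 2>&1
    rc=$?
    echo "$1: $(describe_status "$rc") (exit code $rc)"
    return "$rc"
}

# Example (assumes /etc/init.d/ricci exists on this host):
#   check_script /etc/init.d/ricci
```

With ricci killed as in the description, the status call would be expected to fall into the "dead but pid file exists" case, which matches the output the reporter saw from "service ricci status".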

Note that you should chkconfig off any service which you expect rgmanager to start and monitor, so that init does not also start it at boot.
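
A minimal sketch of that last step, assuming it is run on node3 and that the service names match the init scripts referenced above (ricci and nfs). The SERVICES list, the DRY_RUN switch and the helper functions are illustrative assumptions; by default the script only prints the chkconfig commands instead of running them.

```shell
#!/bin/sh
# Services this node hands over to rgmanager (assumption: node3 from
# the example configuration, i.e. ricci and nfs).
SERVICES="ricci nfs"

# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop init from starting each service at boot, so that rgmanager is
# the only thing that starts and monitors it.
disable_init_start() {
    for svc in $SERVICES; do
        run chkconfig "$svc" off
    done
}

disable_init_start
```

Run it once per node with SERVICES adjusted to that node's monitor service, review the printed commands, then rerun with DRY_RUN=0.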