Bug 608993 - rgmanager: fail to recognize ricci service crash.
Summary: rgmanager: fail to recognize ricci service crash.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.5.z
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-06-29 07:08 UTC by yeylon@redhat.com
Modified: 2016-04-18 06:33 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-08-09 15:59:58 UTC
Target Upstream Version:
Embargoed:


Description yeylon@redhat.com 2010-06-29 07:08:17 UTC
Description of problem:
If the ricci service crashes, rgmanager fails to detect the crash and does not alert about the issue.

1. Kill the ricci service:

  ps -aux | grep ricci
  kill -9 5638
[root@green-vdsb ~]# service ricci status
ricci dead but pid file exists

2. The system does not alert about the issue.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Lon Hohberger 2010-06-29 10:12:44 UTC
rgmanager is entirely configuration-driven; it does not monitor most things.

ricci is an optional service which is not required to run rgmanager nor the rest of the cluster software.

It's possible to make rgmanager monitor system-local services.  For example, if you had a 3-node cluster and wanted to start and monitor specific services on each node (e.g. ricci on all 3 nodes, plus httpd on node 2 and nfsd on node 3), you could do the following:

  <rm>
    <!-- One restricted failover domain per node pins each service to that node -->
    <failoverdomains>
      <failoverdomain name="node1" restricted="1">
        <failoverdomainnode name="node1"/>
      </failoverdomain>
      <failoverdomain name="node2" restricted="1">
        <failoverdomainnode name="node2"/>
      </failoverdomain>
      <failoverdomain name="node3" restricted="1">
        <failoverdomainnode name="node3"/>
      </failoverdomain>
    </failoverdomains>
    <!-- Init scripts that rgmanager will start, stop, and check -->
    <resources>
      <script name="ricci" file="/etc/init.d/ricci" />
      <script name="httpd" file="/etc/init.d/httpd" />
      <script name="nfs" file="/etc/init.d/nfs" />
    </resources>
    <service name="monitor-node1" domain="node1">
      <script ref="ricci" />
    </service>
    <service name="monitor-node2" domain="node2">
      <script ref="ricci" />
      <script ref="httpd" />
    </service>
    <service name="monitor-node3" domain="node3">
      <script ref="ricci" />
      <script ref="nfs" />
    </service>
  </rm>
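
With script resources like these, rgmanager checks a service by running its init script with the "status" argument and looking at the exit code. The following is a minimal sketch of that check done by hand, assuming the init script follows the LSB status exit codes; lsb_status_meaning is a hypothetical helper name, not part of rgmanager or ricci.

```shell
#!/bin/sh
# Hypothetical helper: map an LSB init-script "status" exit code
# to a short description (assumes the script is LSB-compliant).
lsb_status_meaning() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead but pid file exists" ;;
        2) echo "dead but lock file exists" ;;
        3) echo "not running" ;;
        *) echo "unknown" ;;
    esac
}

# Only probe ricci when its init script is installed on this host.
if [ -x /etc/init.d/ricci ]; then
    /etc/init.d/ricci status >/dev/null 2>&1
    rc=$?
    echo "ricci status exit code $rc: $(lsb_status_meaning "$rc")"
fi
```

In the situation from the description (ricci killed with kill -9), a nonzero exit code here is what a monitoring script resource would treat as a failure.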

Note that you should chkconfig off any services which you expect rgmanager to start and monitor, so that they are not also started by init at boot.
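
As a rough sketch of that step, the snippet below disables boot-time start for the three services used in the example configuration. The disable_at_boot function and the RUN variable are hypothetical names for illustration; by default the commands are only printed, and the service list should be trimmed to what is actually installed on each node.

```shell
#!/bin/sh
# Hypothetical helper: turn off boot-time start for services that
# rgmanager will manage. RUN defaults to "echo" so the commands are
# only printed; run with RUN= (empty) to actually execute them.
RUN=${RUN-echo}

disable_at_boot() {
    for svc in "$@"; do
        $RUN chkconfig "$svc" off
    done
}

# Services from the example configuration; adjust per node.
disable_at_boot ricci httpd nfs
```

Printing first makes it easy to review the list before changing anything on a cluster node.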

