Bug 157248

Summary: RGManager uses start ordering for stop operations
Product: [Retired] Red Hat Cluster Suite Reporter: Eric Kerin <eric>
Component: rgmanager Assignee: Lon Hohberger <lhh>
Status: CLOSED CURRENTRELEASE QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: high    
Version: 4 CC: cluster-maint
Target Milestone: --- Keywords: Regression
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2005-07-12 15:38:01 UTC Type: ---

Description Eric Kerin 2005-05-09 19:52:58 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050417 Fedora/1.7.7-1.3.1

Description of problem:
In rgmanager checked out from the RHEL4 branch, a service's children are not stopped in the stop order specified in the service.sh meta-data section. They are stopped in the order given by the start property rather than the stop property. This occurs in both clurgmgrd and rg_test.

Relevant parts of cluster config file:
__BEGIN__
         <resources>
                <service name="postgresql"/>
                <group name="postgresql" domain="mainfailover"/>
                <ip address="172.19.30.204" monitor_link="yes"/>
                <fs fstype="ext3" name="Postgresql Drive" mountpoint="/var/lib/pgsql" device="/dev/SharedGroup00/PostgresVol00" options="noatime,acl"/>
                <script name="Postgresql Service" file="/etc/init.d/postgresql"/>
        </resources>

        <service ref="postgresql">
                <group name="postgresql"/>
                <ip ref="172.19.30.204"/>
                <script ref="Postgresql Service"/>
                <fs ref="Postgresql Drive"/>
        </service>
__END__




Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Configure a cluster service that contains an application service (a script resource) and a filesystem used by that application
2. Start the service with clusvcadm -e <service>
3. Attempt to stop the service with clusvcadm -s <service>
  

Actual Results:  The stop failed, the service went into the failed state, and a manual stop was required. Checking /var/log/messages shows that it tried to stop the filesystem before stopping the service.

Expected Results:  The service is stopped, then the filesystem is stopped, leaving the cluster resources in the stopped state.

Additional info:

This patch fixed it for me, though it may not take every stop situation into account. It does make resource moves work correctly, and clusvcadm -s requests no longer require manual intervention.

Index: restree.c
===================================================================
RCS file: /cvs/cluster/cluster/rgmanager/src/daemons/restree.c,v
retrieving revision 1.10.2.3
diff -u -r1.10.2.3 restree.c
--- restree.c   5 May 2005 20:41:08 -0000       1.10.2.3
+++ restree.c   9 May 2005 19:37:42 -0000
@@ -675,7 +675,11 @@
                for (x = 0; rule->rr_childtypes &&
                     rule->rr_childtypes[x].rc_name; x++) {

-                       lev = rule->rr_childtypes[x].rc_startlevel;
+                       if(op == RS_STOP)
+                               lev = rule->rr_childtypes[x].rc_stoplevel;
+                       else
+                               lev = rule->rr_childtypes[x].rc_startlevel;
+
                        if (!lev || lev != l)
                                continue;
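
To illustrate the effect of the patch above, here is a minimal standalone sketch of level-based child ordering. It is not the rgmanager code: the struct, the enum, and the functions level_for_op and order_children are hypothetical stand-ins that only borrow the field names seen in the diff (rc_name, rc_startlevel, rc_stoplevel, RS_STOP). The assumption is that children are visited level by level and that a level of 0 means "unordered", as the "!lev || lev != l" test suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rgmanager types touched by the patch. */
enum rs_op { RS_START, RS_STOP };

struct child_type {
    const char *rc_name;   /* child resource type; NULL terminates the array */
    int rc_startlevel;     /* position when starting; 0 = unordered (assumed) */
    int rc_stoplevel;      /* position when stopping; 0 = unordered (assumed) */
};

/* Pick the level that matches the operation, mirroring the patched branch. */
static int level_for_op(const struct child_type *ct, enum rs_op op)
{
    assert(ct != NULL);
    return (op == RS_STOP) ? ct->rc_stoplevel : ct->rc_startlevel;
}

/*
 * Store the names of the ordered child types in out[] in the order they
 * would be visited for op, walking levels 1..max_level.
 * Returns the number of names stored.
 */
static size_t order_children(const struct child_type *types, enum rs_op op,
                             int max_level, const char **out, size_t out_len)
{
    size_t n = 0;

    assert(types != NULL && out != NULL);
    for (int l = 1; l <= max_level; l++) {
        for (size_t x = 0; types[x].rc_name; x++) {
            int lev = level_for_op(&types[x], op);

            if (!lev || lev != l)
                continue;   /* unordered, or belongs to another level */
            if (n < out_len)
                out[n++] = types[x].rc_name;
        }
    }
    return n;
}
```

With an illustrative table such as fs (start 1, stop 3), ip (start 2, stop 2), script (start 3, stop 1), a start walk visits fs, ip, script and a stop walk visits script, ip, fs. These level values are made up for the example and are not taken from service.sh. Before the patch, the equivalent of level_for_op always returned the start level, so a stop walk would also have begun with fs.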

Comment 1 Lon Hohberger 2005-05-09 20:14:32 UTC
That patch is correct.

Comment 2 Lon Hohberger 2005-05-09 20:24:42 UTC
Patches in head and RHEL4 branch

Comment 3 Lon Hohberger 2005-05-09 20:41:09 UTC
BTW, you'll generally want "clusvcadm -d" rather than "clusvcadm -s".  Stopping
is temporary; services in the "stopped" state are evaluated by rgmanager after
cluster membership changes to see if they should be started.

A stopped service will be started again; a disabled service won't.

(Just FYI)

Comment 5 Eric Kerin 2005-05-09 21:01:26 UTC
Good to know.  

I've synced my local copy with the patches you put in CVS, since I missed the
second place that it reads the start/stop ordering.

Also, while trying to configure my cluster I noticed a few cluster.conf examples
in the source tree that are out of date.  Would you rather patches go to you,
bugzilla or the linux-cluster mailing list?

Comment 6 Lon Hohberger 2005-05-09 22:07:33 UTC
For the example stuff, linux-cluster and CC me.