157248 – RGManager uses start ordering for stop operations

Summary: RGManager uses start ordering for stop operations

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Cluster Suite
Classification:	Retired
Component:	rgmanager
Sub Component:
Version:	4
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	---
Assignee:	Lon Hohberger
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2005-05-09 19:52 UTC by Eric Kerin
Modified:	2009-04-16 19:51 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2005-07-12 15:38:01 UTC
Embargoed:

Description Eric Kerin 2005-05-09 19:52:58 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050417 Fedora/1.7.7-1.3.1

Description of problem:
In rgmanager checked out from the RHEL4 branch, a service's children are not stopped in the stop order specified in the service.sh meta-data section. They are stopped in the order given by the start property rather than the stop property. This occurs in both clurgmgrd and rg_test.

Relevant parts of cluster config file:
__BEGIN__
         <resources>
                <service name="postgresql"/>
                <group name="postgresql" domain="mainfailover"/>
                <ip address="172.19.30.204" monitor_link="yes"/>
                <fs fstype="ext3" name="Postgresql Drive" mountpoint="/var/lib/pgsql" device="/dev/SharedGroup00/PostgresVol00" options="noatime,acl"/>
                <script name="Postgresql Service" file="/etc/init.d/postgresql"/>
        </resources>

        <service ref="postgresql">
                <group name="postgresql"/>
                <ip ref="172.19.30.204"/>
                <script ref="Postgresql Service"/>
                <fs ref="Postgresql Drive"/>
        </service>
__END__




Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Configure a cluster service that contains a script resource and a filesystem used by that script
2. Start the service with clusvcadm -e <service>
3. Attempt to stop the service with clusvcadm -s <service>

Actual Results:  The service failed to stop and went into the failed state, so a manual stop was required. /var/log/messages shows that rgmanager tried to stop the filesystem before stopping the script that uses it.

Expected Results:  The script is stopped first, then the filesystem is stopped, leaving the cluster resources in the stopped state.

Additional info:

This patch fixed it for me, although it may not take every stop situation into account. It does make resource moves work correctly, and clusvcadm -s requests no longer require manual intervention.

Index: restree.c
===================================================================
RCS file: /cvs/cluster/cluster/rgmanager/src/daemons/restree.c,v
retrieving revision 1.10.2.3
diff -u -r1.10.2.3 restree.c
--- restree.c   5 May 2005 20:41:08 -0000       1.10.2.3
+++ restree.c   9 May 2005 19:37:42 -0000
@@ -675,7 +675,11 @@
                for (x = 0; rule->rr_childtypes &&
                     rule->rr_childtypes[x].rc_name; x++) {

-                       lev = rule->rr_childtypes[x].rc_startlevel;
+                       if(op == RS_STOP)
+                               lev = rule->rr_childtypes[x].rc_stoplevel;
+                       else
+                               lev = rule->rr_childtypes[x].rc_startlevel;
+
                        if (!lev || lev != l)
                                continue;
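
For readers who want to see the effect of the change in isolation, here is a minimal, self-contained sketch of the same idea: pick the stop level for stop operations and the start level otherwise, then visit children level by level. Everything here is an assumption made for illustration: the struct child_type type, the OP_START/OP_STOP codes, the ordered_children() helper, and the level numbers in example_types are hypothetical and are not taken from restree.c or from the service.sh meta-data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation codes (the real patch tests op == RS_STOP). */
enum op { OP_START, OP_STOP };

/* Simplified, hypothetical stand-in for one child-type entry of a rule. */
struct child_type {
	const char *name;
	int startlevel;	/* 0 means no ordering specified */
	int stoplevel;	/* 0 means no ordering specified */
};

/* Illustrative levels only; not the values shipped in service.sh. */
const struct child_type example_types[] = {
	{ "fs",     1, 3 },
	{ "ip",     2, 2 },
	{ "script", 3, 1 },
};

/* Select the level that applies to the requested operation. */
int level_for(const struct child_type *ct, enum op op)
{
	return (op == OP_STOP) ? ct->stoplevel : ct->startlevel;
}

/*
 * Write the child names into out[] in the order they would be visited
 * for the given operation: levels 1..max_level, skipping entries whose
 * level is 0 or does not match the current level.
 * Returns the number of names written; out[] must hold at least n entries.
 */
size_t ordered_children(const struct child_type *types, size_t n,
			enum op op, int max_level, const char **out)
{
	size_t count = 0;

	assert(types != NULL && out != NULL);

	for (int l = 1; l <= max_level; l++) {
		for (size_t x = 0; x < n; x++) {
			int lev = level_for(&types[x], op);

			if (!lev || lev != l)
				continue;
			out[count++] = types[x].name;
		}
	}
	return count;
}
```

With the illustrative table above, ordered_children(example_types, 3, OP_START, 3, out) yields fs, ip, script, and the same call with OP_STOP yields script, ip, fs. Using the start level for both operations, as the code did before the patch, would visit fs first on stop, which matches the failure described in this report.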

Comment 1 Lon Hohberger 2005-05-09 20:14:32 UTC

That patch is correct.

Comment 2 Lon Hohberger 2005-05-09 20:24:42 UTC

Patches are in HEAD and the RHEL4 branch.

Comment 3 Lon Hohberger 2005-05-09 20:41:09 UTC

BTW, you'll generally want "clusvcadm -d" rather than "clusvcadm -s".  Stopping
is temporary; services in the "stopped" state are evaluated by rgmanager after
cluster membership changes to see if they should be started.

A stopped service will be started again; a disabled service won't.

(Just FYI)

Comment 5 Eric Kerin 2005-05-09 21:01:26 UTC

Good to know.  

I've synced my local copy with the patches you put in CVS, since I missed the
second place that it reads the start/stop ordering.

Also, while trying to configure my cluster I noticed a few cluster.conf examples
in the source tree that are out of date.  Would you rather patches go to you,
bugzilla or the linux-cluster mailing list?

Comment 6 Lon Hohberger 2005-05-09 22:07:33 UTC

For the example stuff, linux-cluster and CC me.
