Bug 716231

Summary: Dependencies in independent_tree resources do not work as expected
Product: Red Hat Enterprise Linux 6
Reporter: Lon Hohberger <lhh>
Component: rgmanager
Assignee: Lon Hohberger <lhh>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: high
Docs Contact:
Priority: high
Version: 6.1
CC: amoralej, cluster-maint, djansa, edamato, fdinitto, jcastillo, syeghiay
Target Milestone: beta
Keywords: Regression
Target Release: 6.2
Hardware: All
OS: All
Whiteboard:
Fixed In Version: rgmanager-3.0.12.1-2.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 711521
Environment:
Last Closed: 2011-12-06 11:59:48 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lon Hohberger 2011-06-23 17:42:06 UTC
+++ This bug was initially created as a clone of Bug #711521 +++

Description of problem:

Two script resources (parent and child) are included in a service, each with __independent_subtree="1".

With rgmanager version 2.0.52-9.el5, when the child resource is detected as failed, both resources are restarted instead of only the child.

With rgmanager version 2.0.52-6.el5_5.8 it works as expected.

Extract of cluster.conf:

		<service nfslock="1" autostart="1" domain="node1-first" exclusive="0" max_restarts="3" name="test" recovery="relocate" restart_expire_time="900">
			<script file="/root/test1" name="script1" __independent_subtree="1">
				<script file="/root/test2" name="script2" __independent_subtree="1"/>
			</script>
		</service>



Version-Release number of selected component (if applicable):

2.0.52-9.el5

How reproducible:

Always

Steps to Reproduce:
1. Create a service with a parent and a child resource, and set __independent_subtree="1" on both.

2. Make the child resource fail.

  
Actual results:

Both parent and child (script1 and script2 in the example) are restarted:

Jun  7 19:57:50 node1 clurgmgrd[14575]: <warning> Some independent resources in service:test failed; Attempting inline recovery 
Jun  7 19:57:51 node1 logger: stop test2
Jun  7 19:57:51 node1 logger: stop test1
Jun  7 19:57:51 node1 logger: start test1
Jun  7 19:57:51 node1 logger: start test2
Jun  7 19:57:51 node1 clurgmgrd[14575]: <notice> Inline recovery of service:test complete 


Expected results:

Only the child resource (script2) is restarted. Output with version 2.0.52-6.el5_5.8:


Jun  7 19:52:20 node1 clurgmgrd[11160]: <warning> Some independent resources in service:test failed; Attempting inline recovery 
Jun  7 19:52:20 node1 logger: stop test2
Jun  7 19:52:20 node1 logger: start test2
Jun  7 19:52:20 node1 clurgmgrd[11160]: <notice> Inline recovery of service:test succeeded 
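For anyone retesting, the comparison between the two log excerpts above can be automated. The following is a minimal sketch, not part of rgmanager: it assumes the test scripts log "start <name>" and "stop <name>" through logger, as in the excerpts, and the function names recovery_actions and restarted_resources are made up for illustration.

```python
import re

# Matches the "logger: <action> <resource>" lines written by the test scripts.
LOGGER_LINE = re.compile(r"logger: (start|stop) (\S+)")

# Log excerpts copied from this report.
BAD_LOG = """\
Jun  7 19:57:50 node1 clurgmgrd[14575]: <warning> Some independent resources in service:test failed; Attempting inline recovery
Jun  7 19:57:51 node1 logger: stop test2
Jun  7 19:57:51 node1 logger: stop test1
Jun  7 19:57:51 node1 logger: start test1
Jun  7 19:57:51 node1 logger: start test2
Jun  7 19:57:51 node1 clurgmgrd[14575]: <notice> Inline recovery of service:test complete
"""

GOOD_LOG = """\
Jun  7 19:52:20 node1 clurgmgrd[11160]: <warning> Some independent resources in service:test failed; Attempting inline recovery
Jun  7 19:52:20 node1 logger: stop test2
Jun  7 19:52:20 node1 logger: start test2
Jun  7 19:52:20 node1 clurgmgrd[11160]: <notice> Inline recovery of service:test succeeded
"""


def recovery_actions(log_text):
    """Return (action, resource) pairs logged while inline recovery was running."""
    actions = []
    in_recovery = False
    for line in log_text.splitlines():
        if "Attempting inline recovery" in line:
            in_recovery = True
        elif "Inline recovery of" in line:
            in_recovery = False
        elif in_recovery:
            match = LOGGER_LINE.search(line)
            if match:
                actions.append(match.groups())
    return actions


def restarted_resources(log_text):
    """Return the sorted names of resources touched during inline recovery."""
    return sorted({name for _, name in recovery_actions(log_text)})


if __name__ == "__main__":
    print("affected build:", restarted_resources(BAD_LOG))
    print("expected behaviour:", restarted_resources(GOOD_LOG))
```

A build with the fix should report only test2 for this reproducer.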

Additional info:

--- Additional comment from lhh on 2011-06-23 10:18:52 EDT ---

Reproduced.

--- Additional comment from lhh on 2011-06-23 10:37:53 EDT ---

Created attachment 506331
Fix

--- Additional comment from lhh on 2011-06-23 10:44:47 EDT ---

Example service configuration:

                <service name="test">
                        <script name="a" file="/tmp/test1.sh" __independent_subtree="1">
                                <script name="b" file="/tmp/test2.sh" __independent_subtree="2"/>
                        </script>
                </service>

--- Additional comment from lhh on 2011-06-23 10:45:38 EDT ---

Oops, that's for regression testing against the non-critical services.  Here's the reproducer I used:

                <service name="test">
                        <script name="a" file="/tmp/test1.sh" __independent_subtree="1">
                                <script name="b" file="/tmp/test2.sh" __independent_subtree="1"/>
                        </script>
                </service>
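To make the expected behaviour of this reproducer explicit, here is a rough model of which resources should be restarted when one of them fails. It is a simplified sketch under stated assumptions, not the logic in restree.c: it assumes that a failure is handled by the nearest enclosing resource with __independent_subtree="1" (the failed resource itself, if flagged), that this resource and its children are restarted, and that the whole service is recovered when no such resource exists. The value "2" (non-critical) is not modelled, and the function name expected_restart_set is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Reproducer configuration from this comment.
SERVICE_XML = """
<service name="test">
    <script name="a" file="/tmp/test1.sh" __independent_subtree="1">
        <script name="b" file="/tmp/test2.sh" __independent_subtree="1"/>
    </script>
</service>
"""


def expected_restart_set(service_xml, failed_name):
    """Return the names of resources expected to restart when failed_name fails.

    Simplified model (an assumption, see the text above): walk up from the
    failed resource to the nearest element with __independent_subtree="1";
    that element and its descendants are restarted. If none is found, every
    resource in the service is returned.
    """
    service = ET.fromstring(service_xml)
    # Map each element to its parent so we can walk upward from the failed node.
    parent_of = {child: parent for parent in service.iter() for child in parent}
    failed = next(
        elem for elem in service.iter()
        if elem is not service and elem.get("name") == failed_name
    )
    node = failed
    while node is not service and node.get("__independent_subtree") != "1":
        node = parent_of[node]
    # node is now the independent subtree root, or the service itself.
    return [elem.get("name") for elem in node.iter() if elem is not service]


if __name__ == "__main__":
    print("b fails ->", expected_restart_set(SERVICE_XML, "b"))
    print("a fails ->", expected_restart_set(SERVICE_XML, "a"))
```

Under this model a failure of "b" restarts only "b", which is the behaviour seen with 2.0.52-6.el5_5.8; the regression restarts "a" as well.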

--- Additional comment from lhh on 2011-06-23 10:46:14 EDT ---

Created attachment 506568
test1.sh from referenced service configurations.  Place in /tmp.

--- Additional comment from lhh on 2011-06-23 10:46:43 EDT ---

Created attachment 506583
test2.sh from referenced service configurations.  Place in /tmp.

--- Additional comment from lhh on 2011-06-23 11:00:02 EDT ---

Problem introduced here:

http://git.fedorahosted.org/git/?p=cluster.git;a=blobdiff;f=rgmanager/src/daemons/restree.c;h=ea458d696362e3605c6253731aa579cd3ccc3a4d;hp=3a03f913959eaac798563fa7dd0af0163bb918b5;hb=06993e7d6253dbb9a0e83c8edeba4d7a99f61954;hpb=f17eaaf6827237cd13d9086e7b1fbd6eaf702db1

I must now perform a full retest of 605733 to ensure that changing the line back to what it was previously does not cause a regression in the Non-Critical functionality.

Comment 6 errata-xmlrpc 2011-12-06 11:59:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1595.html