+++ This bug was initially created as a clone of Bug #711521 +++

Description of problem:
Two script resources (a parent and its child) are included in a service, both with __independent_subtree="1". With rgmanager version 2.0.52-9.el5, when the child resource is detected as failed, both resources are restarted instead of only the child. With rgmanager version 2.0.52-6.el5_5.8 it works as expected.

Extract of cluster.conf:

<service nfslock="1" autostart="1" domain="node1-first" exclusive="0" max_restarts="3" name="test" recovery="relocate" restart_expire_time="900">
    <script file="/root/test1" name="script1" __independent_subtree="1">
        <script file="/root/test2" name="script2" __independent_subtree="1"/>
    </script>
</service>

Version-Release number of selected component (if applicable):
2.0.52-9.el5

How reproducible:
Always

Steps to Reproduce:
1. Create a service with a parent and a child resource and set __independent_subtree="1" on both.
2. Make the child resource fail.

Actual results:
Both the parent and the child (script1 and script2 in the example) are restarted:

Jun 7 19:57:50 node1 clurgmgrd[14575]: <warning> Some independent resources in service:test failed; Attempting inline recovery
Jun 7 19:57:51 node1 logger: stop test2
Jun 7 19:57:51 node1 logger: stop test1
Jun 7 19:57:51 node1 logger: start test1
Jun 7 19:57:51 node1 logger: start test2
Jun 7 19:57:51 node1 clurgmgrd[14575]: <notice> Inline recovery of service:test complete

Expected results:
Only the child resource (script2) is restarted. Output with version 2.0.52-6.el5_5.8:

Jun 7 19:52:20 node1 clurgmgrd[11160]: <warning> Some independent resources in service:test failed; Attempting inline recovery
Jun 7 19:52:20 node1 logger: stop test2
Jun 7 19:52:20 node1 logger: start test2
Jun 7 19:52:20 node1 clurgmgrd[11160]: <notice> Inline recovery of service:test succeeded

Additional info:

--- Additional comment from lhh on 2011-06-23 10:18:52 EDT ---

Reproduced.
--- Additional comment from lhh on 2011-06-23 10:37:53 EDT ---

Created attachment 506331
Fix

--- Additional comment from lhh on 2011-06-23 10:44:47 EDT ---

Example service configuration:

<service name="test">
    <script name="a" file="/tmp/test1.sh" __independent_subtree="1">
        <script name="b" file="/tmp/test2.sh" __independent_subtree="2"/>
    </script>
</service>

--- Additional comment from lhh on 2011-06-23 10:45:38 EDT ---

Oops, that's for regression testing against the non-critical services. Here's the reproducer I used:

<service name="test">
    <script name="a" file="/tmp/test1.sh" __independent_subtree="1">
        <script name="b" file="/tmp/test2.sh" __independent_subtree="1"/>
    </script>
</service>

--- Additional comment from lhh on 2011-06-23 10:46:14 EDT ---

Created attachment 506568
test1.sh from referenced service configurations. Place in /tmp.

--- Additional comment from lhh on 2011-06-23 10:46:43 EDT ---

Created attachment 506583
test2.sh from referenced service configurations. Place in /tmp.

--- Additional comment from lhh on 2011-06-23 11:00:02 EDT ---

Problem introduced here:

http://git.fedorahosted.org/git/?p=cluster.git;a=blobdiff;f=rgmanager/src/daemons/restree.c;h=ea458d696362e3605c6253731aa579cd3ccc3a4d;hp=3a03f913959eaac798563fa7dd0af0163bb918b5;hb=06993e7d6253dbb9a0e83c8edeba4d7a99f61954;hpb=f17eaaf6827237cd13d9086e7b1fbd6eaf702db1

I now must perform a full retest of 605733 to ensure that changing the line back to what it was previously does not cause a regression in the Non-Critical functionality.
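The contents of the test1.sh and test2.sh attachments are not reproduced in this report. For anyone trying the reproducer without them, a stand-in along the following lines might work. This is only a sketch, not the attached script: it assumes the script resource is called with start, stop, or status as its first argument, and the NAME and FAIL_FLAG variables, the flag-file mechanism, and the script_action function are all hypothetical names invented for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for /tmp/test2.sh (NOT the attachment from this bug).
# NAME and FAIL_FLAG are illustrative names, not part of rgmanager.
NAME="${NAME:-test2}"
FAIL_FLAG="${FAIL_FLAG:-/tmp/$NAME.fail}"

log() {
    # Send the message to syslog via logger; fall back to stderr if that fails.
    if command -v logger >/dev/null 2>&1 && logger "$*" 2>/dev/null; then
        return 0
    fi
    echo "$*" >&2
}

script_action() {
    case "$1" in
        start)
            # Clear any simulated failure so the next status check passes.
            rm -f "$FAIL_FLAG"
            log "start $NAME"
            ;;
        stop)
            log "stop $NAME"
            ;;
        status)
            # Report failure (non-zero) while the flag file exists.
            [ ! -e "$FAIL_FLAG" ]
            ;;
        *)
            echo "usage: $0 {start|stop|status}" >&2
            return 2
            ;;
    esac
}

# Dispatch only when an action was passed on the command line.
if [ "$#" -gt 0 ]; then
    script_action "$1"
fi
```

With a script like this in place for the child resource, creating the flag file (for example, touch /tmp/test2.fail) should make the next status check fail, which is the condition described under "Steps to Reproduce". The stop and start messages it logs have the same shape as the "logger: stop test2" and "logger: start test2" lines quoted in the description.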
Merged to RHEL6 branch: http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=f1129012827b8bf33d9e7ac535049d048f726757
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2011-1595.html