From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050921 Red Hat/1.0.7-1.4.1 Firefox/1.0.7

Description of problem:
If a service is running on a member of an ordered failover domain that is not the most-preferred one (while a more-preferred node is online) and the service itself incurs an error, the service is not restarted and gets stuck in the 'recovering' state.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Set the priority for a server to 2 in fence.
2. Move the service to this server with priority set to 2.
3. Run ifdown on a monitored interface.

Actual Results:
Service stays in 'recovering'.

Expected Results:
Service should fail over.

Additional info:
This issue has been reproduced with lon. The fix should be a one-liner:

@@ -1246,6 +1249,7 @@
 		tolerance = FOD_GOOD;
 
 	if (req != RG_RESTART &&
+	    req != RG_START_RECOVER &&
 	    (node_should_start_safe(my_id(), membership, svcName) <
 	     tolerance)) {
 		cml_free(membership);
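To illustrate the intent of the one-line change, here is a minimal, self-contained sketch of the guard. It is not the actual source. Only RG_RESTART, RG_START_RECOVER, and FOD_GOOD appear in the patch; the function refuse_local_start, the patched flag, RG_START, FOD_BEST, and all numeric values are assumptions made for illustration.

```c
#include <stdbool.h>

/* Request types. RG_RESTART and RG_START_RECOVER come from the patch;
 * RG_START and the numeric values are assumptions for this sketch. */
enum rg_request { RG_START, RG_RESTART, RG_START_RECOVER };

/* Failover-domain rankings. FOD_GOOD comes from the patch; FOD_BEST and
 * the ordering (higher means more preferred) are assumptions. */
enum fod_rank { FOD_GOOD = 1, FOD_BEST = 2 };

/*
 * Hypothetical stand-in for the guard in the patch.
 * Returns true when the local node would refuse to start the service
 * because its failover-domain rank is below the required tolerance.
 * 'patched' selects the behavior with or without the added condition.
 */
bool refuse_local_start(enum rg_request req, enum fod_rank local_rank,
                        enum fod_rank tolerance, bool patched)
{
    /* A restart in place is never subject to the preference check. */
    if (req == RG_RESTART)
        return false;

    /* The added line: a recovery start also bypasses the check. */
    if (patched && req == RG_START_RECOVER)
        return false;

    /* Otherwise refuse when this node is less preferred than required. */
    return local_rank < tolerance;
}
```

Under these assumptions, a node ranked FOD_GOOD with a tolerance of FOD_BEST refuses a recovery start without the added condition, which would match the service staying in 'recovering'. With the added condition, the same request is allowed to proceed.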
Fixes are in CVS head, STABLE, and RHEL4.
This does not seem to fix the problem 100%; investigating further.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0557.html