Bug 621018
Summary: | Luci does not maintain service restart limit configuration | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Alan Staples <alan.staples> |
Component: | conga | Assignee: | Ryan McCabe <rmccabe> |
Status: | CLOSED NOTABUG | QA Contact: | Cluster QE <mspqa-list> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 5.5 | CC: | alan.staples, cluster-maint, jha |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2011-01-26 14:58:25 UTC | Type: | --- |
Description
Alan Staples
2010-08-04 01:38:21 UTC
The max_restarts attribute is only relevant when the recovery policy is restart. The UI needs to be fixed to disable these fields when the recovery policy is something other than restart.

(In reply to comment #2)
> The max_restarts attribute is only relevant when the recovery policy is
> restart. The UI needs to be fixed to disable these fields when the recovery
> policy is something other than restart.

The luci GUI states "Maximum number of restart failures before relocating", which suggests to me that this should actually only be valid with a relocate policy. That makes sense to me: attempt to restart before relocating the service, since restarting is likely to fix the problem and relocating is a relatively expensive process. Are you saying that this is actually the maximum number of restart attempts for a service before the service group is disabled on that particular node? I can't find a reference to this parameter, or even to the feature, in the current Red Hat Cluster Administration Guide.

What you stated above is correct, to the best of my knowledge: restart X times, then relocate if the restart fails each time. I can't find any good documentation either, but here is a snippet from the rgmanager patch that added the feature, which confirms the explanation above:

+    /* Check restart counter/timer for this resource */
+    if (check_restart(svcName) > 0) {
+        clulog(LOG_NOTICE, "Restart threshold for %s exceeded; "
+               "attempting to relocate\n", svcName);
+        return handle_relocate_req(svcName, RG_START_RECOVER, -1,
+                                   new_owner);

Restart counters only apply when you are using the "restart" recovery policy. The restart counter is per-host and is zeroed each time the service is relocated, either manually or as a consequence of a failure recovery action. That is, when "max_restarts" is exceeded within the given "restart_expire_time", rgmanager will relocate the failing service to another host in the cluster, at which point the restart counter is reset.
While this is not Red Hat documentation, it is quite accurate in describing how rgmanager's recovery policies work: http://sources.redhat.com/cluster/wiki/ServicePolicies
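
The restart-counter behaviour described in this thread can be illustrated with a small Python sketch. This is not rgmanager code: the class name RestartCounter, the method record_failure, and the exact boundary (relocating on the first failure after max_restarts restarts have already happened inside the window) are assumptions made for illustration. Only the names max_restarts and restart_expire_time come from the discussion above.

```python
from collections import deque


class RestartCounter:
    """Hypothetical per-host restart tracker for one service (illustration only)."""

    def __init__(self, max_restarts, restart_expire_time):
        self.max_restarts = max_restarts
        self.restart_expire_time = restart_expire_time  # seconds
        self._restarts = deque()  # timestamps of restarts still inside the window

    def record_failure(self, now):
        """Return "restart" or "relocate" for a failure observed at time `now`."""
        # Forget restarts that are older than the expiry window.
        while self._restarts and now - self._restarts[0] >= self.restart_expire_time:
            self._restarts.popleft()

        if len(self._restarts) >= self.max_restarts:
            # Threshold exceeded: relocate the service and reset the counter,
            # as the thread describes for relocation.
            self._restarts.clear()
            return "relocate"

        self._restarts.append(now)
        return "restart"


if __name__ == "__main__":
    counter = RestartCounter(max_restarts=2, restart_expire_time=60)
    for t in (0, 10, 20, 30):
        print(t, counter.record_failure(t))
```

With max_restarts=2 and a 60-second window, the first two failures lead to local restarts, the third leads to a relocation, and the counter then starts again from zero.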