Bug 515858
| Summary: | RHEL 5: Documentation: Provide information about cluster service status check and failover timeout | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Paul Kennedy <pkennedy> | |
| Component: | Documentation-cluster | Assignee: | Steven J. Levine <slevine> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ecs-bugs | |
| Severity: | medium | Docs Contact: | ||
| Priority: | low | |||
| Version: | 5.5 | CC: | adstrong, jskeoch, lhh, mhideo, slevine, ssaha | |
| Target Milestone: | rc | Keywords: | Documentation | |
| Target Release: | --- | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 717008 (view as bug list) | Environment: | ||
| Last Closed: | 2011-07-25 13:08:58 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 717008 | |||
Description
Paul Kennedy
2009-08-06 04:06:35 UTC
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

I'm reassigning this to me: I have documented these issues (at least for Conga) in RHEL 6, so I can look at whether this applies to RHEL 5.

Lon: This bug, which has been around for a while, just came into my purview and I'm not sure where to take it (I misinterpreted it at first glance). Is there an answer to questions 1 and 2 in the bug description? It doesn't look as if these are documented. This is a RHEL 5 bug, but I can find nothing about this in the current RHEL 6 documentation either. Any advice about where to take this?

* rgmanager checks the status of individual resources, not whole services. This is a change from clumanager on RHEL3, which periodically checked the status of the whole service. Every 10 seconds, rgmanager scans the resource tree, looking for resources which have passed their "status check" interval.
* Each resource agent specifies the amount of time between periodic status checks. Each resource uses these intervals unless they are explicitly overridden in cluster.conf using the special <action> tag:
<action name="status" depth="*" interval="10" />
This tag is a special child of the resource itself in cluster.conf. For example, suppose you have a file system resource for which you want to override the status check interval. Unfortunately, the tag becomes a bit confusing when placed next to child resources:
<fs name="test" device="/dev/sdb3">
  <action name="status" depth="*" interval="10" />
  <nfsexport...>
  </nfsexport>
</fs>
* Some agents provide multiple "depths" of checking. For example, a normal file system status check (depth 0) is simply "is it mounted in the right place?". A more intensive check is depth 10, which is "can I read a file from this?". A still more intensive check is depth 20, which is "can I write to this file system?". In the previous example, I used '*', which means "use these values for all depths". The result is that the "test" file system is checked at the highest-defined depth provided by the resource agent (in this case, 20) every 10 seconds.
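As a rough illustration of how the depths above might be tuned separately, the fragment below assumes that the <action> tag accepts a specific depth value in place of '*'. That is implied by the wildcard description but is not shown in this bug, so treat it as an assumption. The interval values are arbitrary and are not recommendations, and the fragment has not been validated against a particular rgmanager release.

```xml
<!-- Sketch only: per-depth overrides are an assumption, not confirmed in this bug. -->
<fs name="test" device="/dev/sdb3">
  <!-- depth 0: "is it mounted in the right place?" -->
  <action name="status" depth="0" interval="30" />
  <!-- depth 10: "can I read a file from this?" -->
  <action name="status" depth="10" interval="60" />
  <!-- depth 20: "can I write to this file system?" -->
  <action name="status" depth="20" interval="120" />
</fs>
```

If per-depth overrides behave this way, the cheap mount check would run most often and the write test least often, rather than forcing every depth onto one interval as '*' does.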
* There is no timeout for starting, stopping, or failing over resources. Some resources take an indeterminately long amount of time to start or stop. Unfortunately, a failure to stop (including a timeout) renders the service inoperable (failed state). You can, if desired, turn on timeout enforcement on each resource in a service individually by adding __enforce_timeouts="1" to the resource's reference in cluster.conf. See the following for an example.
https://access.redhat.com/kb/docs/DOC-43572
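A minimal sketch of what this might look like in cluster.conf, assuming the "test" file system from the earlier example is defined in the <resources> section and referenced from a service. The service name example_service is hypothetical, and other attributes are omitted for brevity:

```xml
<rm>
  <resources>
    <!-- Resource definition; same name and device as the earlier example -->
    <fs name="test" device="/dev/sdb3" />
  </resources>
  <!-- "example_service" is a hypothetical name used only for illustration -->
  <service name="example_service">
    <!-- The attribute goes on the reference inside the service, not on the definition -->
    <fs ref="test" __enforce_timeouts="1" />
  </service>
</rm>
```

Because a failure to stop puts the service into the failed state, enabling enforcement on a resource that is legitimately slow to stop can cause exactly that outcome, so it is set per reference and not globally.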
This is the latest build: Red_Hat_Enterprise_Linux-Cluster_Administration-5-web-en-US-5-30_el6eng

The typo is fixed here: http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/ap-status-check-CA.html