Description of problem: In some situations, the reason a status check fails is not reported correctly to the system logs before it causes a failover or a service restart. Knowing why a service was restarted or migrated without operator intervention is required for proper debugging of service problems. Several things should happen:
(a) Each resource agent should log an error whenever it is about to return a failure code.
(b) Rgmanager should log an error whenever a resource agent status check fails.
(c) Rgmanager should pass in "<type>:'<attr>'" as OCF_INSTANCE_NAME to resource agents, and resource agents should log OCF_INSTANCE_NAME whenever an error occurs.
(d) Error messages and logs should be reported in a more uniform way throughout the resource-agents.
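To illustrate points (a) and (c), here is a minimal sketch of an agent-side status check that logs the instance name before returning a failure code. The helper names log_err and status_check, the "<err>" prefix, and the example value fs:'data' are hypothetical and only for illustration; they are not taken from the actual resource-agents code.

```shell
#!/bin/sh
# Hypothetical helper: write an error line tagged with OCF_INSTANCE_NAME.
# Falls back to "unknown" if rgmanager did not pass the variable in.
log_err() {
    echo "<err> ${OCF_INSTANCE_NAME:-unknown}: $*" >&2
}

# Hypothetical status check: fails when the given directory is missing.
# The error is logged *before* the failure code is returned, so the
# reason for a later restart or failover is visible in the logs.
status_check() {
    mountpoint_dir="$1"
    if [ ! -d "$mountpoint_dir" ]; then
        log_err "status check failed: $mountpoint_dir does not exist"
        return 1
    fi
    return 0
}
```

With OCF_INSTANCE_NAME set to something like fs:'data' (an assumed example of the "<type>:'<attr>'" form), every failure line identifies which resource instance produced it.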
*** Bug 202569 has been marked as a duplicate of this bug. ***
*** Bug 199678 has been marked as a duplicate of this bug. ***
Moving back to assigned. We need to be sure we notify callers about fatal signal errors when a child process dies.
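As a rough illustration of the idea (not rgmanager's actual implementation, which is not shown in this report), a wrapper can tell the caller when a child died from a signal. The function name run_and_report is hypothetical, and the sketch assumes the common sh/bash convention that a child killed by signal N is reported with exit status 128+N.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, log why it failed, and pass the
# exit status back to the caller unchanged.
run_and_report() {
    rc=0
    "$@" || rc=$?
    if [ "$rc" -gt 128 ]; then
        # Assumption: status above 128 means death by signal (rc - 128).
        # A program can also exit with such a code on its own.
        echo "<err> ${OCF_INSTANCE_NAME:-unknown}: '$1' killed by signal $((rc - 128))" >&2
    elif [ "$rc" -ne 0 ]; then
        echo "<err> ${OCF_INSTANCE_NAME:-unknown}: '$1' exited with status $rc" >&2
    fi
    return "$rc"
}
```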
This is just about done, except for clusterfs.sh needing to be updated.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0149.html