Description of problem:
Trying to enable a frozen service with the -F flag (enable according to failover domain rules) returns an error and leaves the service in the 'failed [Z]' state. After that, it is not possible to disable, enable, stop, restart, or unfreeze the service.

Version-Release number of selected component (if applicable):
rgmanager-2.0.46-1.el5

How reproducible:
Always

Steps to Reproduce:
# hostname -s
clu107
# clustat | grep disc
service:disc clu107-gst started
# clusvcadm -d disc
Local machine disabling service:disc...Success
# clustat | grep disc
service:disc (clu107-gst) disabled
# clusvcadm -Z disc
Local machine freezing service:disc...Success
# clustat | grep disc
service:disc (clu107-gst) disabled [Z]
# clusvcadm -e disc
Local machine trying to enable service:disc...Failure: Service is frozen
# clustat | grep disc
service:disc (clu107-gst) disabled [Z]
# clusvcadm -e disc -F
Local machine trying to enable service:disc...Failure
# clustat | grep disc
service:disc (clu107-gst) failed [Z]
# clusvcadm -d disc
Local machine disabling service:disc...Failure: Service is frozen
# clusvcadm -e disc
Local machine trying to enable service:disc...Failure: Service is frozen
# clusvcadm -s disc
Local machine stopping service:disc...Failure: Service is frozen
# clusvcadm -R disc
Local machine trying to restart service:disc...Failure: Service is frozen
# clusvcadm -U disc
Local machine unfreezing service:disc...Temporary failure; try again
# clustat | grep disc
service:disc (clu107-gst) failed [Z]

Actual results:
The service 'disc' becomes unusable.

Expected results:
The service should return to 'disabled [Z]', and then to 'disabled'.

Additional info:
I tried to restart cman, qdiskd and rgmanager on node clu107. cman and qdiskd restarted OK, but rgmanager got stuck on start. I then tried to stop only rgmanager on the three remaining nodes, but it got stuck on stop. After a power reset (reboot was unable to restart the system), all nodes were able to start all cluster modules, the service 'disc' was no longer frozen, and everything seems to work fine.
The enable request should fail immediately (without trying to enable the service) if the service is frozen.
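The intended behaviour can be illustrated with a small model. This is only a sketch of the fail-fast idea, not rgmanager's actual code: the names `Service`, `State`, `ServiceFrozenError` and `enable` are hypothetical and chosen for illustration. The point it shows is that the frozen check comes before any state change, on both the plain enable path and the -F path, so a refused request leaves the service in the state it was in.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    STARTED = "started"
    DISABLED = "disabled"
    FAILED = "failed"


class ServiceFrozenError(Exception):
    """Raised when an operation is refused because the service is frozen."""


@dataclass
class Service:
    # Hypothetical stand-in for a cluster service record.
    name: str
    state: State = State.DISABLED
    frozen: bool = False


def enable(svc: Service, use_failover_rules: bool = False) -> None:
    """Enable a service, refusing immediately if it is frozen.

    use_failover_rules models the -F flag. In this sketch it does not
    change the outcome; it is present only to show that the frozen
    check applies to both code paths.
    """
    # Guard first: no state is touched before this check passes.
    if svc.frozen:
        raise ServiceFrozenError(f"service:{svc.name} is frozen")
    svc.state = State.STARTED
```

With this ordering, calling `enable(Service("disc", frozen=True), use_failover_rules=True)` raises `ServiceFrozenError` and the service stays `disabled`, which matches the expected result in the report.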
Created attachment 334342: fix
http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=6c86348ae517dcb8d449343ae3d69f8cf41568ef
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1339.html