Red Hat Bugzilla – Bug 253341
failure to start cluster service which had been modified for correction
Last modified: 2009-04-16 18:24:27 EDT
Description of problem:
This may be an issue with rgmanager.
I defined an NFS cluster service consisting of two GFS filesystems. I created
both GFS resources with invalid devices and then attempted to start those
services. The service failed to start, as expected, with an error about not
being able to match a device to what's in the fs resource definition.
I then corrected the mistake using conga, updated the cluster, and verified that
the .conf file was correct on all nodes. From there, I was unable to successfully
restart the service. Every time I tried, it remained in the failed state. I then
attempted to restart rgmanager by hand, and that worked. I then reloaded luci and
all was happy.
So why wasn't conga able to start that service without manual intervention? If I
had used conga's big hammer and restarted the entire cluster, it would probably
have worked, but a full cluster restart shouldn't be necessary.
Version-Release number of selected component (if applicable):
Created attachment 161775 [details]
log from luci server
Here is what luci was trying while I was experimenting with this.
To clarify, when you say you restarted rgmanager, do you mean you did a 'service
rgmanager restart' or did you start it with rgmanager (i.e., clusvcadm -e svcname)?
I restarted it with 'service rgmanager restart'.
This is a conga bug. Lon says failed services need to be explicitly disabled
before being started again. Patch to be checked in shortly.
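For reference, the manual recovery sequence Lon describes (disable the failed service explicitly, then enable it) can be sketched with clusvcadm; the service name below is a placeholder, not one from this report:

```shell
# Sketch of recovering a failed rgmanager service by hand.
# 'mynfssvc' is a hypothetical service name -- substitute your own.
clusvcadm -d mynfssvc   # explicitly disable the failed service first
clusvcadm -e mynfssvc   # then enable (start) it again
```

This is the sequence conga was expected to perform automatically; attempting `clusvcadm -e` alone on a service still in the failed state is what left it stuck.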
Fix verified in 0.10.0-5.el5.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.