Description of problem:

Edge case - creating/deleting a cluster that includes the luci node leaves ccsd caught in a loop on that node. This is probably an unsupported configuration - but can we do anything about ccsd hanging? If not, then it's a release note entry.

Version-Release number of selected component (if applicable):
luci-0.8-30.el5
ricci-0.8-30.el5

How reproducible:
100%

Steps to Reproduce:
1. Create a new, multi-node cluster - include the node where luci is being run in the cluster.
2. After the cluster is (successfully) created and all the nodes reboot, delete the cluster from the luci/cluster tab.
3. The /etc/cluster/cluster.conf file is not deleted from the luci node.
4. ccsd is caught in this loop on the luci node:

Jan 31 11:56:29 tng3-3 ccsd[2299]: Error while processing connect: Connection refused
Jan 31 11:56:34 tng3-3 ccsd[2299]: Cluster is not quorate. Refusing connection.

Actual results:
The errors listed above.

Expected results:
No errors.

Additional info:

This entry is written to the /var/lib/ricci queue on the luci node:

# more /var/lib/ricci/queue/1884935388
<?xml version="1.0"?>
<batch batch_id="1884935388" status="2">
        <module name="cluster" status="1">
                <request API_version="1.0">
                        <function_call name="stop_node">
                                <var mutable="false" name="cluster_shutdown" type="boolean" value="false"/>
                                <var mutable="false" name="purge_conf" type="boolean" value="true"/>
                        </function_call>
                </request>
        </module>
</batch>
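For triage, the stuck queue entry can be inspected programmatically. A minimal sketch, assuming only the batch file format shown above and using Python's standard xml.etree parser (this is not a ricci tool, and the meaning of the numeric status codes is not assumed here):

```python
import xml.etree.ElementTree as ET

# Sample queue entry, as found in /var/lib/ricci/queue/1884935388 above.
BATCH_XML = """<?xml version="1.0"?>
<batch batch_id="1884935388" status="2">
  <module name="cluster" status="1">
    <request API_version="1.0">
      <function_call name="stop_node">
        <var mutable="false" name="cluster_shutdown" type="boolean" value="false"/>
        <var mutable="false" name="purge_conf" type="boolean" value="true"/>
      </function_call>
    </request>
  </module>
</batch>"""

def summarize_batch(xml_text):
    """Return (batch_id, batch_status, [(function_name, {var: value})])."""
    batch = ET.fromstring(xml_text)
    calls = []
    for fc in batch.iter("function_call"):
        args = {v.get("name"): v.get("value") for v in fc.findall("var")}
        calls.append((fc.get("name"), args))
    return batch.get("batch_id"), batch.get("status"), calls

batch_id, status, calls = summarize_batch(BATCH_XML)
print(batch_id, status, calls)
```

Running this against the entry above confirms the pending call is stop_node with purge_conf set to "true", i.e. the node was asked to stop and purge its cluster.conf - which never completed on the luci node.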
Luci can be run on one of the managed nodes, though some interruption of service should be expected during node restarts. The problem described above should not be caused by the fact that luci is running on the node; it seems the problem lies somewhere else. I have just tested cluster deployment/removal while running luci on one of the nodes, and it worked as expected. I used luci-0.8-30.el5. Could you gather some more info, so we can get to the bottom of the problem?
Fixing Product Name. Cluster Suite was merged into Enterprise Linux for version 5.0.
Closing this bug after being in NEEDINFO for 6 months