Description of problem:
Tried to delete a node and add it back into the cluster. Received this error message when trying to add the node back in:

The following errors occurred:
* A Ricci error occurred on tng3-4.lab.msp.redhat.com: ccs_tool failed
* Unable to update the cluster node list for my_rh_cluster

Further investigation revealed this:

[root@tng3-3 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   X     12                        tng3-5.lab.msp.redhat.com
   2   M     12   2007-01-31 17:26:00  tng3-4.lab.msp.redhat.com
   3   M      4   2007-01-31 17:26:00  tng3-3.lab.msp.redhat.com

[root@tng3-4 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   X      8                        tng3-5.lab.msp.redhat.com
   2   M      4   2007-01-31 12:25:03  tng3-4.lab.msp.redhat.com
   3   M     12   2007-01-31 12:25:09  tng3-3.lab.msp.redhat.com

Version-Release number of selected component (if applicable):

How reproducible:
Create three-node cluster. Remove one node and add it back.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
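For anyone comparing `cman_tool nodes` listings across nodes, here is a minimal Python sketch that parses one listing and reports the nodes that are not shown as members. The names `NodeEntry`, `parse_cman_nodes` and `non_members` are hypothetical, and the parser assumes the column layout seen in this report (a row without a Joined timestamp has four fields, otherwise six). Reading `M` as "member" and any other status as "not joined" is my interpretation of the listings, not something stated in the report.

```python
from typing import List, NamedTuple, Optional


class NodeEntry(NamedTuple):
    node_id: int
    status: str
    incarnation: int
    joined: Optional[str]  # None when the Joined column is empty
    name: str


def parse_cman_nodes(text: str) -> List[NodeEntry]:
    """Parse the rows of a `cman_tool nodes` listing (layout assumed from this report)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the shell prompt, the header row, and blank lines.
        if not fields or not fields[0].isdigit():
            continue
        # Four fields: no Joined timestamp. Six fields: date and time present.
        if len(fields) not in (4, 6):
            continue
        joined = " ".join(fields[3:5]) if len(fields) == 6 else None
        entries.append(
            NodeEntry(int(fields[0]), fields[1], int(fields[2]), joined, fields[-1])
        )
    return entries


def non_members(entries: List[NodeEntry]) -> List[str]:
    """Names of nodes whose status is anything other than 'M' (assumed to mean member)."""
    return [e.name for e in entries if e.status != "M"]


# Listing copied from tng3-3 in the report above.
SAMPLE = """\
[root@tng3-3 ~]# cman_tool nodes
Node Sts Inc Joined Name
1 X 12 tng3-5.lab.msp.redhat.com
2 M 12 2007-01-31 17:26:00 tng3-4.lab.msp.redhat.com
3 M 4 2007-01-31 17:26:00 tng3-3.lab.msp.redhat.com
"""

if __name__ == "__main__":
    print(non_members(parse_cman_nodes(SAMPLE)))
```

Running the parser on the listing from each node and comparing the results makes differences such as the incarnation numbers easier to spot than reading the raw output side by side.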
Tried adding a different node. Same results.
This looks like a bug somewhere deeper in the cluster stack. Could you try to manually propagate the conf file with ccs_tool and see what the error message is?
After I remove the node, then try to add it back manually, here's what I see:

[root@huey cluster]# ccs_tool update .cluster.conf
Unable to open connection to dewey.lab.boston.redhat.com: Bad file descriptor
Failed to update config file.

[root@huey cluster]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M     12   2007-02-01 11:06:44  louey.lab.boston.redhat.com
   2   M      4   2007-02-01 11:06:44  huey.lab.boston.redhat.com
   3   X     12                        dewey.lab.boston.redhat.com

[root@dewey cluster]# pwd;ls -la /etc/cluster
total 32
drwxr-xr-x  2 root root  4096 Feb 1 11:10 .
drwxr-xr-x 91 root root 12288 Feb 1 11:07 ..
-rw-r-----  1 root root   461 Feb 1 11:07 .cluster.conf

[root@dewey cluster]# ps auxww|egrep "[a]isexec|[c]cs|[c]man|[c]lurg";service cman status
ccsd is stopped

Anyone have any insight into what's going wrong here?
Sorry. About 5 minutes after I posted here, I realized that the cluster needs a restart when going from 3 nodes to 2 or from 2 nodes to 3. That's what's causing the problem.
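As a rough illustration of that rule, a tool could check whether a membership change crosses the two-node boundary before trying to propagate the new configuration. This is only a sketch of the condition as described in this thread (from more than 2 nodes down to 2, or from 2 up to more than 2); the function name `needs_cluster_restart` is hypothetical and is not part of conga or the cluster tools.

```python
def needs_cluster_restart(old_count: int, new_count: int) -> bool:
    """Return True when a node-count change crosses the two-node boundary.

    Encodes only the rule described in this thread: going from more than
    2 nodes to exactly 2, or from exactly 2 nodes to more than 2.
    """
    if old_count < 1 or new_count < 1:
        raise ValueError("node counts must be positive")
    shrinking_to_two = old_count > 2 and new_count == 2
    growing_from_two = old_count == 2 and new_count > 2
    return shrinking_to_two or growing_from_two


if __name__ == "__main__":
    # The scenario in this report: remove one node from three, then add it back.
    for old, new in [(3, 2), (2, 3), (3, 4)]:
        print(old, "->", new, "restart needed:", needs_cluster_restart(old, new))
```

In the scenario reported here, both steps (3 to 2, then 2 back to 3) cross that boundary, which would explain why the update failed without a restart.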
Fixing Product Name. Cluster Suite was merged into Enterprise Linux for version 5.0.
Marking this modified and depending on
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240508
That fix will remove the need for any special handling on the part of conga for clusters going from > 2 nodes to 2 and from 2 to > 2.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2007-0640.html