Description of problem:
I forgot I was using the other NIC interface, and it took me a lot of debugging before I remembered to use the correct name.

[root@taft-04 ~]# fence_scsi -n taft-03
Unable to execute sg_persist (/dev/sdb1).
[root@taft-04 ~]# fence_scsi -n taft-03-e2

A message like "taft-03 doesn't exist in this cluster" would be more helpful.

Version-Release number of selected component (if applicable):
fence-1.32.50-2.fencescsi.test.patch
The easiest way to fix this is to check the nodeid after the script calls get_node_id() when generating the key. The get_node_id() routine does an XML query against cluster.conf to get the nodeid for the nodename, so if the node is not part of the cluster, the nodeid will be zero. The downside is that we can't distinguish between a missing nodeid and a nodename that doesn't exist. Is that ok? The error would simply be something like "Unable to determine nodeid for node <nodename>". Not exactly the same thing as saying "Hey! This node doesn't exist", but definitely an improvement.
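To make the proposed check concrete, here is a minimal sketch in Python. It is not the fence_scsi code itself: the sample cluster.conf content, the Python version of get_node_id(), and the helper check_node() are assumptions made up for illustration. It only shows why a zero nodeid covers both the "unknown nodename" and the "nodeid not set" cases.

```python
import xml.etree.ElementTree as ET

# Hypothetical cluster.conf fragment, for illustration only.
# taft-04-e2 deliberately has no nodeid attribute.
SAMPLE_CONF = """<cluster name="taft" config_version="1">
  <clusternodes>
    <clusternode name="taft-03-e2" nodeid="3" votes="1"/>
    <clusternode name="taft-04-e2" votes="1"/>
  </clusternodes>
</cluster>"""


def get_node_id(conf_xml, nodename):
    """Return the nodeid for nodename, or 0 if it cannot be determined."""
    root = ET.fromstring(conf_xml)
    for node in root.iter("clusternode"):
        if node.get("name") == nodename:
            # A missing or empty nodeid attribute is treated as zero.
            return int(node.get("nodeid") or "0")
    # No clusternode with that name: also zero.
    return 0


def check_node(conf_xml, nodename):
    """Return an error message if the nodeid is unusable, else None."""
    if get_node_id(conf_xml, nodename) == 0:
        return "Unable to determine nodeid for node " + nodename
    return None


if __name__ == "__main__":
    for name in ("taft-03", "taft-03-e2", "taft-04-e2"):
        print(name, "->", check_node(SAMPLE_CONF, name) or "ok")
```

In this sketch both taft-03 (not in the configuration) and taft-04-e2 (present but without a nodeid) yield the same message, which is the trade-off described above.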
Fixed in RHEL5. As mentioned above, the script now simply checks the nodeid we get from the XML query of cluster.conf. If the nodeid is zero, then either the node does not exist in this cluster or the nodeid is not set. Either case is invalid, so we report an error and exit.
Sorry, meant to say fixed in RHEL4. It is fixed in RHEL5 too, but that is a different BZ.