Description of problem:
I had ccsd running and a cman quorum on the morph-cluster.

[root@morph-03 root]# cat /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1    1    6   M   morph-01
   2    1    6   M   morph-04
   3    1    6   M   morph-02
   4    1    6   M   morph-05
   5    1    6   M   morph-03
   6    1    6   M   morph-06

I then went on morph-02, bumped the version number of the cluster.conf file, sent ccsd a SIGHUP to go propagate the new file, and it failed:

Sep 10 16:29:34 morph-02 ccsd[2272]: Failed to receive COMM_UPDATE_NOTICE_ACK from morph-06
Sep 10 16:29:34 morph-02 ccsd[2272]: Failed to update remote nodes.
Sep 10 16:29:34 morph-02 ccsd[2272]: Update failed.
Sep 10 16:29:34 morph-02 ccsd[2272]: Select failed: Interrupted system call

from morph-06:
Sep 10 16:29:36 morph-06 ccsd[2276]: Unexpected communication type... ignoring.
Sep 10 16:29:36 morph-06 ccsd[2276]: Error while responding to cluster message: Invalid argument

How reproducible:
Always

Actual Results:
The version never got bumped on the other nodes.

Expected Results:
The version should get bumped on the other nodes.
Please provide the output of the following:
> ls -l /etc/cluster/cluster.conf
The recent changes to magma should fix this. Magma was not allowing msg_send to send more than 1024 bytes. This seems to be related to the problem you are seeing. (This is how I reproduced it.)
The files do now get updated. However, I'm curious why there is always a failed select message after the update:
Select failed: Interrupted system call
Since a signal is used to do the update, the select gets interrupted. This is not really an error, so I suppose it could be checked/taken out.
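For illustration, the usual way to keep an interrupted select() from being reported as a failure is to check errno for EINTR and restart the call. The sketch below is not taken from the ccsd source; wait_readable is a hypothetical helper name, and it assumes the caller only needs to wait for a single descriptor to become readable.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper (not from ccsd): wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 on a real error.
 * A select() interrupted by a signal such as SIGHUP fails with
 * errno == EINTR; that case is retried instead of being logged.
 */
int wait_readable(int fd, const struct timeval *timeout)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv;
        struct timeval *tvp = NULL;
        int n;

        /* select() may modify both the set and the timeout, so rebuild them. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (timeout != NULL) {
            tv = *timeout;
            tvp = &tv;
        }

        n = select(fd + 1, &rfds, NULL, NULL, tvp);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;   /* interrupted by a signal: not an error, try again */
        return -1;      /* genuine failure: caller can report errno */
    }
}
```

Note that this simple version restarts with the full timeout after each interruption; a caller that needs a hard deadline would have to track the remaining time itself.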
Fix verified; there is no longer a "Select failed" message.
Updating version to the right level in the defects. Sorry for the storm.