Description of problem:
When cman starts (either via the cman service or through 'cman_tool -t 120 -w join') on a box with more than one NIC, it broadcasts on the wrong network. The computer is directly connected to a network (172.19.0.0/32) that another cluster node is also on. When parsing /etc/cluster/cluster.conf and in subsequent communication attempts, this node only attempts to contact other cluster members on its OTHER network (192.168.0.0/32). (ifconfig output, /etc/sysconfig/network-scripts/ifcfg-eth* and cluster.conf are provided as attachments.)

Computer t2 is on networks: 172.16.0.0/16 and 172.19.0.0/16.
Computer t3 is on networks: 192.168.0.0/16 and 172.19.0.0/16.
DNS resolution is complete for both forward and reverse look-ups on both t2 and t3.

Version-Release number of selected component (if applicable):
GFS-kernel-smp 2.6.11.8-20050601.152643.FC4.25
cman-kernel-smp 2.6.11.5-20050601.152643.FC4.23
dlm-kernel-smp 2.6.11.5-20050601.152643.FC4.22
gnbd-kernel-smp 2.6.11.2-20050420.133124.FC4.58
kernel-smp 2.6.16-1.2069_FC4
kernel-smp-devel 2.6.16-1.2069_FC4

How reproducible:
Always (from the same side of the cluster, t3)

Steps to Reproduce:
1. Start cman

Actual results:
cman reports quorum, but only after timing out past its 120 second limit. Tcpdump reports broadcast messages on port 6809 to 192.168.255.255 (rather than on its configured cluster.conf network).

Expected results:
Broadcast messages on port 6809 to 172.19.255.255

Additional info:
I can provide strace info if required...
Created attachment 127312: attachments as described in Description (ifcfg-eth*, cluster.conf, route, etc.)
In the original description, the networks labeled 172.19.0.0/32 and 192.168.0.0/32 should both have been /16s, not /32s.
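As a quick sanity check of the prefix correction, the broadcast addresses implied by the /16 networks can be computed with Python's standard `ipaddress` module. This is only a sketch for verifying the arithmetic; the helper name `broadcast_of` is made up for illustration, and the networks are the ones quoted in the description.

```python
import ipaddress


def broadcast_of(cidr: str) -> str:
    """Return the broadcast address of a network given in CIDR notation."""
    return str(ipaddress.ip_network(cidr).broadcast_address)


# The two networks t3 is attached to, with the corrected /16 prefix.
cluster_bcast = broadcast_of("172.19.0.0/16")   # network named in cluster.conf
other_bcast = broadcast_of("192.168.0.0/16")    # t3's other network

print("cluster network broadcast:", cluster_bcast)
print("other network broadcast:  ", other_bcast)
```

With a /32 prefix the "network" would be a single host with no separate broadcast address, which is why the /16 reading is the one consistent with the 192.168.255.255 and 172.19.255.255 addresses in the report.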
Can you tell me what t3.m_clust & t2.m_clust resolve to? It's those names that cman will use when it determines which interface to bind to.
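To make the question concrete, here is a rough Python sketch of the kind of check involved: resolve a node name, then see which local interface's network contains the resulting address. It is an illustration only, not cman's actual code. The interface names and host addresses below are hypothetical placeholders (the real values are in the attached ifconfig output), and `interface_for` and `resolve` are names invented for this example.

```python
import ipaddress
import socket


def resolve(name):
    """Resolve a host name to an IPv4 address string, or None on failure."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None


def interface_for(address, interfaces):
    """Return (interface name, broadcast address) for the first interface
    whose network contains `address`, or None if no interface matches."""
    addr = ipaddress.ip_address(address)
    for name, cidr in interfaces.items():
        net = ipaddress.ip_interface(cidr).network
        if addr in net:
            return name, str(net.broadcast_address)
    return None


# Hypothetical layout for t3: interface names and host parts are assumptions,
# only the two /16 networks come from the report.
t3_interfaces = {
    "eth0": "192.168.0.3/16",
    "eth1": "172.19.0.3/16",
}

# If the node name resolves to a 172.19.x.x address, the cluster network wins;
# if it resolves to a 192.168.x.x address, the other network is selected.
on_cluster_net = interface_for("172.19.0.3", t3_interfaces)
on_other_net = interface_for("192.168.0.3", t3_interfaces)
print(on_cluster_net, on_other_net)
```

On the affected machine, running `resolve("t3.m_clust")` and passing the result to `interface_for` would show which network that name actually points at.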
Is this still a problem, or shall I close this bug?