Description of problem:
I have been trying to set up a cluster of 3 on 6.1 using a Cisco switch, and therefore a fixed multicast address (239.192.15.224 in this case). All the docs say to add the following to cluster.conf:

<cman> <multicast addr="239.192.15.224"/> </cman>

This seems to work, and "cman_tool status" reports the correct multicast address, but it shows a Quorum status of "Activity Blocked" because the cluster nodes never join each other and clusters of 1 form instead. *However*, if I manually run "cman_tool leave" and then "cman_tool join -m 239.192.15.224", the cluster forms.

Version-Release number of selected component (if applicable):
6.1

How reproducible:
Every time.

Steps to Reproduce:
1. Set a manual multicast address.
2. service cman restart - will time out waiting for quorum
3. On 2+ nodes:
   * cman_tool leave
   * cman_tool join -m <addr>

Actual results:
Cluster does not form; cman times out waiting for quorum.

Expected results:
Should form a cluster without manual intervention.

Additional info:
Please provide the full cluster.conf and collect /var/log/cluster from all nodes when the failure appears. This looks very similar to bug 720100, which has already been fixed in the 6.1 updates. Also check that you are running the latest updates.
Agreed, this looks like the same issue as 720100. However, I am running: cman-3.0.12-41.el6.x86_64 And looking at the bug for 720100, it looks like the fix should have been rolled in here? If you let me know what the ttl change in cluster.conf should be, I can test and let you know if it is a dupe? Thanks
I have changed my cluster.conf to have <multicast ... ttl="255"/>, and verified that it is the same issue as 720100.
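For anyone following along, a minimal sketch of what the changed cman section might look like, assuming the ttl attribute sits on the same multicast element as the address. The address is the one from the description above; the enclosing cluster element and the node definitions are omitted, and 255 is simply the value tested here, not a recommendation.

```xml
<!-- Fragment of cluster.conf only; the rest of the file is left out. -->
<cman>
  <!-- addr: fixed multicast address from the original report -->
  <!-- ttl: value used for this test (assumed placement on this element) -->
  <multicast addr="239.192.15.224" ttl="255"/>
</cman>
```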
(In reply to comment #3)
> Agreed, this looks like the same issue as 720100.
>
> However, I am running:
>
> cman-3.0.12-41.el6.x86_64
>
> And looking at the bug for 720100, it looks like the fix should have been
> rolled in here?

cluster-3.0.12-41.el6_1.1 is the one that has the fix. Your version does not.

> If you let me know what the ttl change in cluster.conf should be, I can test
> and let you know if it is a dupe?

You can cross-check the package changelogs too. If this is not the problem, I will still need your logs and full cluster.conf.
(In reply to comment #4)
> I have changed my cluster.conf to have <multicast ... ttl="255"/>, and verified
> that it is the same issue as 720100.

OK, sorry, your comment arrived while I was writing mine. Your packages are not updated. See my previous comment.

*** This bug has been marked as a duplicate of bug 720100 ***