Bug 494803
Summary: | On a two node cluster, cman status shows different outputs, then clvmd hangs. | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Alex Urbanowicz <aurbanowicz> | ||||
Component: | cman | Assignee: | Christine Caulfield <ccaulfie> | ||||
Status: | CLOSED DUPLICATE | QA Contact: | Cluster QE <mspqa-list> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 4 | CC: | cluster-maint, edamato | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Last Closed: | 2009-04-08 15:02:56 UTC | Type: | --- | ||||
Attachments: |
|
Can you try the patch mentioned in bz#487397 please?

(In reply to comment #1)
> Can you try the patch mentioned in bz#487397 please?

The bug comes and goes periodically, but the patched cman package referenced in 487397 seems to make it go away permanently. Thank you very much!

I'm pleased that helped. I'll close this bug now.

*** This bug has been marked as a duplicate of bug 487397 ***
Created attachment 338659
/var/log/messages excerpt from the desynchronized nodes blade301 and blade302 (concatenated).

Description of problem:

I have a two-node cluster with the following config:

    <?xml version="1.0"?>
    <cluster config_version="5" name="gfs-project-mysql">
      <fence_daemon post_fail_delay="0" post_join_delay="33"/>
      <clusternodes>
        <clusternode name="blade301-cluster" nodeid="1" votes="1">
          <fence>
            <method name="1">
              <device name="rsysrq" nodename="blade301-cluster" password="x" port="9" operation="1bbbb"/>
            </method>
            <method name="2">
              <device name="manual" nodename="blade301-cluster"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="blade302-cluster" nodeid="2" votes="1">
          <fence>
            <method name="1">
              <device name="rsysrq" nodename="blade302-cluster" password="x" port="9" operation="1bbbb"/>
            </method>
            <method name="2">
              <device name="manual" nodename="blade302-cluster"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <cman expected_votes="1" two_node="1"/>
      <fencedevices>
        <fencedevice agent="fence_rsysrq" name="rsysrq"/>
        <fencedevice agent="fence_manual" name="manual"/>
      </fencedevices>
      <rm>
        <failoverdomains/>
        <resources/>
      </rm>
    </cluster>

After being started by hand, the cluster synchronizes properly, with clvmd and gfs running on both nodes.
After restarting one of the nodes, cman_tool nodes displays different information on each node:

    [root@blade301 alex]# date
    Wed Apr 8 09:30:37 CEST 2009
    [root@blade301 alex]# cman_tool nodes
    Node  Sts   Inc   Joined               Name
       1   M    372   2009-04-07 15:48:23  blade301-cluster
       2   M    392   2009-04-07 16:29:33  blade302-cluster
    [root@blade301 alex]# cman_tool status
    Version: 6.1.0
    Config Version: 5
    Cluster Name: gfs-orange-mysql
    Cluster Id: 18
    Cluster Member: Yes
    Cluster Generation: 392
    Membership state: Cluster-Member
    Nodes: 2
    Expected votes: 1
    Total votes: 2
    Quorum: 1
    Active subsystems: 7
    Flags: 2node Dirty
    Ports Bound: 0
    Node name: blade301-cluster
    Node ID: 1
    Multicast addresses: 239.192.0.18
    Node addresses: 10.100.216.16

    [root@blade302 alex]# date
    Wed Apr 8 09:30:42 CEST 2009
    [root@blade302 alex]# cman_tool nodes
    Node  Sts   Inc   Joined               Name
       1   X      0                        blade301-cluster
       2   M    388   2009-04-07 16:29:34  blade302-cluster
    [root@blade302 alex]# cman_tool status
    Version: 6.1.0
    Config Version: 5
    Cluster Name: gfs-orange-mysql
    Cluster Id: 18
    Cluster Member: Yes
    Cluster Generation: 392
    Membership state: Cluster-Member
    Nodes: 1
    Expected votes: 1
    Total votes: 1
    Quorum: 1
    Active subsystems: 8
    Flags: 2node Dirty
    Ports Bound: 0 11
    Node name: blade302-cluster
    Node ID: 2
    Multicast addresses: 239.192.0.18
    Node addresses: 10.100.216.17

When the cluster is in this state, any CLVM-related operation hangs on both nodes and the gfs volume cannot be used. The attached logs give no indication of this state. The logs and the outputs above were taken after the blade301 node was rebooted.

Version-Release number of selected component (if applicable):
cman-2.0.98-1.el5

How reproducible:

Steps to Reproduce:
1. Set up a cluster using the above config, with cman, clvmd and gfs running, rgmanager not running, and iSCSI storage as the backend.
2. Start the cluster.
3. If the cluster synchronizes properly, reboot one of the nodes.

Actual results:

Expected results:

Additional info:
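
The disagreement between the two nodes can be spotted mechanically by comparing the `cman_tool nodes` output captured on each of them. The following is only a minimal sketch: the helper names `parse_nodes` and `membership_conflicts` are hypothetical and not part of cman, the column layout (Node, Sts, Inc, Joined, Name) is assumed from the output quoted in this report, and the sample text is copied from that output rather than collected live.

```python
# Sketch: compare the membership view reported by `cman_tool nodes` on two nodes.
# Assumption: data rows start with a numeric node ID followed by the status column,
# as in the output quoted in this report.

def parse_nodes(output):
    """Return {node_id: status} parsed from `cman_tool nodes` text."""
    members = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip the header row and blank lines; data rows begin with a numeric ID.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        members[int(fields[0])] = fields[1]
    return members


def membership_conflicts(view_a, view_b):
    """Return {node_id: (status_a, status_b)} for nodes the two views disagree on."""
    return {
        node_id: (view_a.get(node_id), view_b.get(node_id))
        for node_id in sorted(set(view_a) | set(view_b))
        if view_a.get(node_id) != view_b.get(node_id)
    }


# Output as seen on blade301 (copied from the report).
BLADE301 = """\
Node  Sts   Inc   Joined               Name
   1   M    372   2009-04-07 15:48:23  blade301-cluster
   2   M    392   2009-04-07 16:29:33  blade302-cluster
"""

# Output as seen on blade302 (copied from the report).
BLADE302 = """\
Node  Sts   Inc   Joined               Name
   1   X      0                        blade301-cluster
   2   M    388   2009-04-07 16:29:34  blade302-cluster
"""

conflicts = membership_conflicts(parse_nodes(BLADE301), parse_nodes(BLADE302))
print(conflicts)  # {1: ('M', 'X')}
```

For the outputs above, the only conflict is node 1: blade301 lists itself as a member (M) while blade302 lists it as not a member (X). The sketch compares only the status column; the Inc values also differ between the two nodes but are not checked here.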