Bug 426842

Summary: Ethernet Channel Bonding Not working in Cluster Suite
Product: [Retired] Red Hat Cluster Suite
Reporter: Balaji.S <balajisundar>
Component: cman
Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Joshua Wulf <jwulf>
Severity: urgent
Priority: low
Version: 4
CC: ccaulfie, cluster-maint, lcarlon, lhh
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-09-11 08:06:09 UTC
Attachments:
- Text file with Ethernet Channel Bonding configuration details and other details
- My cluster configuration file

Description Balaji.S 2007-12-27 08:16:34 UTC
Description of problem:
Before configuring Ethernet channel bonding, the cluster services are active on
the primary node, the other node acts as the passive node, and the member status
is Online for both cluster nodes.

After configuring Ethernet channel bonding, the cluster services are active on
both nodes; each node reports its own member status as Online and the other
node as Offline.

Version-Release number of selected component (if applicable):
rhel-4-u3-rhcs-i386

How reproducible:
1. Configure the cluster.
2. Reboot the system; the cluster becomes active on the primary node, the other
node is passive, and the member status is Online for both cluster nodes.
3. Configure Ethernet channel bonding.
4. Reboot the system; the cluster services are now active on both nodes, and
each node reports itself as Online and the other node as Offline.

Steps to Reproduce:
1. I followed the RHEL Cluster Suite configuration document "rh-cs-en-4.pdf".
2. I configured Ethernet channel bonding on each cluster node to avoid a
network single point of failure (see the sketch after this list).
3. I rebooted the system for the changes to take effect. After the Ethernet
channel bonding configuration, the cluster services are active on both nodes,
and each node reports its own member status as Online and the other node as
Offline.
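
For reference, the bonding setup that document describes follows this general
pattern on RHEL 4 (a sketch only; the mode, device names, and address below are
illustrative rather than my exact values, which are in the attached text file):

  # /etc/modprobe.conf -- load the bonding driver for bond0
  alias bond0 bonding
  options bonding miimon=100 mode=1    # mode=1 is active-backup failover

  # /etc/sysconfig/network-scripts/ifcfg-bond0 -- the bonded interface
  DEVICE=bond0
  IPADDR=192.168.13.110
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none

  # /etc/sysconfig/network-scripts/ifcfg-eth0 -- and similarly ifcfg-eth1
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none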


I have attached a text file with this report. It contains the Ethernet channel
bonding configuration details and other details.

Actual results:
The cluster services are active on both cluster nodes; each node reports its
own member status as Online and the other node as Offline.

Expected results:
The cluster is active on the primary node and passive on the secondary node,
and the member status of both nodes is Online.


Additional info:
I am not sure why this is happening. Can someone shed light on this?

Comment 1 Balaji.S 2007-12-27 08:16:34 UTC
Created attachment 290444 [details]
The following text file contains Ethernet Channel Bonding Configuration Details and Other Details

Comment 4 Lon Hohberger 2008-01-08 22:05:27 UTC
Balaji,

There could be a couple of causes - right now we think this might be a
documentation issue; i.e. the instructions are wrong or unclear.  

We're reviewing the documentation to see if there are any errors, and will let
you know if we find any.

Comment 5 Balaji.S 2008-02-29 07:25:57 UTC
(In reply to comment #4)
> Balaji,
> 
> There could be a couple of causes - right now we think this might be a
> documentation issue; i.e. the instructions are wrong or unclear.  
> 
> We're reviewing the documentation to see if there are any errors, and will let
> you know if we find any.

Sir,

Can you help me solve the above problem? It is very urgent. Alternatively,
please send me any other configuration details for Ethernet channel bonding.

Regards,
-S.Balaji

Comment 11 Lon Hohberger 2008-04-07 12:55:02 UTC
The documentation looks correct, and your configuration looks correct.

I'm not sure what would cause it to fail in your tests - as I understand it,
without channel bonding, everything works, but with channel bonding, we end up
with two sub-clusters (for some reason, both become quorate).

Note -- more configuration information here:

https://www.redhat.com/archives/linux-cluster/2008-February/msg00349.html



Comment 12 Balaji.S 2008-04-08 03:54:33 UTC
Dear All,
 I have referred to the configuration details at
https://www.redhat.com/archives/linux-cluster/2008-February/msg00349.html
as suggested. That configuration and query were in fact posted by me, but no
solution was given there.

 I need clarification on whether Ethernet channel bonding will work with a
fence device or without a fence device.

Regards,
-S.Balaji

Comment 13 Lon Hohberger 2008-04-08 15:12:41 UTC
Ethernet bonding is supposed to be transparent to applications (this includes
cman and fencing), so your configuration *should* work with fencing.  It looks
like your bonded interface is correctly configured, which is why we're having
trouble identifying the problem.
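
One quick sanity check (a suggestion; this assumes the bonded device is named
bond0) is the bonding driver's status file, which reports the bonding mode, the
currently active slave, and the link state of each slave:

  cat /proc/net/bonding/bond0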

Attaching your cluster.conf to the bugzilla would be helpful.  



Comment 14 Balaji.S 2008-04-09 04:51:04 UTC
Created attachment 301748 [details]
My Cluster Configuration File

Comment 15 Christine Caulfield 2008-04-09 08:05:38 UTC
The configuration file looks OK, but it's hard to be sure without knowing what
IP addresses the names resolve to. I do recommend that you use fully-qualified
host names in cluster.conf as this can avoid some ambiguities.

Can you please attach the output of the following commands to the bugzilla?

 cman_tool status
 cman_tool nodes

It's also worth checking that the IP addresses shown in "cman_tool status"
match those that you are expecting.
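
For example (a sketch; the host names are the ones that appear elsewhere in
this bug), you can compare what the cluster node names resolve to against the
address actually assigned to the bonded interface:

  getent hosts corviewprimary corviewsecondary   # what the names resolve to
  ip addr show bond0                             # the address cman should be using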

It might also be worthwhile doing a tcpdump of the communications between the two
nodes (please do this on both nodes so we can see if/where the communication is
missing). The command is

  tcpdump -xs0 -w cman-tcpdump.dmp port 6809

You might also like to read this document which discusses how cman uses the
network, and how to do some diagnostics for yourself:

http://people.redhat.com/ccaulfie/docs/CSNetworking.pdf


Comment 16 Lon Hohberger 2008-04-09 15:13:48 UTC
What's odd is that the node(s) became quorate and reportedly started services in
a two_node="1" cluster without any fencing.  (Looked at cluster.conf.)

(see comment #1)
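
For context, a two-node RHCS cluster of this kind is typically declared along
these lines in cluster.conf (a minimal sketch using names from this bug; the
fence and service sections are elided because they aren't known here):

  <cluster name="EMSCluster" config_version="106">
    <cman two_node="1" expected_votes="1"/>
    <clusternodes>
      <clusternode name="corviewprimary" votes="1"/>
      <clusternode name="corviewsecondary" votes="1"/>
    </clusternodes>
    <fencedevices/>
    <rm/>
  </cluster>

With two_node="1", either node alone is allowed to be quorate; fencing is what
normally prevents both halves of a split cluster from running services at the
same time.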

Comment 17 Balaji.S 2008-11-06 09:48:11 UTC
Dear All,
  For your verification, the cman_tool status and cman_tool nodes outputs
before and after configuring Ethernet channel bonding in the RHEL Cluster Suite
are added below.

Before Bonding:
corviewprimary: cman_tool status
Protocol version: 5.0.1
Config version: 106
Cluster name: EMSCluster
Cluster ID: 11444
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 4
Node name: corviewprimary
Node ID: 1
Node addresses: 192.168.13.110

corviewsecondary: cman_tool status
Protocol version: 5.0.1
Config version: 106
Cluster name: EMSCluster
Cluster ID: 11444
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 4
Node name: corviewsecondary
Node ID: 2
Node addresses: 192.168.13.179


After Bonding:
corviewprimary: cman_tool status
Protocol version: 5.0.1
Config version: 106
Cluster name: EMSCluster
Cluster ID: 11444
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 1
Total_votes: 1
Quorum: 1
Active subsystems: 4
Node name: corviewprimary
Node ID: 1
Node addresses: 192.168.13.110


corviewsecondary: cman_tool status
Protocol version: 5.0.1
Config version: 106
Cluster name: EMSCluster
Cluster ID: 11444
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 1
Total_votes: 1
Quorum: 1
Active subsystems: 4
Node name: corviewsecondary
Node ID: 1
Node addresses: 192.168.13.179

cman_tool nodes at primary after bonding:
Node  Votes Exp Sts  Name
   1    1    1   M   corviewprimary

cman_tool nodes at secondary after bonding:
Node  Votes Exp Sts  Name
   1    1    1   M   corviewsecondary
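
For comparison, when cluster communication is working, cman_tool nodes on
either node should list both members, as in the "Before Bonding" state above
(a reconstruction, not captured output):

Node  Votes Exp Sts  Name
   1    1    1   M   corviewprimary
   2    1    1   M   corviewsecondary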

Regards
-S.Balaji

Comment 18 Christine Caulfield 2008-11-06 10:24:54 UTC
Thank you.

So that proves that cman is using the correct IP addresses for bonding, and it's now very clear that traffic is not flowing between the two systems - at least not cman traffic. Have you checked that you can ping and (if appropriate) ssh between the two systems with bonding enabled?

If you have iptables set up, then it is worth reviewing those rules to make sure they apply to the bonded interfaces.

If all this looks OK, then I'm a little stumped. It could be that, for some reason, only multicast traffic is not being passed correctly over the bonded interfaces. 'ip maddr list' might be able to shed some light on this.
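
For example (a sketch; this assumes the bonded device is named bond0), on each
node you could list the multicast groups joined on the bond and watch for
cluster traffic arriving on it:

  ip maddr list dev bond0                             # multicast groups joined on the bond
  tcpdump -ni bond0 ip multicast and udp port 6809    # watch for cman traffic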

Comment 20 Christine Caulfield 2009-09-11 08:06:09 UTC
Nothing has happened on this bug for 10 months now, so I'm going to close it. If there is any more information or a reproducible recurrence, then feel free to re-open it.