Bug 110775 - need to modify docs for bonding driver to ensure physical interfaces have driver modules loaded first
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: rhel-rg
Version: 3.0
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Andrius Benokraitis
Reported: 2003-11-24 10:37 EST by Neil Horman
Modified: 2015-07-14 00:24 EDT
Doc Type: Bug Fix
Last Closed: 2004-10-11 12:29:24 EDT
Description Neil Horman 2003-11-24 10:37:10 EST
Description of problem:
Issue Tracker 29775 - If the bonding driver enslaves a device which does
not yet have a driver module loaded (i.e., the referenced interface does
not exist), then the bonding driver may fail to tx/rx data.  As we cannot
determine what physical drivers need to be loaded before the bonding
interface is brought up, the user documentation should indicate that
the user configuring the system should modify modules.dep/modules.conf
to ensure that the appropriate physical drivers are loaded before the
bonding driver loads.

Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to Reproduce:
1. Ensure that the bonding interface is brought up before the drivers for
a slave interface are installed.
2. Examine /var/log/messages for failure notices.

Actual results:


Expected results:


Additional info:
Comment 1 Neil Horman 2003-11-24 11:26:20 EST
I was thinking of this as an addition somewhere in section 8.2 of the
RHEL3 reference guide.
Comment 3 Andrius Benokraitis 2003-12-12 11:31:51 EST
Neil, I see where you are going with this, but I'm still not sure I
fully understand you. Either this is a "duh" or maybe I'm not
following you. Correct me if I'm wrong but you won't see any ethernet
devices until the physical drivers are loaded. Are you saying that you
could see an eth0 or eth1 without the right *.o files? You would think
an admin would test the traffic on each device individually before
setting up the bonding interface. Please clarify.
Comment 4 Neil Horman 2003-12-12 13:28:57 EST
The problem arises from an environment in which the bonded
interface is brought up before any of the physical interfaces are
brought up.  Consider a system with the following files in
/etc/sysconfig/network-scripts:
ifcfg-bond0
ifcfg-eth0
ifcfg-eth1

If bond0 has interfaces eth1 and eth0 enslaved to it, and eth1 and
eth0 are configured to not be brought up first, then bond0 won't work
properly, because the drivers for eth0 and eth1 may not be loaded yet.
There is a missing dependency between the bonding driver and whatever
driver is used to create eth0 and eth1.
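
The condition described above (a slave named in the bond whose driver has
not yet created the interface) could be checked with something like the
following. This is only a sketch: the function names and the sample text
are made up for illustration, and it assumes the kernel lists every
registered interface in /proc/net/dev, one "name:" row per interface.

```python
def present_interfaces(proc_net_dev_text):
    """Return the interface names found in /proc/net/dev-style text."""
    names = set()
    for line in proc_net_dev_text.splitlines():
        # Data rows look like "  eth0: 1234 ..."; the two header rows
        # use "|" separators and contain no colon.
        if ":" not in line:
            continue
        names.add(line.split(":", 1)[0].strip())
    return names


def missing_slaves(proc_net_dev_text, slaves):
    """Return the slaves (in the given order) that have no interface yet."""
    present = present_interfaces(proc_net_dev_text)
    return [name for name in slaves if name not in present]


# Hypothetical sample: only lo and eth0 exist, so the driver that would
# create eth1 has presumably not been loaded.
SAMPLE = """\
Inter-|   Receive                  |  Transmit
 face |bytes    packets errs drop  |bytes    packets errs drop
    lo:    1024      10    0    0      1024      10    0    0
  eth0:    2048      20    0    0      4096      30    0    0
"""

print(missing_slaves(SAMPLE, ["eth0", "eth1"]))
```

On a live system the text would come from reading /proc/net/dev before
bond0 is brought up; a non-empty result means at least one slave does not
exist yet.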
Comment 5 Andrius Benokraitis 2003-12-12 16:44:01 EST
So in a nutshell: in /etc/sysconfig/network-scripts, ifcfg-bond0 has to
be listed last to load for bonding to work? What would be the
step-by-step solution to making sure the problem does not occur?
Comment 6 Neil Horman 2003-12-15 08:11:09 EST
It's not so much that the bonding driver needs to be brought up last, but
rather that the modules driving the slave interfaces need to be
dependencies of the bonding driver in modules.conf.  I'll do some
testing and come up with a good set of instructions for you.
Comment 7 Andrius Benokraitis 2003-12-16 14:43:03 EST
thank you very much Neil -- please post your solution here so that I
can add it to the errata page.
Comment 8 Larry Troan 2004-01-15 22:04:27 EST
FROM ISSUE TRACKER
Event posted 12-18-2003 04:13pm by brian.b with duration of 0.00   
The bonding driver needs to be loaded first so SNMP reports 
information correctly... Below note from bonding.txt. 

If running SNMP agents, the bonding driver should be loaded before any network
drivers participating in a bond. This requirement is due to the interface
index (ipAdEntIfIndex) being associated to the first interface found with a
given IP address. That is, there is only one ipAdEntIfIndex for each IP
address. For example, if eth0 and eth1 are slaves of bond0 and the driver for
eth0 is loaded before the bonding driver, the interface for the IP address
will be associated with the eth0 interface. This configuration is shown below,
the IP address 192.168.1.1 has an interface index of 2 which indexes to eth0
in the ifDescr table (ifDescr.2).

    interfaces.ifTable.ifEntry.ifDescr.1 = lo
    interfaces.ifTable.ifEntry.ifDescr.2 = eth0
    interfaces.ifTable.ifEntry.ifDescr.3 = eth1
    interfaces.ifTable.ifEntry.ifDescr.4 = eth2
    interfaces.ifTable.ifEntry.ifDescr.5 = eth3
    interfaces.ifTable.ifEntry.ifDescr.6 = bond0
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 5
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1

This problem is avoided by loading the bonding driver before any network
drivers participating in a bond. Below is an example of loading the bonding
driver first, the IP address 192.168.1.1 is correctly associated with
ifDescr.2.

    interfaces.ifTable.ifEntry.ifDescr.1 = lo
    interfaces.ifTable.ifEntry.ifDescr.2 = bond0
    interfaces.ifTable.ifEntry.ifDescr.3 = eth0
    interfaces.ifTable.ifEntry.ifDescr.4 = eth1
    interfaces.ifTable.ifEntry.ifDescr.5 = eth2
    interfaces.ifTable.ifEntry.ifDescr.6 = eth3
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 6
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5
    ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1

While some distributions may not report the interface name in ifDescr,
the association between the IP address and IfIndex remains and SNMP
functions such as Interface_Scan_Next will report that association.  
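
The two listings in the note can be compared mechanically by joining
ipAdEntIfIndex back to ifDescr. The sketch below is illustrative only: the
function name is hypothetical, and it assumes the agent output has already
been captured as text in the "OID = value" form quoted above.

```python
def ip_to_interface(snmp_text):
    """Map each IP address to the interface name its ifIndex points at."""
    descr = {}      # ifIndex -> interface name, from ifDescr.N rows
    ip_index = {}   # IP address -> ifIndex, from ipAdEntIfIndex.A.B.C.D rows
    for line in snmp_text.splitlines():
        if "=" not in line:
            continue
        oid, value = (part.strip() for part in line.split("=", 1))
        if ".ifDescr." in oid:
            descr[int(oid.rsplit(".", 1)[1])] = value
        elif ".ipAdEntIfIndex." in oid:
            ip_index[oid.split(".ipAdEntIfIndex.", 1)[1]] = int(value)
    return {ip: descr.get(index) for ip, index in ip_index.items()}


# Excerpt of the first listing: eth0's driver loaded before bonding.
NIC_DRIVER_FIRST = """\
interfaces.ifTable.ifEntry.ifDescr.2 = eth0
interfaces.ifTable.ifEntry.ifDescr.6 = bond0
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
"""

# Excerpt of the second listing: bonding driver loaded first.
BONDING_FIRST = """\
interfaces.ifTable.ifEntry.ifDescr.2 = bond0
interfaces.ifTable.ifEntry.ifDescr.3 = eth0
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
"""

print(ip_to_interface(NIC_DRIVER_FIRST))
print(ip_to_interface(BONDING_FIRST))
```

With the first excerpt 192.168.1.1 resolves to eth0; with the second it
resolves to bond0, which is the association the note describes as correct.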

Event posted 01-08-2004 11:49am by rlandry with duration of 0.00
Sounds like a simple change to modules.conf would avoid this issue.  
I'll investigate specifics and post a follow up.

rlandry assigned to issue for Sustaining Engineering.  

-------------------------------------------
Event posted 01-09-2004 03:26pm by rlandry with duration of 0.00   
Assuming the heart of the issue is that the eth modules must be loaded
before bond, so that bond can configure the interfaces when it attempts
to (it doesn't use the standard ifup scripts to do so), does adding...

add probeall bond0 if0[...ifN]  (replace ifX with interfaces associated to bond0)

...or...

add below bond0 if0
add below bond0 [...ifN]

resolve the issue?  
Comment 9 Andrius Benokraitis 2004-01-19 16:13:26 EST
Before resolving, can you please tell me what files the end user would
have to modify to have bonding work? The above does not provide
sufficient detail to document for end users. I am assuming
the above is source code in various files?

Thank you!
Comment 10 ginnie nuckles 2004-02-05 11:46:50 EST
I have a bond0 defined as being eth2 and eth3, both e1000 devices.
However, my primary ethernet network connection is via eth0.
If I define the bond this way and boot my machine, I lose all
connectivity to my eth0 connection and my devices do not bond
together.

If I remove the options parameter, i.e.

options bond0 mode=1 miimon=100 primary=eth2

from my modules.conf and boot my system, then I can connect via my
eth0 connection. Not only that, but I can then add the options
parameter back into modules.conf and bring up my bond0
with /sbin/ifconfig, and now I have all my networks connected and
working. What is going on here? P.S. I also tried the probeall
parameter stated in bonding.txt but it doesn't make any difference. My
system logs all show NO problems; see below. Thanks.

Feb  5 08:34:34 admindev1 kernel: FDC 0 is a National Semiconductor PC87306
Feb  5 08:34:34 admindev1 kernel: tg3.c:v2.3 (November 5, 2003)
Feb  5 08:34:34 admindev1 kernel: eth0: Tigon3 [partno(NA) rev 1002 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0b:cd:ee:67:8d
Feb  5 08:34:34 admindev1 kernel: eth1: Tigon3 [partno(NA) rev 1002 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0b:cd:ee:67:8c
Feb  5 08:34:34 admindev1 kernel: Intel(R) PRO/1000 Network Driver - version 5.2.16a
Feb  5 08:34:34 admindev1 kernel: Copyright (c) 1999-2003 Intel Corporation.
Feb  5 08:34:34 admindev1 kernel: eth2: Intel(R) PRO/1000 Network Connection
Feb  5 08:34:34 admindev1 kernel: eth3: Intel(R) PRO/1000 Network Connection
Feb  5 08:34:34 admindev1 kernel: ip_tables: (C) 2000-2002 Netfilter core team
Feb  5 08:34:34 admindev1 kernel: bonding.c:v1.0.4f-1 (November 23, 2003)
Feb  5 08:34:34 admindev1 kernel: bond0 registered with MII link monitoring set to 100 ms, in fault-tolerance (active-backup) mode.
Feb  5 08:34:34 admindev1 kernel: bond0 registered without ARP monitoring
Feb  5 08:34:34 admindev1 kernel: Intel(R) PRO/1000 Network Driver - version 5.2.16a
Feb  5 08:34:34 admindev1 kernel: Copyright (c) 1999-2003 Intel Corporation.
Feb  5 08:34:34 admindev1 kernel: eth0: Intel(R) PRO/1000 Network Connection
Feb  5 08:34:34 admindev1 kernel: eth1: Intel(R) PRO/1000 Network Connection
Feb  5 08:34:34 admindev1 kernel: ip_tables: (C) 2000-2002 Netfilter core team
Feb  5 08:34:34 admindev1 kernel: ip_tables: (C) 2000-2002 Netfilter core team
Feb  5 08:34:34 admindev1 kernel: e1000: eth0 NIC Link is Up 100 Mbps Full Duplex

Comment 11 Andrius Benokraitis 2004-02-10 15:00:29 EST
Neil, please verify the comments above and confirm what documentation is
needed for the *specific* and most accurate/direct process to configure
bonding, i.e., which files are used, such as /etc/modules.conf and/or
/etc/sysconfig/network-scripts/*, and what changes need to be made to
the specific files.

Thanks!
Comment 12 Larry Troan 2004-03-19 09:05:22 EST
FROM ISSUE TRACKER...
Event posted 02-19-2004 02:32pm by rlandry  	
I'm at a loss as well.  My post from 2/10 shows what I did to resolve
what I understood to be the issue.  If it is in some way out of context
or solving the wrong problem, I'll need to know how;
otherwise it looks like "not a redhat problem" to me.  Proper config
of modules.conf allows one to go from no nic modules to functioning
bonding.

----------------------------
Event posted 02-19-2004 04:50pm by ltroan	
brian.b assigned to issue for HP-ProLiant.
Status set to: Waiting on Client

----------
Event posted 03-04-2004 09:25pm by ltroan 	
Brian, waiting for HP to respond to 2/19 posting above. If there is
nothing else to do, please close this Issue Tracker.
   
-------------
Event posted 03-12-2004 01:18pm by brian.b	
I was running with No Firewall and was not using DHCP (IP addresses
were hard coded), but the significant item here is "one could probably
allow bond0 on boot; however the ifup would occur before pcmcia was
started so it would always fail"  I believe that this is the real
point that the bond0 and the other modules come back properly on a
reboot.  This appears to be restating the problem.  Why would a
customer set up a team and then turn it off for reboot?

Status set to: Waiting on Tech

-----------------------------
Event posted 03-15-2004 04:18pm by brian.b 	
DHCP is sometimes used on a bonded interface, see below posts. A
bonded DHCP interface has been successfully tested in the past -
possibly with a competitor's install, but the config has been tested
and did work at that time.


-----Original Message-----
From: bonding-devel-admin@lists.sourceforge.net
[mailto:bonding-devel-admin@lists.sourceforge.net] On Behalf Of List User
Sent: Monday, November 24, 2003 1:42 AM
To: bonding-devel@lists.sourceforge.net
Subject: [Bonding-devel] question about bonding module

Sorry if this is a wrong list, please advise where to send such questions.
RedHat 9, kernel version: 2.4.20.80
I am bonding eth1 and eth2 (machine has 3 NICs). Each NIC gets IP address
via DHCP. Bond0 is configured for 10.0.0.1. DHCP server gives IPs from
192.168.222.0 network.

1. When in the slave mode, neither eth1 nor eth2 gets IP from the DHCP
server (I don't think DHCP request is even sent)
2. When interface is NOT in the slave mode, DHCP works fine.
why?
thanks



-----Original Message-----
From: bonding-devel-admin@lists.sourceforge.net
[mailto:bonding-devel-admin@lists.sourceforge.net] On Behalf Of Jay
Vosburgh
Sent: Tuesday, November 25, 2003 1:25 PM
To: List User
Cc: bonding-devel@lists.sourceforge.net
Subject: Re: [Bonding-devel] question about bonding module



>RedHat 9, kernel version: 2.4.20.80
>I am bonding etc1 and eth2 (machine has 3 NICs). Each NIC gets IP =
>address via DHCP. Bond0 is configured for 10.0.0.1. DHCP server
>givesIPs from 192.168.222.0 network.
>
>1. When in the slave mode, neither eth1 nor eth2 gets IP from the DHCP
>server (I don't think DHCP request is even send) 2. When interface is
>NOT in the slave mode, DHCP works fine. why?

Because you've configured eth1 and eth2 as slaves.  In this situation,
they get their IP address information from the master (the details
depend upon which bonding mode you use).  Slave interfaces have no
network identity separate from the master, so having them obtain IP
addresses from DHCP is a meaningless thing to do.

Perhaps what you want is for the master device, bond0, to get its
address from DHCP?

-J

---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
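
Following Jay's explanation, a configuration check could flag slave
interfaces that still ask for a DHCP lease of their own. This is a sketch
under assumptions: it relies on the SLAVE=, MASTER= and BOOTPROTO= keys
used by Red Hat ifcfg files, which are not shown anywhere in this thread,
and the function names and sample file contents are invented for the
example.

```python
def parse_ifcfg(text):
    """Parse KEY=value lines from ifcfg-style text into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def slaves_requesting_dhcp(configs):
    """Return names of config files that set both SLAVE=yes and BOOTPROTO=dhcp."""
    flagged = []
    for name, text in sorted(configs.items()):
        cfg = parse_ifcfg(text)
        is_slave = cfg.get("SLAVE", "").lower() == "yes"
        wants_dhcp = cfg.get("BOOTPROTO", "").lower() == "dhcp"
        if is_slave and wants_dhcp:
            flagged.append(name)
    return flagged


# Hypothetical file contents, keyed by file name.
CONFIGS = {
    "ifcfg-bond0": "DEVICE=bond0\nBOOTPROTO=dhcp\nONBOOT=yes\n",
    "ifcfg-eth1": "DEVICE=eth1\nMASTER=bond0\nSLAVE=yes\nBOOTPROTO=dhcp\n",
    "ifcfg-eth2": "DEVICE=eth2\nMASTER=bond0\nSLAVE=yes\nBOOTPROTO=none\n",
}

print(slaves_requesting_dhcp(CONFIGS))
```

Here only ifcfg-eth1 is reported: bond0 may request an address by DHCP as
the master, while eth2 is a slave that does not request one.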
Comment 13 Larry Troan 2004-03-19 09:07:19 EST
Neil, did the above provide all the information Andrius requested?
Comment 14 Andrius Benokraitis 2004-03-19 11:00:21 EST
All, please visit:
http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/ref-guide/s1-modules-ethernet.html

Scroll to the bottom (Section A.3.2.1) and please tell me exactly what
needs to be where if it is indeed a Red Hat problem. All previous
posts are great technical information but I cannot identify what parts
are most important to be included in the Reference Guide.

Also, is this issue an error in the Guide or just added functionality
that can wait till the RHEL4-rgs?
Comment 15 Andrius Benokraitis 2004-10-11 12:29:24 EDT
After discussing with Neil, the document is good AS IS with the new 2.6
kernel. Closing as CURRENTRELEASE with his changes for RHEL4.
