Bug 613828

Summary: bond0 only works in promisc mode
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.5
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: urgent
Priority: urgent
Reporter: jghobrial
Assignee: Andy Gospodarek <agospoda>
QA Contact: Network QE <network-qe>
CC: agospoda, anton, cdupuis, clusterman, cww, dhoward, dwu, gbarros, GR-Linux-NIC-Dev, herrold, hjia, imusayev, ivan.borghetti, jentrena, jpirko, jwest, jwilson, peterm, rajesh.borundia, redhat.com, syeghiay, tao, tcamuso, vincew
Keywords: Reopened, ZStream
Target Milestone: rc
Doc Type: Bug Fix
Last Closed: 2010-11-02 15:21:39 UTC

Description jghobrial 2010-07-12 21:53:02 UTC
Description of problem:
Round-robin (rr) bonding fails to allow the network interfaces to function properly unless promisc mode is enabled on the bond0 interface. I have eth0 with a routable IP, and eth1, eth2, and eth3 are slaves to bond0, which also has a routable IP. If I boot the machine with bond0 up, I cannot reach the machine, nor can the machine talk to the network.

If I start the machine without bond0, then eth0 works properly. If I then run: ifconfig bond0 promisc && ifup bond0, the interface works as expected.

Version-Release number of selected component (if applicable):

How reproducible:
Bond 3 interfaces using the standard RHEL 5 syntax and boot with bond0 enabled: no network traffic flows into or out of the machine. Leave bond0 down when booting and then run ifconfig bond0 promisc && ifup bond0: the bond works.
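
For reference, the standard RHEL 5 bonding syntax referred to above looks roughly like this (a sketch only; device names, the IP address, and the bonding options are assumptions, not the reporter's actual files):

/etc/modprobe.conf:
alias bond0 bonding

/etc/sysconfig/network-scripts/ifcfg-bond0:
DEVICE=bond0
BOOTPROTO=none
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BONDING_OPTS="mode=0 miimon=100"

/etc/sysconfig/network-scripts/ifcfg-eth1 (and similarly for eth2 and eth3):
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none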

Steps to Reproduce:
1.
2.
3.
  
Actual results:
No network traffic if bond0 is started at boot without promisc mode. There is network traffic if bond0 is started from rc.local using: ifconfig bond0 promisc && ifup bond0
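
For reference, the rc.local workaround described above would look something like this (a sketch; it assumes bond0 is left down at boot, e.g. ONBOOT=no in ifcfg-bond0):

# tail -1 /etc/rc.d/rc.local
ifconfig bond0 promisc && ifup bond0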

Expected results:
Network traffic.

Additional info:

Comment 1 jghobrial 2010-07-14 20:10:47 UTC
My real issue is one of the NICs had no link and was contributing to this problem.

Too bad there are no explicit error messages when a bond is started while a slave's link is unavailable; the machine ended up with no network connectivity even though the bond was not running properly.

Comment 2 jghobrial 2010-07-16 14:50:39 UTC
After further investigation, I can confirm that rebooting with the bond started at boot using the normal Red Hat startup methods causes the network to not work at all. I'll do some further debugging.

Comment 3 Andy Gospodarek 2010-07-16 16:06:27 UTC
This definitely seems odd.  I've seen RR-mode bonding work just fine, so I know it is not totally broken.  With a mix of possible bonding and routing issues involved, this might be a bit tricky to debug.

Are the two IP addresses used (one for eth0 and one for bond0) in the same subnet?  Are they in the same broadcast domain?  What about the hosts trying to connect?  Where are they?

The key will be to first make sure that bonding is working properly.  This will be best accomplished by taking down eth0 and only using bond0 with a host on the same network.  Is there any chance that your NICs do not support configuration of their MAC address, so bond0 only works correctly when the slaves are told to receive all traffic rather than traffic destined for the bond0 interface's MAC address?
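
One way to check that (a sketch; the interface names are assumptions) is to compare each slave's permanent MAC address with the address it is actually using while enslaved.  In round-robin mode every slave should carry bond0's MAC, so a slave that still shows its permanent address suggests the driver ignored the set-MAC request:

# cat /proc/net/bonding/bond0   (shows the "Permanent HW addr" of each slave)
# ip link show bond0            (shows the MAC every slave should be using)
# ip link show eth1             (shows the address currently programmed on the slave)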

Once you can confirm this is working, you can bring up eth0 and take a look at some of the sysctl options that control ARP, as it can be problematic if eth0 and bond0 are on the same broadcast domain.

Things like:

arp_filter - BOOLEAN
        1 - Allows you to have multiple network interfaces on the same
        subnet, and have the ARPs for each interface be answered
        based on whether or not the kernel would route a packet from
        the ARP'd IP out that interface (therefore you must use source
        based routing for this to work). In other words it allows control
        of which cards (usually 1) will respond to an arp request.

        0 - (default) The kernel can respond to arp requests with addresses
        from other interfaces. This may seem wrong but it usually makes
        sense, because it increases the chance of successful communication.
        IP addresses are owned by the complete host on Linux, not by
        particular interfaces. Only for more complex setups like load-
        balancing, does this behaviour cause problems.

        arp_filter for the interface will be enabled if at least one of
        conf/{all,interface}/arp_filter is set to TRUE,
        it will be disabled otherwise

arp_announce - INTEGER
        Define different restriction levels for announcing the local
        source IP address from IP packets in ARP requests sent on
        interface:
        0 - (default) Use any local address, configured on any interface
        1 - Try to avoid local addresses that are not in the target's
        subnet for this interface. This mode is useful when target
        hosts reachable via this interface require the source IP
        address in ARP requests to be part of their logical network
        configured on the receiving interface. When we generate the
        request we will check all our subnets that include the
        target IP and will preserve the source address if it is from
        such subnet. If there is no such subnet we select source
        address according to the rules for level 2.
        2 - Always use the best local address for this target.
        In this mode we ignore the source address in the IP packet
        and try to select local address that we prefer for talks with
        the target host. Such local address is selected by looking
        for primary IP addresses on all our subnets on the outgoing
        interface that include the target IP address. If no suitable
        local address is found we select the first local address
        we have on the outgoing interface or on all other interfaces,
        with the hope we will receive reply for our request and
        even sometimes no matter the source IP address we announce.

        The max value from conf/{all,interface}/arp_announce is used.

        Increasing the restriction level gives more chance for
        receiving answer from the resolved target while decreasing
        the level announces more valid sender's information.

arp_ignore - INTEGER
        Define different modes for sending replies in response to
        received ARP requests that resolve local target IP addresses:
        0 - (default): reply for any local target IP address, configured
        on any interface
        1 - reply only if the target IP address is local address
        configured on the incoming interface
        2 - reply only if the target IP address is local address
        configured on the incoming interface and both with the
        sender's IP address are part from same subnet on this interface
        3 - do not reply for local addresses configured with scope host,
        only resolutions for global and link addresses are replied
        4-7 - reserved
        8 - do not reply for all local addresses

        The max value from conf/{all,interface}/arp_ignore is used
        when ARP request is received on the {interface}

I set the following in my /etc/sysctl.conf on all hosts with more than one network interface:

net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.all.arp_ignore = 1

and taking interfaces up and down does not impact anything.
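
The same settings can also be applied to a running system without a reboot:

# sysctl -w net.ipv4.conf.all.arp_filter=1
# sysctl -w net.ipv4.conf.all.arp_ignore=1
# sysctl -p     (reload everything from /etc/sysctl.conf)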

I also do not know what drivers and NICs are being used or even the kernel version. That bit of info would be helpful.

Comment 4 jghobrial 2010-07-16 17:27:47 UTC
RHEL 5.5
Linux 2.6.18-194.8.1.el5 #1 SMP Wed Jun 23 10:52:51 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Please note this is my custom modprobe.conf order, as the original install had the netxen_nic NICs before the igb NICs. Also, I've modified the MAC addresses in the ifcfg-eth? files to get them reordered for my sanity.

alias eth0 igb
alias eth1 igb
alias eth2 netxen_nic
alias eth3 netxen_nic
alias eth4 netxen_nic
alias eth5 netxen_nic
alias eth6 netxen_nic
alias eth7 netxen_nic
alias eth8 netxen_nic
alias eth9 netxen_nic

eth0 and bond0 are on the same subnet and connected to the same switch

eth1, eth2, eth3 are slaves to bond0

eth6, eth7, eth8, and eth9 are connected to a different switch, each with their own non-routable addresses. In this case they are being used for AoE (ATA over Ethernet) purposes.

> Is there any chance that your NICs do not support
> configuration of their MAC address, so bond0 only works correctly when the
> slaves are told to receive all traffic rather than traffic destined for the
> bond0 interface's MAC address?

How would I know if they don't support this?

lspci | grep Ethernet
04:00.0 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.1 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.2 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.3 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
05:00.0 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
05:00.1 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
05:00.2 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
05:00.3 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
07:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
07:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

I'll try the arp options and see what happens.

Comment 5 Andy Gospodarek 2010-07-20 20:18:17 UTC
Ah, NetXen NICs.  This might explain the problem.

Older versions of the netxen driver didn't handle the MAC address being set, and I wonder if there are still some lingering issues.  The two upstream commits that I wanted to make sure were included in RHEL were:

commit 5d09e534bbb94e1fdc8e4783ac822bc172465a91
Author: Narender Kumar <narender.kumar>
Date:   Fri Nov 20 22:08:57 2009 +0000

    netxen : fix BOND_MODE_TLB/ALB mode.

commit 3d0a3cc9d72047e4baa76021c897f64fc84cc543
Author: Dhananjay Phadke <dhananjay>
Date:   Tue May 5 19:05:08 2009 +0000

    netxen: fix bonding support

but both appear to be included and applied in RHEL 5.5.

Comment 6 Andy Gospodarek 2010-07-20 21:45:39 UTC
I just tried this on some netxen hardware we have and mode 0 bonding worked just fine.

[root@hp-dl580g7-01 network-scripts]# lspci -s 0000:04:00.0 -n 
04:00.0 0200: 4040:0100 (rev 42)
[root@hp-dl580g7-01 network-scripts]# lspci -vv -s 0000:04:00.0 
04:00.0 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
	Subsystem: Hewlett-Packard Company NC375i Integrated Quad Port Multifunction Gigabit Server Adapter
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 154
	Region 0: Memory at d0000000 (64-bit, non-prefetchable) [size=2M]
	Region 4: Memory at d2000000 (64-bit, non-prefetchable) [size=32M]
	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=64
		Vector table: BAR=0 offset=00090000
		PBA: BAR=0 offset=00090800
	Capabilities: [80] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a0] Message Signalled Interrupts: 64bit+ Queue=0/5 Enable-
		Address: 0000000000000000  Data: 0000
	Capabilities: [c0] Express Endpoint IRQ 0
		Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
		Device: Latency L0s <64ns, L1 <1us
		Device: AtnBtn- AtnInd- PwrInd-
		Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
		Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
		Device: MaxPayload 128 bytes, MaxReadReq 256 bytes
		Link: Supported Speed unknown, Width x8, ASPM L0s, Port 0
		Link: Latency L0s <64ns, L1 <1us
		Link: ASPM L0s Enabled RCB 64 bytes CommClk- ExtSynch-
		Link: Speed unknown, Width x8
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 75-73-48-6e-61-46-69-59

[root@hp-dl580g7-01 network-scripts]# ethtool -i eth0
driver: netxen_nic
version: 4.0.65
firmware-version: 4.0.520
bus-info: 0000:04:00.0
[root@hp-dl580g7-01 network-scripts]# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: d8:d3:85:62:c2:54

Slave Interface: eth2
MII Status: up
Link Failure Count: 0
Permanent HW addr: d8:d3:85:62:c2:56
[root@hp-dl580g7-01 network-scripts]# more ifcfg-bond0
DEVICE=bond0
BOOTPROTO=dhcp
ONBOOT=yes
BONDING_OPTS="mode=0 miimon=1000"
[root@hp-dl580g7-01 network-scripts]# more /etc/modprobe.conf 
alias eth0 netxen_nic
alias eth1 netxen_nic
alias eth2 netxen_nic
alias eth3 netxen_nic
alias scsi_hostadapter cciss
alias scsi_hostadapter1 ata_piix

alias bond0 bonding


Can you cut and paste the output of:

# ethtool -i eth0

Maybe your card needs new firmware?

Comment 7 Julio Entrena Perez 2010-07-21 10:14:19 UTC
Hi Andy,

Output from 'ethtool -i eth4':

driver: netxen_nic
version: 4.0.65
firmware-version: 4.0.520
bus-info: 0000:04:00.0

Firmware is the same version as yours.

FYI, the customer doesn't experience the issue on all their netxen-equipped servers, only on some of them. But they are experiencing the issue on a DL580 G7, the same server model you tried (they are using active-backup mode 1 bonding, though).

They are experiencing the issue even with only one NIC in the bond:

$ cat proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth4
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth4
MII Status: up
Link Failure Count: 0
Permanent HW addr: d8:d3:85:62:43:70

$ cat etc/sysconfig/network-scripts/ifcfg-bond0 
#
# bond0 interface configuration file
#
DEVICE=bond0
IPADDR=x.xxx.xxx.xx (customer information, hidden).
NETMASK=255.255.255.0
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="miimon=100 mode=1"

$ cat etc/modprobe.conf 
alias eth0 bnx2
alias eth1 bnx2
alias eth2 bnx2
alias eth3 bnx2
alias eth4 netxen_nic
alias eth5 netxen_nic
alias eth6 netxen_nic
alias eth7 netxen_nic
alias scsi_hostadapter cciss
alias scsi_hostadapter1 ata_piix
alias scsi_hostadapter2 qla2xxx
alias scsi_hostadapter3 usb-storage
options ipv6 disable=1
# configuration updates during build process
alias bond0 bonding
# alias bond1 bonding
# disable ipv6
alias net-pf-10 off

lspci entry:

04:00.0 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.1 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.2 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.3 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
...
04:00.0 0200: 4040:0100 (rev 42)
        Subsystem: 103c:705a
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 170
        Region 0: Memory at a0200000 (64-bit, non-prefetchable) [size=2M]
        Region 4: Memory at a2000000 (64-bit, non-prefetchable) [size=32M]
        Capabilities: [40] MSI-X: Enable+ Mask- TabSize=64
                Vector table: BAR=0 offset=00090000
                PBA: BAR=0 offset=00090800
        Capabilities: [80] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a0] Message Signalled Interrupts: 64bit+ Queue=0/5 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [c0] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s <64ns, L1 <1us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 128 bytes, MaxReadReq 4096 bytes
                Link: Supported Speed unknown, Width x8, ASPM L0s, Port 0
                Link: Latency L0s <64ns, L1 <1us
                Link: ASPM L0s Enabled RCB 64 bytes CommClk- ExtSynch-
                Link: Speed unknown, Width x8
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 75-73-48-6e-61-46-69-59

Comment 8 Andy Gospodarek 2010-07-21 13:35:12 UTC
I also tried active-backup and it did not have any problems.

What does the dmidecode information look like?  Is it different on servers that work compared to those that do not?

# dmidecode 2.10
SMBIOS 2.6 present.
308 structures occupying 8294 bytes.
Table at 0xBF7FD000.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
	Vendor: Hewlett-Packard
	Version: P65
	Release Date: 02/09/2010
	Address: 0xF0000
	Runtime Size: 64 kB
	ROM Size: 8192 kB
	Characteristics:
		PCI is supported
		PNP is supported
		BIOS is upgradeable
		BIOS shadowing is allowed
		ESCD support is available
		Boot from CD is supported
		Selectable boot is supported
		EDD is supported
		5.25"/360 kB floppy services are supported (int 13h)
		5.25"/1.2 MB floppy services are supported (int 13h)
		3.5"/720 kB floppy services are supported (int 13h)
		Print screen service is supported (int 5h)
		8042 keyboard services are supported (int 9h)
		Serial services are supported (int 14h)
		Printer services are supported (int 17h)
		CGA/mono video services are supported (int 10h)
		ACPI is supported
		USB legacy is supported
		BIOS boot specification is supported
		Function key-initiated network boot is supported
		Targeted content distribution is supported
	Firmware Revision: 1.5

If the BIOS versions are the same on all of them, I would suggest they check their switch configuration.  As odd as that seems, I have seen quite a few bonding cases resolved by fixing switch configuration issues.  Bonding can be quite sensitive sometimes.

Comment 9 Julio Entrena Perez 2010-07-21 14:44:07 UTC
Unfortunately, the server that doesn't experience the issue is a completely different model (DL380 G6) using different NetXen cards (10-Gigabit ones instead of Gigabit).

This is the dmidecode of the DL580 G7 that has the issue:

# dmidecode 2.10
SMBIOS 2.6 present.
308 structures occupying 8320 bytes.
Table at 0x7F7FD000.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: Hewlett-Packard
        Version: P65
        Release Date: 05/07/2010
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 8192 kB
        Characteristics:
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                ESCD support is available
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Function key-initiated network boot is supported
                Targeted content distribution is supported
        Firmware Revision: 1.5

I'm going to ask them to upgrade the server's BIOS and try again, but according to HP the new one only adds support for the latest Xeon processors... Let's try.

Comment 10 Andy Gospodarek 2010-07-21 19:15:45 UTC
I'm not sure a BIOS update is the best thing, as I'm running an *OLDER* BIOS than they are.  Right now I'm short on suggestions other than double-and-triple checking the switches.  Though that seems unlikely to be the cause given that there are 3 reports of this on RHEL 5.5, it might be something to consider.

I've also heard reports that performing a 'service network restart' will make bonding work.  If that is the case for anyone I would also encourage them to try a simple:

# ifconfig bond0 down ; ifconfig bond0 up

and

# ifdown bond0 ; ifup bond0

and see if the device functions properly after that.  Reports that either one of those do or do not work will help narrow down the area of code where we can look for problems.

Comment 11 Julio Entrena Perez 2010-07-22 15:29:35 UTC
When the customer replaces the RHEL-provided GPL netxen_nic driver with the commercial QLogic/HP nx_nic one, bonding works like a charm.

The customer has tried several bonding configurations: none that involves a NetXen NIC works with the netxen_nic driver, and they all work with the nx_nic one.

Comment 12 Andy Gospodarek 2010-07-22 15:39:03 UTC
(In reply to comment #11)
> When the customer replaces the RHEL-provided GPL netxen_nic driver with the
> commercial QLogic/HP nx_nic one, bonding works like a charm.
> 
> The customer has tried several bonding configurations: none that involves a
> NetXen NIC works with the netxen_nic driver, and they all work with the nx_nic one.

Did they even try this from comment #10?

I've also heard reports that performing a 'service network restart' will make
bonding work.  If that is the case for anyone I would also encourage them to
try a simple:

# ifconfig bond0 down ; ifconfig bond0 up

and

# ifdown bond0 ; ifup bond0

Depending on which one of those procedures works, I may be able to come up with a way to make the GPL driver work.  Otherwise we are at the mercy of HP to post those changes upstream so they can be used in our driver as well.

Comment 13 Julio Entrena Perez 2010-08-03 16:31:33 UTC
Andy, I managed to set up a reproducer.
hp-dl580g7-01.lab.bos.redhat.com has been set up with eth0 as the only member of a simple mode 1 bond. eth0 is the only NIC connected on that box, but that setup is enough to reproduce the issue.

When the system boots up there's no network connectivity. A simple /etc/init.d/network restart restores the network connectivity.

Comment 14 Julio Entrena Perez 2010-08-03 16:33:32 UTC
I forgot to mention that you'll need to use console access for logging into the server.

Comment 15 Andy Gospodarek 2010-08-03 16:51:22 UTC
(In reply to comment #13)
> Andy, I managed to setup a reproducer.
> hp-dl580g7-01.lab.bos.redhat.com has been setup with eth0 as the only member of
> a simple, mode 1 bonding. eth0 is the only nic connected on that box, but that
> setup is enough to reproduce the issue.
> 
> When the system boots up there's no network connectivity. A simple
> /etc/init.d/network restart restores the network connectivity.    

OK, I will take a look right now.

Comment 16 Andy Gospodarek 2010-08-03 17:33:19 UTC
This is interesting.  I'm quite sure I used DHCP for my test, but using a static IP this fails as described.  More relevant info:

Entering the commands:

# ifconfig eth0 down && ifconfig eth0 up

puts the interface in a state where it will start receiving frames.  My guess is that NAPI is involved here and not all MSI-X queues are started right away.  I'm guessing that initscripts or dhclient does an ifdown/ifup at some point, and that is why I didn't see this there.

As an aside:

I also do not have to reboot to reproduce the failure.  A simple:

# service network stop && rmmod netxen_nic && service network start

will also reproduce the problem once the NIC is working again.

Comment 17 Andy Gospodarek 2010-08-03 17:40:00 UTC
After some code inspection and after looking at ethtool stats before and after sending some ping floods:

[root@hp-dl580g7-01 ~]# ethtool -S eth0
NIC statistics:
     xmit_called: 100
     xmit_finished: 100
     rx_dropped: 0
     tx_dropped: 0
     csummed: 1
     rx_pkts: 3
     lro_pkts: 0
     rx_bytes: 231
     tx_bytes: 8910
[root@hp-dl580g7-01 ~]# ping -f 10.16.47.254 
PING 10.16.47.254 (10.16.47.254) 56(84) bytes of data.
..............................................
--- 10.16.47.254 ping statistics ---
46 packets transmitted, 0 received, 100% packet loss, time 830ms

[root@hp-dl580g7-01 ~]# ethtool -S eth0
NIC statistics:
     xmit_called: 103
     xmit_finished: 103
     rx_dropped: 0
     tx_dropped: 0
     csummed: 1
     rx_pkts: 3
     lro_pkts: 0
     rx_bytes: 231
     tx_bytes: 9036

it appears this is probably not a NAPI issue and more likely an issue with the hardware initialization on probe + open vs probe + open + close + open.

Comment 18 Andy Gospodarek 2010-08-03 17:47:22 UTC
Here is the configuration on the box that could reproduce the bonding failure on a fresh boot or the first time after the module was loaded.

[root@hp-dl580g7-01 ~]# more /etc/sysconfig/network-scripts/ifcfg-bond0 
# NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter
DEVICE=bond0
BOOTPROTO=none
IPADDR=10.16.42.162
NETMASK=255.255.248.0
USERCTL=no
ONBOOT=yes
BONDING_OPTS="miimon=100 mode=1"
[root@hp-dl580g7-01 ~]# more /etc/modprobe.conf 
alias eth0 netxen_nic
alias eth1 netxen_nic
alias eth2 netxen_nic
alias eth3 netxen_nic
alias scsi_hostadapter cciss
alias scsi_hostadapter1 ata_piix
alias bond0 bonding

Interestingly, when bonding is removed from the configuration, the system works just fine.  Something done in the bonding init process must be the cause of the device failure.
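
For comparison, the non-bonded configuration that works would look roughly like this (a sketch that reuses the address from the ifcfg-bond0 above; the exact file was not posted):

/etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
BOOTPROTO=none
IPADDR=10.16.42.162
NETMASK=255.255.248.0
ONBOOT=yes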

Comment 19 Andy Gospodarek 2010-08-03 17:58:40 UTC
I verified that MSI-X has nothing to do with this issue by booting with pci=nomsi and still seeing the failure.
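
For anyone repeating that test, pci=nomsi goes on the kernel command line, e.g. in /boot/grub/grub.conf (a sketch; the kernel, initrd, and root device names depend on the installed system):

title Red Hat Enterprise Linux Server (2.6.18-194.8.1.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.8.1.el5 ro root=/dev/VolGroup00/LogVol00 pci=nomsi
        initrd /initrd-2.6.18-194.8.1.el5.img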

Comment 21 Marvell Linux NIC Driver 2010-08-16 17:49:38 UTC
We have a newer netxen_nic driver (4.0.73) that fixes a similar issue. Would you like to give it a try?

Comment 22 Andy Gospodarek 2010-08-16 18:49:21 UTC
Ameen, is there a specific patch from upstream that may have resolved this?

If so, feel free to post just the full SHA1 object name or a URL to the commit in Linus' tree at http://git.kernel.org/linus and we can take a look at it.

Comment 24 Mark Wu 2010-08-23 07:17:58 UTC
Andy,
     Another customer has this issue, and our GPS on-site engineer verified that it worked well with a kernel patched with commit d49c9640975355c79f346869831bf9780d185de0. They also found that running "ifdown bond0 ; ifup bond0" manually after boot made the network work.
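
For anyone unsure whether an installed kernel already carries a netxen fix, the RPM changelog is a quick check (a sketch; query whichever kernel package is actually installed):

# rpm -q --changelog kernel-$(uname -r) | grep -i netxen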

Comment 25 Andy Gospodarek 2010-08-23 15:46:19 UTC
Thanks, Ameen, for the patch, and Mark for the positive test feedback.  I was quite sure this was somehow related to the multicast list in the hardware (the usual culprit when promisc mode works but standard mode does not).

I will work to see if I can get this added to the next RHEL5 update.
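
For anyone hitting a similar symptom, it is worth looking at what addresses the kernel thinks are programmed on the slave; when traffic only flows in promiscuous mode, the unicast/multicast filters in the hardware are the first thing to suspect (a sketch; the interface name is an assumption):

# ip link show eth0            (current MAC and flags such as PROMISC)
# ip maddr show dev eth0       (multicast addresses the kernel has requested)
# cat /proc/net/dev_mcast      (the same multicast list via the older interface)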

Comment 26 Andy Gospodarek 2010-08-23 15:52:15 UTC
Looks like this patch was included in the RHEL5.6 update planned for bug 562937.

Closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 562937 ***

Comment 27 Marvell Linux NIC Driver 2010-08-24 21:48:31 UTC
Andy, Thanks for the update.

Comment 33 Greg A 2010-10-27 22:01:59 UTC
Does this have to wait until RHEL 5.6?  We had the exact issue this weekend when trying to configure bonding on a DL580 G7.  After a few lost hours we found this bug.  We ended up having to add additional network cards to the server to meet our objective over the weekend. Seeing that this was identified in July and a known fix was available in August does not make me a happy customer; particularly when you tell me I have to wait until at least January of 2011.

Comment 34 Andy Gospodarek 2010-10-28 00:45:03 UTC
(In reply to comment #33)
> Does this have to wait until RHEL 5.6?  We had the exact issue this weekend
> when trying to configure bonding on a DL580 G7.  After a few lost hours we
> found this bug.  We ended up having to add additional network cards to the
> server to meet our objective over the weekend. Seeing that this was identified
> in July and a known fix was available in August does not make me a happy
> customer; particularly when you tell me I have to wait until at least January
> of 2011.

Greg, the wheels are already in motion to try to have this resolved in a 5.5 errata kernel before 5.6 ships.

Comment 38 ivan borghetti 2011-01-10 17:50:25 UTC
I'm having the same issue with a ProLiant DL585 G7 and Red Hat Enterprise Linux 5.5, kernel 2.6.18-194.26.1.el5 #1 SMP. The only ways it works are restarting the network service after boot, unloading and reloading the bonding module, or running ifdown bond0 ; ifup bond0.

Comment 39 Andy Gospodarek 2011-01-11 20:34:39 UTC
Ivan, if you are using a netxen-based card, this should be fixed in 2.6.18-194.27.1.el5 and in the kernel that ships with RHEL 5.6.  Please let us know if those kernels or later ones do not resolve the issue.

Comment 40 ilya m. 2011-11-23 18:17:49 UTC
I hate to be the one who brings the bad news, but this issue is back on RHEL 5.7, specifically the latest kernel, 2.6.18-274.7.1.el5.

My setup is as follows:

I have 2 x DL585 G7 with 4 NetXen NICs each.

eth0 and eth1 = bond0 (mode=1 miimon=100)
eth2 and eth3 = bond1 (mode=1 miimon=100)

eth0/eth1 is connected to a juniper switch and functions normally
eth2/eth3 on host1 are connected to eth2/eth3 on host2 (crossover cable).

With this setup, bonding is not stable and exhibits the same issues that are noted in this BZ.

ifdown bond1 && ifup bond1 addresses the issue sometimes, but not always.

Comment 41 Andy Gospodarek 2011-11-23 19:47:07 UTC
(In reply to comment #40)
> I hate to be the one who brings the bad news, but this issue is back on
> RHEL 5.7, specifically the latest kernel, 2.6.18-274.7.1.el5.
> 
> My setup is as follows:
> 
> I have 2 x DL585 G7 with 4 NetXen NICs each.
> 
> eth0 and eth1 = bond0 (mode=1 miimon=100)
> eth2 and eth3 = bond1 (mode=1 miimon=100)
> 
> eth0/eth1 is connected to a juniper switch and functions normally
> eth2/eth3 on host1 are connected to eth2/eth3 on host2 (crossover cable).
> 
> With this setup, bonding is not stable and exhibits the same issues that are
> noted in this BZ.
> 
> ifdown bond1 && ifup bond1 addresses the issue sometimes, but not always.

Are you saying that bonding only works in promisc mode, or not at all?