Description of problem:
Due to configuration changes on HP Virtual Connect, the NIC links sometimes drop for a few seconds. This is not desirable, but the RHEL 6.0 systems in this blade enclosure come back online without problems. This specific RHEL 5.6 system crashed with a kernel BUG and was rebooted by HP ASR because it hung after the crash.

Version-Release number of selected component (if applicable):

How reproducible:
Hopefully not, because last time I 'only' had to restart the networking service via the iLO console to enable networking again. This is the first time it actually crashed the system.

Expected results:
Links going down and coming up again are detected without losing network connectivity in the end, and without a complete system failure.

Additional info:
Manufacturer: HP
Product Name: ProLiant BL460c G7
SKU Number: 603718-B21
Family: ProLiant

]# /var/log/messages
Jan 31 13:49:00 scomp1101 kernel: bonding: bond0: link status definitely up for interface eth1.
Jan 31 13:49:00 scomp1101 kernel: ----------- [cut here ] --------- [please bite here ] ---------
Jan 31 13:49:00 scomp1101 kernel: Kernel BUG at drivers/net/bonding/bonding.h:135

]# cat /etc/modprobe.conf
alias eth0 be2net
alias eth1 be2net
alias eth2 be2net
alias eth3 be2net
alias eth4 be2net
alias eth5 be2net
alias eth6 be2net
alias eth7 be2net
alias bond0 bonding
options bond0 miimon=100 mode=active-backup primary=eth0

# /etc/rc.local
# Increasing The Transmit Queue Length from 1000 to 10000
for iFace in `ifconfig | grep eth | cut -f 1 -d" "` ; do
    ifconfig $iFace txqueuelen 10000
done
unset iFace

# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr D4:85:64:57:0B:08
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2378 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1363 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10000
          RX bytes:210611 (205.6 KiB)  TX bytes:516464 (504.3 KiB)

]# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
        Advertised auto-negotiation: No
        Speed: 5000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes

]# lsb_release -a
LSB Version:    :core-4.0-amd64:core-4.0-ia32:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-ia32:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-ia32:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 5.6 (Tikanga)
Release:        5.6
Codename:       Tikanga

]# uname -a
Linux scomp1101.wurnet.nl 2.6.18-238.1.1.el5 #1 SMP Tue Jan 4 13:32:19 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

]# modinfo bonding
filename:       /lib/modules/2.6.18-238.1.1.el5/kernel/drivers/net/bonding/bonding.ko
author:         Thomas Davis, tadavis and many others
description:    Ethernet Channel Bonding Driver, v3.4.0-1
version:        3.4.0-1
license:        GPL
srcversion:     956FDE3FEBDD81E105B7727
depends:        ipv6
vermagic:       2.6.18-238.1.1.el5 SMP mod_unload gcc-4.1

]# modinfo be2net
filename:       /lib/modules/2.6.18-238.1.1.el5/kernel/drivers/net/benet/be2net.ko
license:        GPL
author:         ServerEngines Corporation
description:    ServerEngines BladeEngine 10Gbps NIC Driver 2.102.518r
version:        2.102.518r
srcversion:     76890C397EB8D93CCC6B539

]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.4.0-1 (October 7, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth0 (primary_reselect always)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: d4:85:64:57:0b:08

Slave Interface: eth1
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: d4:85:64:57:0b:0c

Slave Interface: eth2
MII Status: down
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: d4:85:64:57:0b:09

Interfaces eth2...eth7 are all down, as configured.
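
For anyone watching for the same link flaps, the per-slave state in /proc/net/bonding/bond0 can be condensed to one line per interface. The following is only a minimal sketch: the function name bond_slave_status is a hypothetical helper, not part of the bonding tools, and it assumes the status file uses the "Slave Interface:" / "MII Status:" layout seen in this report.

```shell
#!/bin/sh
# Hypothetical helper: print "<slave> <mii-status>" for each slave listed in
# a bonding status file. Assumes the "Slave Interface:" / "MII Status:" layout
# of the v3.4.0-1 bonding driver output quoted in this report.
# Usage: bond_slave_status [file]   (default: /proc/net/bonding/bond0)
bond_slave_status() {
    awk '
        # Remember which slave the following lines belong to.
        /^Slave Interface:/ { slave = $3 }
        # The first "MII Status:" line describes the bond itself and is
        # skipped because no slave has been seen yet.
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Run as `bond_slave_status` on the affected host, or pass a saved copy of the status file as the first argument. With the status shown in this report it would list eth0 and eth1 as up and eth2 as down.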
Correction: the link drops do not last a few seconds; all links are down for 8 to 10 minutes.

Extra info: This is our first ProLiant BL460c G7 blade. Instead of the Broadcom chipset of the ProLiant BL460c G6 series (Ethernet controller: Broadcom Corporation NetXtreme II BCM57711E 10-Gigabit PCIe), it has an Emulex NIC chipset (Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)).
This is likely a duplicate of bug 671595. That bug will be fixed in RHEL 5.7 and in RHEL 5.6 errata kernel version 2.6.18-238.4.1.el5. Please test that kernel and reopen if it does not resolve the issue. Thanks!

*** This bug has been marked as a duplicate of bug 671595 ***
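
To check whether a host already runs the errata kernel named above, the running release can be compared with 2.6.18-238.4.1.el5. This is a rough sketch, not an RPM-accurate version comparison: kernel_at_least is a hypothetical helper that drops the ".el5" suffix and compares the remaining dot- and dash-separated fields numerically.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if kernel release $1 is at or above release $2.
# Simplification: everything from ".el" onward is dropped, and the remaining
# fields (split on "." and "-") are compared numerically, left to right.
# Missing fields count as 0. This is not a full RPM version comparison.
kernel_at_least() {
    awk -v a="${1%%.el*}" -v b="${2%%.el*}" 'BEGIN {
        na = split(a, x, /[.-]/)
        nb = split(b, y, /[.-]/)
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}
```

For example, `kernel_at_least "$(uname -r)" 2.6.18-238.4.1.el5 || echo "kernel predates the errata fix"` would print the warning on the 2.6.18-238.1.1.el5 kernel from this report.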