Bug 234598

Summary: problem with bnx2 driver on 2.6.20-1.2307.fc5smp running on a Dell server
Product: [Fedora] Fedora
Reporter: jairo medina <jairo19>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: urgent
Priority: medium
Version: 5
CC: carenas, triage
Target Milestone: ---
Target Release: ---
Hardware: i686
OS: Linux
Whiteboard: bzcl34nup
Fixed In Version: fc5
Doc Type: Bug Fix
Last Closed: 2008-04-04 12:33:02 UTC

Description jairo medina 2007-03-30 14:08:21 UTC
Description of problem:
A system running the newer kernel (2.6.20-1.2307.fc5smp) can no longer bring up
its network interface. The older kernel (2.6.18-1.2239.fc5smp) works fine.

May be related to bug:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=233968#c6

How reproducible:
Always

Steps to Reproduce:
1. boot on 2.6.18-1.2239.fc5smp to prove networking comes up fine on eth0
2. boot on 2.6.20-1.2307.fc5smp to prove networking does not come up

  
Actual results:
eth0 cannot be brought up

Expected results:
networking on eth0

Additional info:

   I rebooted my machine into 2.6.20-1.2307.fc5smp and the network card is not
recognized. ifconfig shows only lo, and "ifconfig eth0" says no device was found.

[root@linux6 ~]# cat /etc/modprobe.conf
alias eth0 bnx2
alias eth1 bnx2
alias scsi_hostadapter megaraid_sas

  I'm back on 2.6.18-1.2239.fc5smp, which runs fine. The driver in question is
bnx2.

   Machine is a Dell PowerEdge 1950

[jairo@linux6 ~]$ dmesg | grep eth
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B1) PCI-X 64-bit 133MHz found at
mem f4000000, IRQ 169, node addr 001372fa5d5a
eth1: Broadcom NetXtreme II BCM5708 1000Base-T (B1) PCI-X 64-bit 133MHz found at
mem f8000000, IRQ 169, node addr 001372fa5d58
bnx2: eth0: using MSI
ADDRCONF(NETDEV_UP): eth0: link is not ready
bnx2: eth0 NIC Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
eth0: no IPv6 routers present
[jairo@linux6 ~]$


from /var/log/messages in 2307:
Mar 29 07:34:15 linux6 kernel: Broadcom NetXtreme II Gigabit Ethernet Driver
bnx2 v1.5.5 (February 1, 2007)
Mar 29 07:34:15 linux6 kernel: eth0: Broadcom NetXtreme II BCM5708 1000Base-T
(B1) PCI-X 64-bit 133MHz found at mem f8000000, IRQ 16, node addr 001372fa5d58
Mar 29 07:34:15 linux6 kernel: eth1: Broadcom NetXtreme II BCM5708 1000Base-T
(B1) PCI-X 64-bit 133MHz found at mem f4000000, IRQ 16, node addr 001372fa5d5a

from /var/log/messages in 2239
Mar 29 10:17:49 linux6 kernel: Broadcom NetXtreme II Gigabit Ethernet Driver
bnx2 v1.4.44 (August 10, 2006)
Mar 29 10:17:49 linux6 kernel: eth0: Broadcom NetXtreme II BCM5708 1000Base-T
(B1) PCI-X 64-bit 133MHz found at mem f4000000, IRQ 169, node addr 001372fa5d5a
Mar 29 10:17:49 linux6 kernel: eth1: Broadcom NetXtreme II BCM5708 1000Base-T
(B1) PCI-X 64-bit 133MHz found at mem f8000000, IRQ 169, node addr 001372fa5d58
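
The two excerpts above differ in which MAC address ends up on which ethN name.
A minimal sketch for pulling that mapping out of a log, so the two boots can be
compared side by side. The function name bnx2_mac_map is hypothetical, and the
sketch assumes each probe message sits on a single line, as it would in the
actual log file (the lines quoted here are wrapped).

```shell
# Print "ethN MAC" pairs from bnx2 probe lines read on stdin.
# bnx2_mac_map is a hypothetical helper name, not part of any tool.
# Assumes one probe message per line; lines that do not match are skipped.
bnx2_mac_map() {
    sed -n 's/.*\(eth[0-9][0-9]*\): Broadcom NetXtreme II.*node addr \([0-9a-fA-F]\{12\}\).*/\1 \2/p'
}
```

Running "bnx2_mac_map < /var/log/messages" after booting each kernel should
show whether the same MAC address moved from one interface name to the other.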

While writing this e-mail I noticed that the cards were detected in a different
order, so I rebooted into 2307 and tried again. "ifconfig eth0 up" gave the same
no-device message, but "ifconfig eth1 up" brought the interface up. I then ran
"ifup eth1"; no cable was in that port, so nothing happened. After I moved the
cable to the other port and ran "ifup eth1" again, the network worked.
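
To see which physical port currently carries which name without swapping
cables, the MAC address of each interface can be read from sysfs. This is only
a sketch: the function name list_if_macs is hypothetical, and the optional
directory argument exists purely so the function can be pointed at a test
directory instead of /sys/class/net.

```shell
# Print "name MAC" for every interface under a sysfs-style directory.
# list_if_macs is a hypothetical helper name; the argument defaults to
# /sys/class/net and is overridable only for testing.
list_if_macs() {
    root=${1:-/sys/class/net}
    for dir in "$root"/*; do
        # Skip entries that do not expose a readable address file.
        [ -r "$dir/address" ] || continue
        printf '%s %s\n' "${dir##*/}" "$(cat "$dir/address")"
    done
}
```

On Fedora-style systems the ifcfg-ethN files can carry a HWADDR= line that ties
an interface name to a MAC address. Whether that would have avoided the swap
seen here is an assumption, not something confirmed in this report.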

Now I'm back on 2239 with my original config. It seems that some change between
2239 and 2307 affects this.

Comment 1 Chuck Ebbert 2007-03-30 14:56:30 UTC
Please try the workaround from bug 233968: boot with the kernel option
"pci=nomsi".

I'm beginning to think we should just make that the default and
make people who want msi turn it on with "pci=msi".
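
After rebooting with the workaround, it can be worth confirming that the option
actually reached the kernel. A minimal sketch, assuming the option is passed as
its own word on the command line; the function name booted_with_nomsi is
hypothetical, and a combined form such as "pci=nomsi,<other>" would not be
detected by this check.

```shell
# Succeed if the kernel command line contains pci=nomsi as a separate word.
# booted_with_nomsi is a hypothetical helper name. The file argument
# defaults to /proc/cmdline and is overridable only for testing.
booted_with_nomsi() {
    cmdline_file=${1:-/proc/cmdline}
    # Put each option on its own line, then require an exact line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'pci=nomsi'
}
```

For example, "booted_with_nomsi && echo MSI disabled" prints the message only
when the option is present. Looking for PCI-MSI entries in /proc/interrupts is
another way to see whether any device is still using MSI.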

Comment 2 jairo medina 2007-03-30 15:34:41 UTC
  Chuck, thanks for your quick answer.

I cannot do that now; the server is in production.

I'm not sure what "msi" is for. My searches on the net to learn what it is
only turned up references to problems on Dell hardware (PCs, notebooks and
servers) and a note that the Ubuntu developers decided to make "pci=msi" the
default (https://launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/74830).

With that said, maybe one of these questions makes sense: what changed in the
bnx2 driver between the two kernels that is affected by MSI, or what changed in
MSI that affects the bnx2 driver?

Thanks.

Comment 3 Bug Zapper 2008-04-04 06:43:39 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 4 jairo medina 2008-04-04 12:33:02 UTC
This was fixed by a later kernel. Thank you.