Bug 184440
Summary: | kernel / bcm43xx lockups
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 5
Hardware: | i386
OS: | Linux
Status: | CLOSED INSUFFICIENT_DATA
Severity: | high
Priority: | medium
Keywords: | Reopened
Whiteboard: | NeedsRetesting
Reporter: | Bernard Johnson <bjohnson>
Assignee: | John W. Linville <linville>
QA Contact: | Brian Brock <bbrock>
CC: | davej, orion, wtogami, zaitcev
Doc Type: | Bug Fix
Last Closed: | 2007-08-07 18:44:10 UTC
Description
Bernard Johnson
2006-03-08 20:15:33 UTC
I see that in kernel-2.6.15-1.2032_FC5, the bcm43xx driver has been disabled from automatically coming up (http://cvs.fedora.redhat.com/viewcvs/rpms/kernel/devel/linux-2.6-bcm43xx-neuter.patch?rev=1.1&view=auto).

I just loaded kernel-2.6.15-1.2032_FC5 and I do see some duplicates:

64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.73 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=3.36 ms (DUP!)
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=1.68 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=1.74 ms
64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=1.78 ms
64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=3.01 ms (DUP!)

but I've not yet (in 5 minutes of testing) seen a lockup. When I booted up 2025 this morning, I managed to get three lockups in about ten minutes' time.

I was able to lock up kernel-2.6.15-1.2032_FC5 today as well. I ran 'ping -f gateway' and let it sit for a while. Unfortunately, the screensaver kicked in so I couldn't see how long it took to lock up.

It looks to me like the bcm43xx driver will not be supported in FC5 since the PCI ID was dropped. Should I just wait until further upstream bcm43xx patches arrive? Or maybe try http://people.redhat.com/linville/kernels/fedora-netdev/ ?

When I had some time today, I installed ndiswrapper and brought up the networking using ndiswrapper and a Windows NDIS driver for my Broadcom chip. It's the same driver that I used fwcutter on to get native bcm43xx support. Under ndiswrapper, I get no lockups or dupe packets whatsoever. It seems that the problem is buried in either the SoftMAC code or the bcm43xx code.

There have been a lot of bcm43xx/softmac updates in the last 4 months. Can you verify that this is still a problem with current FC5 or rawhide kernels?

Although I did have one lockup yesterday during a network scan, I can't guarantee that it's related to this bug. It's also the first lockup I've had in months now. Also, the weird symptoms originally reported (DUP packets) are no longer reproducible with a fully updated rawhide system, so I believe this bug to be dead. Closing.

I was able to reproduce this today, so I'm reopening this bug.

Created attachment 132559
15 second wireshark capture showing the duplicate packets
This 15 second wireshark capture was performed while I was pinging the gateway
at my current location.
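
For anyone who wants to quantify the duplicates rather than eyeball them, here is a minimal Python sketch that counts repeated `icmp_seq` values in saved ping output. It is not part of the original report; the function name `count_duplicates` and the `SAMPLE` text are illustrative, and the sketch assumes the Linux ping reply format quoted in the description. It counts by sequence number instead of relying on the `(DUP!)` marker.

```python
import re
from collections import Counter

# Matches the sequence number in a ping reply line such as
# "64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.73 ms"
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def count_duplicates(ping_output):
    """Return {icmp_seq: extra_reply_count} for every duplicated sequence number."""
    seen = Counter()
    for line in ping_output.splitlines():
        match = SEQ_RE.search(line)
        if match:
            seen[int(match.group(1))] += 1
    # A sequence number seen n times has n - 1 duplicate replies.
    return {seq: n - 1 for seq, n in seen.items() if n > 1}


# Sample taken from the ping output quoted in the description.
SAMPLE = """\
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.73 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=3.36 ms (DUP!)
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=1.68 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=1.74 ms
64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=1.78 ms
64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=3.01 ms (DUP!)
"""

if __name__ == "__main__":
    print(count_duplicates(SAMPLE))
```

Saving the output of a ping run to a file and passing its contents to `count_duplicates` would give a per-network duplicate count that is easy to compare between kernels.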
I have also had lockups trying to use the bcm43xx driver. I am on an x86_64 system though, and I have 1.2GB of RAM. I read elsewhere that there was a bug in the driver when the system had more than 1GB of RAM. Don't know if that is related.

Also, I have three different drivers for my Broadcom card. My Broadcom card is reported by the bcm43xx driver as being a 0x4306 rev 0x3. I have Windows XP x64 driver versions 3.70.17.5 and 3.100.64.0. If I load the firmware for 3.70.17.5 after booting (putting the files from fwcutter into /lib/firmware), the wireless interface comes up and associates with the access point, although wpa_supplicant doesn't work (different problem). If I reboot with that firmware still in /lib/firmware, the OS hangs every time on boot as soon as ifplugd hits that interface. If I upload the 3.100.64.0 version of the firmware, the wireless light comes on, and the OS immediately hangs. I have an even newer driver, but fwcutter doesn't support it, so I cannot try it.

In both cases, at boot time, the OS always hangs as soon as it touches the wireless interface. The only way I can boot my system is to boot off the rescue CD and remove the firmware files from /lib/firmware. I have the latest kernel, 2.6.17-1.2157_FC5, with all patches applied.

I get the same symptoms (duplicate pings) and behaviour (system lockup) on FC6test3. The system froze (did not respond to Ctrl-Alt-F1 or Ctrl-Alt-Backspace) 1-10 minutes after booting as long as I kept the modprobe.bcm43xx file in /etc/modprobe.d. It didn't matter whether I even ifup'd the wireless.
[root]# grep MemTotal /proc/meminfo
MemTotal: 1035368 kB
[root]# uname -srvmpi
Linux 2.6.17-1.2187_FC5 #1 Mon Sep 11 01:17:06 EDT 2006 i686 i686 i386
[root]# lspci | grep Broadcom
02:03.0 Network controller: Broadcom Corporation BCM4303 802.11b Wireless LAN Controller (rev 01)
[root]# lspci -n | grep `lspci | grep Broadcom | cut -d \  -f 1`
02:03.0 0280: 14e4:4301 (rev 01)
[root]# grep "model name" /proc/cpuinfo
model name : Mobile Intel(R) Pentium(R) 4 - M CPU 2.00GHz

My comment #10 was premature. The unscheduled freezes resumed and continued until I changed the video mode. Probably the networking was just a red herring.

2.6.18-based FC5 test kernels are available here: http://people.redhat.com/linville/kernels/fc5/ These include a post-2.6.18 patch intended to eliminate lock-ups, as well as all the bcm43xx fixes from 2.6.18. Please give them a try.

Well, just a single reboot so far, but 2.6.18-1.2195.2.1.fc5.jwltest.17 seems to be working.

System hung after being up for 20 hours with the test kernel. No idea what caused the hang though, so it may not be the same issue. Wireless interface was not configured.

A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks' time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable.
If you have been affected by this problem, please check that you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you.

Still getting occasional DUP packets as of kernel-2.6.18-1.2798.fc6. I have not seen a recent lockup on my system, but they were rare to start with. Removing NEEDINFO.

I've been having the same problem. The system always locks up after the last message below:

bcm43xx: Controller RESET (TX timeout) ...

This is using kernel 2.6.18-1.2831.2.1.fc6.jwltest.12. Removing the bcm43xx module eliminates the lockups. Is this problem related to the BADNESS limit? I've read elsewhere that changing the BADNESS limit to 20 fixes the problem. I tried changing the value but was unable to compile the kernel, for reasons not related to this.

Nov 12 17:44:17 dell kernel: SoftMAC: Start scanning with channel: 1
Nov 12 17:44:17 dell kernel: SoftMAC: Scanning 14 channels
Nov 12 17:44:18 dell kernel: SoftMAC: Scanning finished
Nov 12 17:46:18 dell kernel: SoftMAC: Start scanning with channel: 1
Nov 12 17:46:18 dell kernel: SoftMAC: Scanning 14 channels
Nov 12 17:46:18 dell kernel: NETDEV WATCHDOG: eth1: transmit timed out
Nov 12 17:46:18 dell kernel: bcm43xx: Controller RESET (TX timeout) ...
Nov 12 17:46:18 dell kernel: bcm43xx: select_wireless_core: cleanup
Nov 12 17:46:28 dell kernel: NETDEV WATCHDOG: eth1: transmit timed out
Nov 12 17:46:28 dell kernel: bcm43xx: Controller RESET (TX timeout) ...

Clifford, have you tried the fc6.netdev kernels? http://people.redhat.com/linville/kernels/fedora-netdev/ Do they work any better for you?

I've been a regular user of the netdev kernels. I was using your last kernel but decided to try the jwltest.12 to see if it had any impact on the bcm43xx lock problem.
Clifford

Current FC6.netdev kernels include the new d80211 stack. Please give them a try, and be sure to change /etc/modprobe.conf to refer to bcm43xx-d80211 instead of bcm43xx. You will probably have to use v4 firmware as well. Does this work any better for you?

Closed due to lack of response. Please reopen when the requested information becomes available...thanks!

John- Sorry for the lack of response. I haven't been using this particular machine very much, so it has been difficult to report any findings. I did however travel with it this week and found that, at least in the FC6-current kernels, the DUP packet problem still persists, but I haven't seen any recent lockups (though I haven't used it as heavily as in the past). How should I proceed?

Since the DUP problem remains, I'm reopening for now. Could you try the bcm43xx-mac80211 driver in the test kernels here? http://people.redhat.com/linville/kernels/fc6/ Do they show the DUP behaviour?

I know it seems that I'm being somewhat unhelpful in gathering information, but let me explain a bit. I do not see any bad behavior on my work network. Everything works smoothly. (Although originally, I did have the DUP problem on my network.) Now, I usually see the behavior when I travel, on foreign networks. I know of three networks that cause this problem now. Unfortunately it's impossible to test them on a regular basis because they are > 1000 miles from where I live. Also, I updated my laptop to rawhide a few days ago, although I suspect I'll see it there as well. I'll leave this NEEDINFO for now to remind me to keep an eye out for the DUP packets as I roam on foreign networks.

Setting state back to NEEDINFO pending availability of the information from the previous comment...thanks!

Closed due to lack of response...please reopen if the problem persists on recent Fedora kernels.
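
If the bug is reopened, a count of the watchdog and reset messages quoted earlier in this report may be easier to compare across kernels than a raw log. The Python sketch below is an illustration only and was not posted by any commenter; `summarize_timeouts` and `SAMPLE_LOG` are hypothetical names, and the two patterns assume the exact message wording shown in the Nov 12 log excerpt.

```python
import re

# Patterns assume the message text quoted in the log excerpt above.
RESET_RE = re.compile(r"bcm43xx: Controller RESET \(TX timeout\)")
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")


def summarize_timeouts(log_text):
    """Return (reset_count, {interface: watchdog_timeout_count})."""
    resets = 0
    watchdog = {}
    for line in log_text.splitlines():
        if RESET_RE.search(line):
            resets += 1
        match = WATCHDOG_RE.search(line)
        if match:
            iface = match.group(1)
            watchdog[iface] = watchdog.get(iface, 0) + 1
    return resets, watchdog


# Lines copied from the log excerpt in this report.
SAMPLE_LOG = """\
Nov 12 17:46:18 dell kernel: SoftMAC: Scanning 14 channels
Nov 12 17:46:18 dell kernel: NETDEV WATCHDOG: eth1: transmit timed out
Nov 12 17:46:18 dell kernel: bcm43xx: Controller RESET (TX timeout) ...
Nov 12 17:46:18 dell kernel: bcm43xx: select_wireless_core: cleanup
Nov 12 17:46:28 dell kernel: NETDEV WATCHDOG: eth1: transmit timed out
Nov 12 17:46:28 dell kernel: bcm43xx: Controller RESET (TX timeout) ...
"""

if __name__ == "__main__":
    print(summarize_timeouts(SAMPLE_LOG))
```

Running it over a saved copy of the system log from each kernel under test would show whether the TX timeouts become less frequent, which is the information the last few comments were waiting on.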