Bug 190776
Summary: | 2.6.16-1.2185_FC6 spinlocks CPU #0 | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Joshua Wulf <jwulf> |
Component: | kernel | Assignee: | David Woodhouse <dwmw2> |
Status: | CLOSED RAWHIDE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | rawhide | CC: | davej, lcarlon, linville, sgrubb, wtogami |
Hardware: | powerpc | ||
OS: | Linux | ||
Doc Type: | Bug Fix | ||
Last Closed: | 2006-05-05 16:34:28 UTC | Type: | --- |
Description
Joshua Wulf
2006-05-05 01:13:38 UTC
That's a sucky message. Wot no backtrace? Thank $DEITY for xmon. In the 2.6.16-1.2187 kernel, it's a spinlock at ieee80211softmac_start_scan+0x58/0xd4 from ieee80211softmac_assoc_work+0x380/0x52x from run_workqueue.

Quite possibly caused by my linux-2.6-bcm43xx-assoc-on-startup.patch, which adds a schedule_work(&bcm->softmac->associnfo.work); in bcm43xx_init_board() to make sure we actually associate when the link is brought up. Did this ever get fixed properly in softmac?

Building a test kernel now to verify...

Yeah, removing that patch fixes the reported problem. Leaves us with a machine check in:

bcm43xx_phy_read+0x1c/0x2c
bcm43xx_phy_initg+0xe04/0xe54
bcm43xx_phy_calibrate+0xe8/0x118
bcm43xx_init_board+0x2f8/0x624
dev_open
dev_change_flags
devinet_ioctl
blah...

Does this mean that bcm43xx will load up without manual intervention with this kernel? At the moment it doesn't load on my machine without a modprobe.

We deliberately prevented it from autoloading in FC5 because it was a bit too new and exciting. It's loaded automatically in rawhide though -- and that's what is killing your machine. I've removed the patch from CVS, so after the next build it won't die like that -- it'll die differently, as shown in comment #3.

I can't see anything obvious which has changed recently in softmac or bcm43xx which should cause this. Nothing changed upstream since April 26th. Which was the latest rawhide kernel that worked?

Looks like it never worked since it was merged with Linus' tree. Our FC5 kernel actually had a slightly older snapshot, from just before it got broken. I can 'fix' it by doing this...

--- bcm43xx_phy.c.orig	2006-05-05 16:26:43.000000000 +0100
+++ bcm43xx_phy.c	2006-05-05 16:27:28.000000000 +0100
@@ -1288,10 +1288,14 @@ static void bcm43xx_phy_initg(struct bcm
 		bcm43xx_phy_write(bcm, 0x0805, 0x3230);
 	bcm43xx_phy_init_pctl(bcm);
 	if (bcm->chip_id == 0x4306 && bcm->chip_package != 2) {
+		printk("Would kill you now. chip_package %d\n",
+		       bcm->chip_package);
+#if 0
 		bcm43xx_phy_write(bcm, 0x0429,
 				  bcm43xx_phy_read(bcm, 0x0429) & 0xBFFF);
 		bcm43xx_phy_write(bcm, 0x04C3,
 				  bcm43xx_phy_read(bcm, 0x04C3) & 0x7FFF);
+#endif
 	}
 }

We still have the problem that it doesn't associate on 'ifconfig up' though, since I had to remove the patch which fixes that.

Going back to the original problem... the offending spinlock isn't sm->lock. It's sm->ieee->dev->xmit_lock (which is locked in netif_tx_disable()). And it's fixed if I rediff the original patch so that it doesn't get misapplied.

Should both be fixed in kernel-2_6_16-1_2194_FC6. I've sent the patch for the machine check upstream, and I've also re-sent the (re-diffed) patch to associate on startup.

*** Bug 190592 has been marked as a duplicate of this bug. ***

I have updated to the 2196 kernel. It does not lock up with bad magic like the previous versions, but networking doesn't work either. The 2139 kernel does work. I first noticed the problem in the 2174 build. So, somewhere between 2139 & 2174 the problem was introduced. The error I get is "Error: Microcode "bcm43xx_microcode5.fw" not available or load failed."

(In reply to comment #12)
> The error I get is "Error: Microcode "bcm43xx_microcode5.fw" not available or
> load failed."

It's failing to load the firmware. Does /lib/firmware/bcm43xx_microcode5.fw exist?

No. Locate does not show that file anywhere on my system. Is it supposed to be packaged?

No, it's not packaged. It's firmware which needs to be extracted from the MacOS or Windows driver. Install bcm43xx-fwcutter and follow the instructions therein.

In comment #12 you said that the 2139 kernel does work. Are you telling me that you had the bcm43xx driver working _without_ having the firmware for it installed? I find that unlikely.

Problem turned out to be a device re-ordering problem. Adding HWADDR to eth0 fixed it.
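
For anyone hitting the same device re-ordering symptom, here is a minimal sketch of what pinning eth0 to a specific card with HWADDR might look like. It assumes the usual Fedora interface config location, /etc/sysconfig/network-scripts/ifcfg-eth0. The MAC address, ONBOOT and BOOTPROTO values are placeholders for illustration only; none of them come from this report.

```
# /etc/sysconfig/network-scripts/ifcfg-eth0 (assumed path)
DEVICE=eth0
# Placeholder MAC address: replace with the address of the card
# that should always come up as eth0.
HWADDR=00:11:22:33:44:55
# Placeholder settings, not taken from this bug report.
ONBOOT=yes
BOOTPROTO=dhcp
```

With HWADDR set, the network scripts match the interface by hardware address rather than by probe order, so a newly autoloaded driver such as bcm43xx should not be able to take over the eth0 name.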