Bug 431967

Summary: crashes and wireless network failures using b43 on apple ibook g4 ppc (regression)
Product: [Fedora] Fedora
Component: kernel
Version: 8
Hardware: powerpc
OS: Linux
Status: CLOSED NOTABUG
Severity: high
Priority: low
Reporter: Alex Markley <alex>
Assignee: John W. Linville <linville>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: cebbert, davej, mb, rs
Fixed In Version: 2.6.24.2-10.fc8
Doc Type: Bug Fix
Last Closed: 2008-02-26 18:09:44 UTC
Attachments (no flags set on any):
  syslog of badness
  requested objdump
  dump matching syslog
  objdump from the working (2.6.23.9-85.fc8) kernel
  objdump from the broken (2.6.23.15-137) kernel
  screen capture of crashed powerpc

Description Alex Markley 2008-02-08 04:14:39 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux ppc; en-US; rv:1.8.1.10) Gecko/20071213 Fedora/2.0.0.10-3.fc8 Firefox/2.0.0.10

Description of problem:
When using the wireless network device built into my Apple iBook G4, I experience intermittent but frequent crashes (at peak network activity) and network failures (no TCP/IP packets in or out of the interface) with various wireless network configurations.

My configuration is a fairly simple Fedora 8 installation with NetworkManager enabled and the b43 firmware installed. (See below for some details on the b43 configuration.)

This is a REGRESSION because kernel version "kernel-2.6.23.9-85.fc8" works quite reliably. The problem started with "kernel-2.6.23.14-107.fc8" and still exists in the newly-released "kernel-2.6.23.14-115.fc8".

Version-Release number of selected component (if applicable):
kernel-2.6.23.14-115.fc8

How reproducible:
Always


Steps to Reproduce:
Reproducing the problem seems fairly easy. Certain network configurations seem more prone to errors than others, but visiting the various neighborhood coffee shops always turns up a fatal configuration. (Or three.)

If the problem is going to happen, it will happen immediately and it will manifest itself in one of two ways:

A) NetworkManager performs a successful DHCP query, but no further IP packets go across the interface. Pinging the gateway, performing a DNS request, etc. all fail.

B) NetworkManager performs a successful DHCP query, and the interface comes up successfully. Start Firefox, Thunderbird, or some other network-heavy application, and the system crashes hard within seconds. (No response at the console or over the network; the display has a nasty black band about an inch thick at the top, speckled with random dots of color.) A hard power off (power button held down for 5+ seconds) is the only way to get out of it.
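
For anyone trying to tell case A apart from a plain DHCP failure, the checks described above (ping the gateway, then try a DNS request) can be sketched as a small shell script. This is only an illustration, not something used in this report: it assumes `ip`, `ping`, and `getent` are installed, the function names `default_gw` and `check_link` are made up for the example, and `example.com` is a placeholder hostname.

```shell
#!/bin/sh
# Sketch: after DHCP succeeds, check whether IP traffic actually flows.
# Assumes iproute2 (ip), ping and getent are available.

# Print the default gateway found in `ip route` output read from stdin.
default_gw() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Hypothetical helper: report which of the checks from case A fails.
check_link() {
    gw=$(ip route | default_gw)
    if [ -z "$gw" ]; then
        echo "no default route (DHCP may not have completed)"
        return 1
    fi
    if ping -c 3 -W 2 "$gw" >/dev/null 2>&1; then
        echo "gateway $gw reachable"
    else
        echo "gateway $gw unreachable (consistent with case A)"
        return 1
    fi
    # example.com is a placeholder hostname for the DNS test.
    if getent hosts example.com >/dev/null 2>&1; then
        echo "DNS lookup ok"
    else
        echo "DNS lookup failed"
        return 1
    fi
}
```

Calling `check_link` right after the interface comes up should show whether the failure is at the gateway or only at name resolution.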

Actual Results:


Expected Results:


Additional info:
[root@localhost tmp]# dmesg | grep -i b43 | head
b43-phy0: Broadcom 4318 WLAN found
b43-phy0 debug: Found PHY: Analog 3, Type 2, Revision 7
b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2050, Revision 8
b43-phy0 debug: Loading firmware version 351.126 (2006-07-29 05:54:02)
Registered led device: b43-phy0:tx
Registered led device: b43-phy0:rx
b43-phy0 debug: Chip initialized
b43-phy0 debug: 32-bit DMA initialized
b43-phy0 debug: Wireless interface started
b43-phy0 debug: Adding Interface type 2
[root@localhost tmp]#
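
When comparing reports from different machines, it can help to boil the b43 lines down to the chip model and firmware version. The following is a rough sketch that assumes the message wording shown in the dmesg output above; `b43_summary` is a made-up helper name, and the firmware field is simply left empty if that line is missing.

```shell
#!/bin/sh
# Sketch: pull the chip model and firmware version out of b43 dmesg lines.
# Assumes the message wording shown in the dmesg output above.

b43_summary() {
    awk '
        /Broadcom [0-9]+ WLAN found/ {
            for (i = 1; i < NF; i++) if ($i == "Broadcom") chip = $(i + 1)
        }
        /Loading firmware version/ {
            for (i = 1; i < NF; i++) if ($i == "version") fw = $(i + 1)
        }
        END { printf "chip=%s firmware=%s\n", chip, fw }
    '
}

# Usage: dmesg | b43_summary
```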

Comment 1 Robert Story 2008-02-17 18:13:38 UTC
I have a PowerBook G4, and I get kernel panics as soon as I try to bring an
interface up with 2.6.23.15-137.fc8. I'm getting ready to try a few older
kernels.

# dmesg|grep b43
b43-phy0: Broadcom 4306 WLAN found
b43-phy0 debug: Found PHY: Analog 2, Type 2, Revision 2
b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2050, Revision 2


Comment 2 Robert Story 2008-02-17 18:27:35 UTC
Created attachment 295107 [details]
syslog of badness

I tried 2.6.23.8-63.fc8 (the most recent I have before the breaking point
reported by OP, kernel-2.6.23.14-107.fc8), and I get lots of 'badness' in the
syslog, which looks like the kernel panics I saw (which didn't make it into
syslog).

Comment 3 John W. Linville 2008-02-18 14:15:39 UTC
Would you mind helping me a bit by doing some disassembly?  I don't have a 
powerpc box handy...

You need to start by determining some addresses.  The examples are from an 
i686 box running Fedora 7, so you'll need to change the kernel version to 
2.6.23.15-137.fc8:

/home/linville
[linville-t43.mobile]:> nm /lib/modules/2.6.23.15-80.fc7/kernel/drivers/net/wireless/b43/b43.ko | grep b43_handle_txstatus
0000f353 T b43_handle_txstatus

/home/linville
[linville-t43.mobile]:> nm /lib/modules/2.6.23.15-80.fc7/kernel/drivers/net/wireless/b43/b43.ko | grep b43_handle_hwtxstatus
0000f39a T b43_handle_hwtxstatus

Then do some disassembly -- use the actual address values you determined in 
the last step, and don't forget the "0x" at the beginning of the start and 
stop addresses:

objdump -d --start-address=0x0000f353 --stop-address=0x0000f39a /lib/modules/2.6.23.15-80.fc7/kernel/drivers/net/wireless/b43/b43.ko

Please attach the output to this bug...thanks!
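
The two steps above (look up the addresses with nm, then pass them to objdump) can be combined in a small script. This is a sketch under assumptions rather than a tested recipe: `sym_addr` and `disasm_range` are names invented for the example, and it assumes the stop symbol sits right after the start symbol in the module, which was true for the i686 build quoted above but may not hold on powerpc.

```shell
#!/bin/sh
# Sketch: disassemble one function from a kernel module by looking up its
# start address and the address of the symbol that follows it.
# Assumes binutils (nm, objdump) are installed.

# Print "0x<addr>" for an exact symbol name, reading nm output from stdin.
sym_addr() {
    awk -v sym="$1" '$3 == sym { print "0x" $1; exit }'
}

# Disassemble the range from start symbol up to (not including) stop symbol.
disasm_range() {
    module=$1
    start=$(nm "$module" | sym_addr "$2")
    stop=$(nm "$module" | sym_addr "$3")
    if [ -z "$start" ] || [ -z "$stop" ]; then
        echo "symbol not found in $module" >&2
        return 1
    fi
    objdump -d --start-address="$start" --stop-address="$stop" "$module"
}

# Example invocation (module path layout taken from the commands above):
# disasm_range "/lib/modules/$(uname -r)/kernel/drivers/net/wireless/b43/b43.ko" \
#     b43_handle_txstatus b43_handle_hwtxstatus
```

Using `$(uname -r)` in the path ties the disassembly to the running kernel; substitute an explicit version string to dump a different installed kernel's module.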

Comment 4 Robert Story 2008-02-18 16:14:03 UTC
Created attachment 295177 [details]
requested objdump

Comment 5 John W. Linville 2008-02-18 16:21:39 UTC
Thanks!  But now I have to ask: did the attachment in comment 2 come from 
running the 2.6.23.15-137.fc8 kernel?  The numbers don't quite seem to 
match up with what the objdump shows.

If the stuff in comment 2 came from a different kernel, could I impose upon 
you to repeat the exercise above but for the kernel which matches comment 2?  
Thanks!

Comment 6 Robert Story 2008-02-18 21:23:10 UTC
Created attachment 295209 [details]
dump matching syslog

Newer kernels panic and don't write anything to syslog, so the badness output
is from the older kernel. Here is the objdump from the matching kernel.

Comment 7 Alex Markley 2008-02-19 21:40:00 UTC
Created attachment 295339 [details]
objdump from the working (2.6.23.9-85.fc8) kernel.

More information is attached. Two more comments+attachments immediately follow
this.

Comment 8 Alex Markley 2008-02-19 21:42:01 UTC
Created attachment 295340 [details]
objdump from the broken (2.6.23.15-137) kernel

This attachment (and the attachment before it) should represent the information
you requested.

One more comment+attachment immediately follows.

Comment 9 Alex Markley 2008-02-19 21:45:37 UTC
Created attachment 295341 [details]
screen capture of crashed powerpc

My last attachment for the day.

It's something of a miracle, but I figured out that if I enter text mode
immediately before the crash, I can see some of the debug output from the
powerpc monitor. (Apparently, the black bar I've been seeing is the monitor's
attempt to dump debugging information to the screen while the video hardware is
in video mode.)

I apologize for the low quality of the image. Let me know if it's not readable
enough, and I can borrow somebody else's camera.

Comment 10 Robert Story 2008-02-26 16:27:52 UTC
Did the info from comment 6 help? Anything else you need?

Comment 11 John W. Linville 2008-02-26 16:41:12 UTC
Thanks for the information, I'm sure it will be helpful.  I'm sorry I have 
been too busy to deal with it in the last week.  I'll CC the upstream 
maintainer, in case he can provide a quick word of advice on the issue.

In the meantime, have you tried a later F8 kernel?

   http://koji.fedoraproject.org/koji/buildinfo?buildID=39121

Do you get similar results with it?

Comment 12 Robert Story 2008-02-26 17:52:52 UTC
I tried the kernel from comment 11, and was able to bring up the wireless
interface and get an IP address from DHCP. No log warning messages, badness, or
panics.

Comment 13 John W. Linville 2008-02-26 18:09:44 UTC
Ah, cool!

Comment 14 Alex Markley 2008-02-26 18:53:37 UTC
This new kernel (2.6.24.2-10.fc8) appears to solve the problem for me. I will
continue testing, but unless you hear from me again, the bug is resolved. :)