Bug 640766 - b43legacy wlan0 fails DHCP requests
Summary: b43legacy wlan0 fails DHCP requests
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-10-06 19:02 UTC by Steven Haigh
Modified: 2011-01-17 18:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-17 18:27:01 UTC
Type: ---
Embargoed:


Attachments:
dmesg output from system (48.74 KB, text/plain)
2010-10-06 19:03 UTC, Steven Haigh
extract from /var/log/messages when DHCP fails (7.30 KB, text/plain)
2010-10-06 19:03 UTC, Steven Haigh
output of 'tethereal -i eth1.10 -n port bootpc' on dhcp server (1.37 KB, text/plain)
2010-10-06 19:05 UTC, Steven Haigh
dmesg-phy-transmission-error.txt (42.97 KB, text/plain)
2010-10-08 21:41 UTC, Steven Haigh

Description Steven Haigh 2010-10-06 19:02:33 UTC
Description of problem:
This is a rather strange bug, and I'm not sure if it's an issue with the kernel or NetworkManager. When associating with an access point, I can see DHCP requests leaving the laptop, I can see those requests hit the DHCP server, and I can see the reply leaving the server, but the laptop never acts on the replies.

I have stopped iptables, all chains are -P ACCEPT.

Version-Release number of selected component (if applicable):
kernel-2.6.35.4-28.fc14.i686
NetworkManager-0.8.1-6.git20100831.fc14.i686
NetworkManager-glib-0.8.1-6.git20100831.fc14.i686
NetworkManager-gnome-0.8.1-6.git20100831.fc14.i686
dhclient-4.2.0-6.fc14.i686
b43 firmware cut from: wl_apsta-3.130.20.0.o

How reproducible:
8 out of 10 connections to wifi.

Steps to Reproduce:
1. Set up connection in NetworkManager
2. Try to connect to AP

Comment 1 Steven Haigh 2010-10-06 19:03:09 UTC
Created attachment 451963
dmesg output from system

Comment 2 Steven Haigh 2010-10-06 19:03:43 UTC
Created attachment 451964
extract from /var/log/messages when DHCP fails

Comment 3 Steven Haigh 2010-10-06 19:05:34 UTC
Created attachment 451965
output of 'tethereal -i eth1.10 -n port bootpc' on dhcp server

Comment 4 Dan Williams 2010-10-06 22:01:40 UTC
A few things to check first:

1) can you use the connection with other non-b43 laptops using NetworkManager?
2) if your AP has 802.11n capability enabled, try disabling it on the AP

I'm more inclined to think this is a kernel problem since the device is transmitting correctly, but not receiving.  But we can narrow that down.

Comment 5 Steven Haigh 2010-10-06 22:05:57 UTC
(In reply to comment #4)
> A few things to check first:
> 
> 1) can you use the connection with other non-b43 laptops using NetworkManager?

Yes, I can connect using NetworkManager on a different laptop with no issue.

> 2) if your AP has 802.11n capability enabled, try disabling it on the AP

The access point is a Linksys WRT54GS, so it is 802.11b/g only.

> I'm more inclined to think this is a kernel problem since the device is
> transmitting correctly, but not receiving.  But we can narrow that down.

Those are my thoughts as well; however, I figured I'd start with what I can see at the moment. Interestingly enough, when I do get it to connect to the AP after several retries, performance is woeful.

The stats for the adapter are as follows:
$ iwconfig wlan0
wlan0     IEEE 802.11bg  ESSID:"www.crc.id.au"  
          Mode:Managed  Frequency:2.462 GHz  Access Point: 00:0F:66:C5:2D:6B   
          Bit Rate=54 Mb/s   Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=57/70  Signal level=-53 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

The connection is at 54Mb/s, but I am lucky to get 300kbytes/sec transfer over the link. I'm wondering if it is dropping packets, in which case this issue could be one and the same.

Comment 6 Dan Williams 2010-10-06 22:36:24 UTC
Yeah, definitely a kernel issue then; if the rate is 54Mbps then it should be getting a much better transfer rate.  If the card thought there was interference, it would drop the bitrate to compensate.  So the fact that it still thinks it's at 54 yet performance sucks most likely means an issue in the driver itself.  Moving to kernel.

Comment 7 Adam Williamson 2010-10-08 16:39:35 UTC
We discussed this at the 2010-10-08 blocker review meeting. For now we are undecided on the status, we need more input on the impact of this bug from John. John? Thanks!

Comment 8 John W. Linville 2010-10-08 18:16:01 UTC
Testing here with b43 hardware on kernel-2.6.35.6-39.fc14.x86_64 (on an F-13 userland) shows no apparent connectivity or performance problems.  Whatever the issue is, my testing and the lack of other reports suggest that it isn't a general problem.

Could you take another ethereal/wireshark capture from another wireless station near the problematic laptop while that laptop is trying to get its DHCP address?  It would be good to see if the DHCP replies are actually making it over the air to the b43 device.

Comment 9 Steven Haigh 2010-10-08 18:50:27 UTC
For the complete record, I'm going to document the entire procedure from a fresh F14 install:

# b43-fwcutter -w /lib/firmware/ /media/USBDRIVE/wl_apsta-3.130.20.0.o 
This file is recognised as:
  ID         :  FW10
  filename   :  wl_apsta.o
  version    :  295.14
  MD5        :  e08665c5c5b66beb9c3b2dd54aa80cb3
Extracting b43legacy/ucode2.fw
Extracting b43legacy/ucode4.fw
Extracting b43legacy/ucode5.fw
Extracting b43legacy/ucode11.fw
Extracting b43legacy/pcm4.fw
Extracting b43legacy/pcm5.fw
Extracting b43legacy/a0g0bsinitvals2.fw
Extracting b43legacy/b0g0bsinitvals5.fw
Extracting b43legacy/a0g0initvals5.fw
Extracting b43legacy/a0g1bsinitvals5.fw
Extracting b43legacy/a0g0initvals2.fw
Extracting b43legacy/a0g1initvals5.fw
Extracting b43legacy/b0g0bsinitvals2.fw
Extracting b43legacy/b0g0initvals5.fw
Extracting b43legacy/b0g0initvals2.fw
Extracting b43legacy/a0g0bsinitvals5.fw
#

Connected to wifi... Entered WPA2-PSK key...

Can't connect. Attaching output from other laptop in next post.

Comment 10 Steven Haigh 2010-10-08 18:52:00 UTC
Second laptop on wifi using: tshark -i wlan0 -n port bootpc

  0.000000      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6fe18e17
  0.282557      0.0.0.0 -> 255.255.255.255 DHCP DHCP Request  - Transaction ID 0x6fe18e17
  6.289324      0.0.0.0 -> 255.255.255.255 DHCP DHCP Request  - Transaction ID 0x6fe18e17
 20.306057      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x2110dc4a
 26.313777      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x2110dc4a
 34.321948      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x2110dc4a
 92.362495      0.0.0.0 -> 255.255.255.255 DHCP DHCP Request  - Transaction ID 0x8e9ab13b

Comment 11 Steven Haigh 2010-10-08 18:57:16 UTC
Output from additional laptop without the filter for bootpc port using:

# tshark -i wlan0 -n

 57.809017 00:0f:66:c5:2d:6b -> 00:90:4b:74:1f:e3 EAPOL Key
 57.820126 00:0f:66:c5:2d:6b -> 00:90:4b:74:1f:e3 EAPOL Key
 57.997234      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0xc863f107
 57.998347 00:18:4d:79:65:47 -> ff:ff:ff:ff:ff:ff ARP Who has 10.1.1.174?  Tell 10.1.1.254
 58.223830 00:18:4d:79:65:47 -> 00:90:4b:74:1f:e3 LLC I, N(R)=16, N(S)=0; DSAP NULL LSAP Group, SSAP NULL LSAP Command
 58.996569 00:18:4d:79:65:47 -> ff:ff:ff:ff:ff:ff ARP Who has 10.1.1.174?  Tell 10.1.1.254
 59.996484 00:18:4d:79:65:47 -> ff:ff:ff:ff:ff:ff ARP Who has 10.1.1.174?  Tell 10.1.1.254
 61.001484      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0xc863f107
 61.001673 00:18:4d:79:65:47 -> 00:90:4b:74:1f:e3 LLC I, N(R)=16, N(S)=0; DSAP LLC Sub-Layer Management Individual, SSAP NULL LSAP Command
 67.009071      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0xc863f107
 67.009261 00:18:4d:79:65:47 -> 00:90:4b:74:1f:e3 LLC I, N(R)=16, N(S)=0; DSAP LLC Sub-Layer Management Group, SSAP NULL LSAP Command
 80.023201      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0xc863f107
 80.023395         F320 -> 0000         SNA SNA device <--> Non-SNA Device
 95.038498 00:18:4d:79:65:47 -> 00:90:4b:74:1f:e3 LLC I, N(R)=16, N(S)=0; DSAP SNA Path Control Group, SSAP NULL LSAP Command

Comment 12 Steven Haigh 2010-10-08 19:01:09 UTC
tshark output from b43legacy laptop - surprisingly, this time it connected!

  0.000000 00:0f:66:c5:2d:6b -> 00:90:4b:74:1f:e3 EAPOL Key
  0.002763 00:90:4b:74:1f:e3 -> 00:0f:66:c5:2d:6b EAPOL Key
  0.018891 00:0f:66:c5:2d:6b -> 00:90:4b:74:1f:e3 EAPOL Key
  0.019065 00:90:4b:74:1f:e3 -> 00:0f:66:c5:2d:6b EAPOL Key
  0.318564      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6f9ed978
  3.322060      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6f9ed978
 11.330527      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6f9ed978
 23.336797      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6f9ed978
 40.354337      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6f9ed978
 40.364590   10.1.1.254 -> 10.1.1.174   DHCP DHCP Offer    - Transaction ID 0x6f9ed978
 40.365306      0.0.0.0 -> 255.255.255.255 DHCP DHCP Request  - Transaction ID 0x6f9ed978
 40.456797   10.1.1.254 -> 10.1.1.174   DHCP DHCP ACK      - Transaction ID 0x6f9ed978

Comment 13 John W. Linville 2010-10-08 19:15:45 UTC
So when it connected there was a visible DHCP Offer, but when it didn't connect you never saw one.  This is a key step in the DHCP negotiation. :-)

So, it seems like the signal from the AP isn't as strong as it should be?  Or it is otherwise dropping frames?
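
One way to check a capture for this missing step is to group the DHCP messages by transaction ID and see whether any server reply (Offer or ACK) shows up. The sketch below is illustrative only: it assumes the one-line tshark summary format pasted in comments 10 and 12, and the names `summarize_dhcp` and `has_server_reply` are made up for this example rather than taken from any tool.

```python
import re
from collections import defaultdict

# Matches tshark summary lines such as:
#   40.364590   10.1.1.254 -> 10.1.1.174   DHCP DHCP Offer    - Transaction ID 0x6f9ed978
# (format assumed from the captures pasted in this bug)
DHCP_LINE = re.compile(
    r"DHCP\s+DHCP\s+(\w+)\s+-\s+Transaction ID\s+(0x[0-9a-fA-F]+)"
)

def summarize_dhcp(capture_text):
    """Return {transaction_id: [message types in capture order]}."""
    seen = defaultdict(list)
    for line in capture_text.splitlines():
        match = DHCP_LINE.search(line)
        if match:
            msg_type, xid = match.groups()
            seen[xid].append(msg_type)
    return dict(seen)

def has_server_reply(msg_types):
    """True if the DHCP server answered at least once (Offer or ACK)."""
    return any(t in ("Offer", "ACK") for t in msg_types)

# Sample lines copied from comment 10 (failed attempt).
FAILED_CAPTURE = """\
  0.000000      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6fe18e17
  0.282557      0.0.0.0 -> 255.255.255.255 DHCP DHCP Request  - Transaction ID 0x6fe18e17
 20.306057      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x2110dc4a
"""

# Sample lines copied from comment 12 (successful attempt).
OK_CAPTURE = """\
 40.354337      0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x6f9ed978
 40.364590   10.1.1.254 -> 10.1.1.174   DHCP DHCP Offer    - Transaction ID 0x6f9ed978
 40.365306      0.0.0.0 -> 255.255.255.255 DHCP DHCP Request  - Transaction ID 0x6f9ed978
 40.456797   10.1.1.254 -> 10.1.1.174   DHCP DHCP ACK      - Transaction ID 0x6f9ed978
"""

for label, capture in (("failed", FAILED_CAPTURE), ("ok", OK_CAPTURE)):
    for xid, types in summarize_dhcp(capture).items():
        status = "reply seen" if has_server_reply(types) else "NO server reply"
        print(label, xid, types, status)
```

Run against a capture taken on a second station, a transaction with only Discover/Request entries would suggest the reply never made it onto the air as far as that observer could tell.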

Comment 14 Steven Haigh 2010-10-08 19:20:13 UTC
Yeah - I noticed this. If I do the same tshark dump from the dhcp server, I can see both the request & reply happening. Interestingly, this is the only device (out of 3 laptops and a phone) that has issues connecting to wifi.

The AP is located about 2-3 meters from the laptop - and another laptop sitting right next to it does not have these problems.

Comment 15 Steven Haigh 2010-10-08 19:41:41 UTC
output of 'iwconfig wlan0' from affected b43 laptop:

wlan0     IEEE 802.11bg  ESSID:"www.crc.id.au"
          Mode:Managed  Frequency:2.412 GHz  Access Point: 00:0F:66:C5:2D:6B
          Bit Rate=54 Mb/s   Tx-Power=20 dBm
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=52/70  Signal level=-58 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Comment 16 Steven Haigh 2010-10-08 19:43:10 UTC
output of 'iwconfig wlan0' from 100% working laptop:

wlan0     IEEE 802.11bg  ESSID:"www.crc.id.au"  
          Mode:Managed  Frequency:2.412 GHz  Access Point: 00:0F:66:C5:2D:6B   
          Bit Rate=54 Mb/s   Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=62/70  Signal level=-48 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0
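
The two iwconfig outputs in comments 15 and 16 differ only in the link quality and signal level line, so a small parser can make the comparison explicit. This is a rough sketch that assumes the `Link Quality=x/y  Signal level=z dBm` layout shown above; `parse_link` is a hypothetical helper name, not part of wireless-tools.

```python
import re

# Layout assumed from the iwconfig output pasted in this bug.
LINK_LINE = re.compile(r"Link Quality=(\d+)/(\d+)\s+Signal level=(-?\d+) dBm")

def parse_link(iwconfig_text):
    """Return (quality, quality_max, signal_dbm), or None if the line is absent."""
    match = LINK_LINE.search(iwconfig_text)
    if match is None:
        return None
    quality, quality_max, signal_dbm = (int(g) for g in match.groups())
    return quality, quality_max, signal_dbm

# Lines copied from comment 15 (affected laptop) and comment 16 (working laptop).
AFFECTED = "          Link Quality=52/70  Signal level=-58 dBm"
WORKING = "          Link Quality=62/70  Signal level=-48 dBm  "

affected = parse_link(AFFECTED)
working = parse_link(WORKING)
print("affected:", affected)
print("working: ", working)
print("signal difference (dB):", working[2] - affected[2])
```

With the values from this bug, the affected laptop reports a signal 10 dB weaker than the working one while both report a 54 Mb/s bit rate.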

Comment 17 John W. Linville 2010-10-08 19:50:21 UTC
Have you tried manually configuring an IP address on the troubled box?  If you do so, how is the network connectivity afterwards?

Comment 18 Steven Haigh 2010-10-08 19:51:45 UTC
Example of correct speed on another laptop connected via the same wifi ap:

$ wget http://lamp.vm.crc.id.au/500mbfile
--2010-10-09 06:50:05--  http://lamp.vm.crc.id.au/500mbfile
Resolving lamp.vm.crc.id.au... 203.56.246.93
Connecting to lamp.vm.crc.id.au|203.56.246.93|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 524288000 (500M) [text/plain]
Saving to: `500mbfile'

19% [===============>                    ] 100,358,619 2.22M/s  eta 3m 6s

Comment 19 Steven Haigh 2010-10-08 19:52:09 UTC
Example of slow speed from b43legacy card:

$ wget http://lamp.vm.crc.id.au/500mbfile
--2010-10-09 06:47:18--  http://lamp.vm.crc.id.au/500mbfile
Resolving lamp.vm.crc.id.au... 203.56.246.93
Connecting to lamp.vm.crc.id.au|203.56.246.93|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 524288000 (500M) [text/plain]
Saving to: “500mbfile”

 8% [=======>                           ] 46,836,731   306K/s  eta 22m 17s

Comment 20 Quentin Armitage 2010-10-08 21:10:33 UTC
This may be a complete red herring, but I once had a similar problem to this. As far as I recall, the problem originally occurred with an Atheros card, and I swapped it for a b43 card with exactly the same results.

When I was swapping the card again, the antenna wire fell out of the micro FL connector, having previously been held in just by the shrinkwrap. Replacing the antenna cable resolved the problem.

Comment 21 Steven Haigh 2010-10-08 21:41:11 UTC
Created attachment 452439
dmesg-phy-transmission-error.txt

To try and rule this out, I disconnected the antennas, inspected the connections, reinserted the card, connected antennas again.

It took ~10-15 attempts to get it to connect to wifi.

I did see some errors in /var/log/messages:
Oct  9 08:37:02 dell8600 kernel: b43legacy-phy0 ERROR: PHY transmission error
Oct  9 08:37:02 dell8600 kernel: b43legacy-phy0 ERROR: PHY transmission error
Oct  9 08:38:02 dell8600 kernel: b43legacy-phy0 ERROR: PHY transmission error

I see quite a few of these.
# grep "PHY transmission error" messages | wc -l
41

Also, with Windows XP on this system, I can connect to the AP straight away.
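
Beyond a raw `grep | wc -l` count, it can help to see whether the PHY errors cluster around connection attempts. The following is a minimal sketch that buckets them by minute, assuming the traditional syslog line format shown in this comment; `phy_errors_per_minute` is an illustrative name only.

```python
import re
from collections import Counter

# Captures "Mon DD HH:MM" from a syslog line that reports the b43legacy PHY error.
# Line format assumed from the /var/log/messages excerpt in this comment.
PHY_ERROR = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:.*?"
    r"b43legacy-phy\d+ ERROR: PHY transmission error"
)

def phy_errors_per_minute(log_text):
    """Count PHY transmission errors per minute-resolution timestamp."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PHY_ERROR.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Sample lines copied from this comment.
SAMPLE_LOG = """\
Oct  9 08:37:02 dell8600 kernel: b43legacy-phy0 ERROR: PHY transmission error
Oct  9 08:37:02 dell8600 kernel: b43legacy-phy0 ERROR: PHY transmission error
Oct  9 08:38:02 dell8600 kernel: b43legacy-phy0 ERROR: PHY transmission error
"""

for minute, count in sorted(phy_errors_per_minute(SAMPLE_LOG).items()):
    print(minute, count)
```

To run it on a real log, the contents of /var/log/messages could be read into a string and passed in place of `SAMPLE_LOG`.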

Comment 22 John W. Linville 2010-10-11 14:25:23 UTC
OK, was confused -- you are using b43legacy, not b43...

Comment 23 John W. Linville 2010-10-11 14:27:40 UTC
Are comment 18 and comment 19 in response to comment 17?

Comment 24 Steven Haigh 2010-10-11 16:02:19 UTC
(In reply to comment #23)
> Are comment 18 and comment 19 in response to comment 17?

Hi John,

They are an example of network performance with the b43legacy card vs a second laptop (not using a b43* card). The IP address on both tests was obtained via DHCP in one of the moments where the b43legacy card decided to behave enough for DHCP to work :)

Comment 25 John Poelstra 2010-10-13 16:36:26 UTC
John,

Any more thoughts or feedback on this bug? Do you consider this a blocker bug or is there a suitable work around?  We need to address this bug by Friday, 2010-10-15.

Thanks,
JohnP

Comment 26 John W. Linville 2010-10-13 17:22:11 UTC
I do not consider this a blocker bug.  I'm not even sure what criteria it is under suspicion of meeting for blocker status.

The main problems here seem to be intermittent connectivity, low throughput (2.4Mbps vs 17.6Mbps), and "b43legacy-phy0 ERROR: PHY transmission error" messages (which are probably all related).  FWIW, the "b43legacy-phy0 ERROR: PHY transmission error" message has been around for a long time.

I'm sorry this is causing an inconvenience, but I don't think I have much to offer.  As you may know, Broadcom has been quite uncooperative with the Linux community in the past.  Even their recently posted driver for newer wireless hardware completely ignores not only the b43legacy parts but nearly all of the b43 parts as well.

That leaves the existing reverse engineered driver.  I know that the PHY transmission error problem has been a thorn in their side for a long time and I'm sure they would like to fix it.  Yet so far a solution has proven elusive.

I think making this a blocker for Fedora would be a mistake.
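
For reference, the throughput figures quoted in this comment can be approximated from the wget rates in comments 18 and 19 (2.22M/s and 306K/s). The sketch below is only a back-of-the-envelope conversion; `wget_rate_to_mbps` is a hypothetical helper, and whether wget's K and M mean multiples of 1024 or 1000 bytes is treated here as an assumption exposed through the `binary` parameter.

```python
def wget_rate_to_mbps(value, unit, binary=True):
    """Convert a wget-style rate (e.g. 306, "K") in bytes/s to megabits/s.

    binary=True treats K/M as 1024-based, binary=False as 1000-based.
    Which one applies is an assumption; check your wget version if it matters.
    """
    base = 1024 if binary else 1000
    exponent = {"K": 1, "M": 2}[unit]
    bytes_per_second = value * base ** exponent
    return bytes_per_second * 8 / 1_000_000

# Rates copied from comment 19 (b43legacy laptop) and comment 18 (other laptop).
for label, value, unit in (("b43legacy", 306, "K"), ("other laptop", 2.22, "M")):
    decimal = wget_rate_to_mbps(value, unit, binary=False)
    binary = wget_rate_to_mbps(value, unit, binary=True)
    print(f"{label}: {decimal:.2f} Mbit/s (1000-based), {binary:.2f} Mbit/s (1024-based)")
```

Either interpretation lands close to the 2.4Mbps and 17.6Mbps mentioned above, and both are far below the 54 Mb/s bit rate that iwconfig reports.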

Comment 27 Adam Williamson 2010-10-13 21:51:45 UTC
I concur with John that this is not a blocker.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 28 John Poelstra 2010-10-13 23:05:35 UTC
Based on comments below, removing from blocker list.

Comment 29 John W. Linville 2011-01-13 19:13:11 UTC
Been awhile...does this issue persist with currently updated Fedora kernels?

Comment 30 Steven Haigh 2011-01-14 09:19:11 UTC
I've just reinstalled F14 on this Dell 8600 Laptop and have been playing around connecting / disconnecting from wifi, rebooting etc etc and today I haven't been able to reproduce it.

I'll keep playing for the next few days and post again.

Comment 31 Steven Haigh 2011-01-14 09:40:23 UTC
As a side note, this is with kernel 2.6.35.10-74.fc14.i686.

I still only get ~350Kb/sec - however I don't get any PHY errors in /var/log/messages or dmesg now and DHCP seems to work every one of the 30+ times I've connected to the AP in testing.

$ iwlist  wlan0 rate
wlan0     unknown bit-rate information.
          Current Bit Rate=54 Mb/s

I have noticed that this seems to vary between 18Mb/s and 54Mb/s.

$ iwconfig wlan0
wlan0     IEEE 802.11bg  ESSID:"www.crc.id.au"  
          Mode:Managed  Frequency:2.412 GHz  Access Point: 00:0F:66:C5:2D:6B   
          Bit Rate=54 Mb/s   Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=63/70  Signal level=-47 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

If anyone has any suggestions, I'll be more than happy to experiment with this system! :)

Comment 32 John W. Linville 2011-01-17 18:27:01 UTC
I suspect that the b43legacy performance simply is what it is -- it is unlikely to get much further attention from anyone familiar with the hardware. :-(

I'm going to close this bug on the basis of comment 31 saying that DHCP is working.

