Bug 250721

Summary: iwl3945 driver flakey
Product: [Fedora] Fedora Reporter: Norm Murray <nmurray>
Component: kernel Assignee: John W. Linville <linville>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 7 CC: cebbert, chris.brown, davej, grgustaf
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: 2.6.23.8-34.fc7 Doc Type: Bug Fix
Last Closed: 2007-12-12 16:13:53 UTC Type: ---

Description Norm Murray 2007-08-03 07:06:29 UTC
Description of problem:
Periodically and sporadically (no pattern observed) the wireless locks up.
This is what's in the logs at that point:

Aug  1 21:13:54 top kernel: iwl3945: ipw going down
Aug  1 21:13:54 top kernel: iwl3945: Grabbing access while already held at line 830.

Since this appears to be a re-init of the card, could we do better cleanup
when taking it down?

@@ -6380,7 +6380,8 @@ void iwl_down(struct iwl_priv *priv)
        if (priv->ibss_beacon)
                dev_kfree_skb(priv->ibss_beacon);
        priv->ibss_beacon = NULL;
-
+       /* unconditional release since going down, in case we come back */
+       iwl_release_restricted_access(priv);
        /* clear out any free frames */
        iwl_clear_free_frames(priv);
 }

As yet untested, but a thought to start with?
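
To illustrate the idea behind the patch above, here is a minimal user-space
sketch in C. It is not driver code: toy_priv, toy_grab_access,
toy_release_access, toy_down and toy_restart_cycle are hypothetical stand-ins,
and the sketch assumes the warning comes from a "held" flag that survives the
down/up cycle, which has not been confirmed against the real iwl3945 source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for driver state; NOT the real struct iwl_priv. */
struct toy_priv {
    bool access_held; /* models "restricted access" being grabbed */
    int warnings;     /* number of grabs that found the flag already set */
};

/* Models a grab that warns when access is already marked as held. */
static void toy_grab_access(struct toy_priv *p, int line)
{
    if (p->access_held) {
        p->warnings++;
        fprintf(stderr,
                "toy: Grabbing access while already held at line %d.\n",
                line);
    }
    p->access_held = true;
}

/* Models the release; safe to call even if access is not held. */
static void toy_release_access(struct toy_priv *p)
{
    p->access_held = false;
}

/* Models taking the device down, with or without the proposed release. */
static void toy_down(struct toy_priv *p, bool release_on_down)
{
    if (release_on_down)
        toy_release_access(p);
}

/* Grab, go down, grab again; returns how many warnings were raised. */
static int toy_restart_cycle(bool release_on_down)
{
    struct toy_priv p = { false, 0 };

    toy_grab_access(&p, __LINE__); /* access held when the error hits */
    toy_down(&p, release_on_down); /* card is taken down */
    toy_grab_access(&p, __LINE__); /* first grab after re-init */
    return p.warnings;
}

/* Small self-check of the two scenarios. */
static void toy_self_check(void)
{
    assert(toy_restart_cycle(false) == 1); /* stale flag: one warning */
    assert(toy_restart_cycle(true) == 0);  /* released on down: none */
}
```

In this toy model the stale flag produces exactly one warning on the first
grab after re-init, and releasing on the way down avoids it. Whether an
unconditional release is safe in the real driver still needs testing.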

Comment 1 Norm Murray 2007-08-03 07:12:44 UTC
Linux top 2.6.22.1-33.fc7 #1 SMP Mon Jul 23 16:59:15 EDT 2007 x86_64 x86_64
x86_64 GNU/Linux

Lenovo T60 laptop

Comment 2 John W. Linville 2007-08-03 13:45:15 UTC
Norm, good to hear from you! :-)

I'll take a closer look at your patch -- seems reasonable at first glance.  
Have you tried it yourself?

Also, that driver has gotten a lot of updates recently.  Please try 
2.6.22.1-41.fc7 or later just to keep us close to the same page.

Comment 3 Norm Murray 2007-08-06 03:49:26 UTC
Nope, haven't tried it yet - focused on other things over the weekend. Worth
noting, while I still experienced a few pauses over the weekend, I'm now at the
longest 'stable' period with this connection - disabled NetworkManager and the
card seems much happier. Haven't even seen the: "iwl3945: REPLY_ADD_STA failed"
message since taking out NetworkManager. So NM seems to exercise the
card/driver in strange ways.


Comment 4 John W. Linville 2007-08-21 17:38:05 UTC
Norm, -57 is working pretty well for me:

   http://koji.fedoraproject.org/koji/buildinfo?buildID=13752

Does it work for you?

Comment 5 Christopher Brown 2007-09-23 21:07:29 UTC
Hello Norm,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

There hasn't been much activity on this bug for a while. Could you tell me if
you are still having problems with the latest kernel?

If the problem no longer exists then please close this bug or I'll do so in a
few days if there is no additional information lodged.

Cheers
Chris

Comment 6 Norm Murray 2007-09-24 00:04:39 UTC
On 2.6.22.1-41.fc7 (which I updated to after John's note, I think) I'm still
experiencing odd disconnects - periodically (once every 1-5 days), the wireless
will just drop, and requires a power cycle of the wireless access point as
well as ifdown/ifup of the wlan device.

Will give the 2.6.22.5-76.fc7 kernel a spin here... 

Comment 7 John W. Linville 2007-09-28 18:49:22 UTC
http://koji.fedoraproject.org/koji/buildinfo?buildID=19787

Big iwlwifi update there.  How is it working for you?

Comment 8 Norm Murray 2007-10-01 04:53:21 UTC
Well, essentially the same problems - it dropped me after two days, and
rebooting the wireless access point is required. The access point is otherwise
quite stable and reachable from this location, as witnessed by the T42 laptop
having no problems.

wlan0: RX ReassocResp from 00:0f:66:d5:99:cc (capab=0x401 status=0 aid=3)
wlan0: associated
wlan0: switched to short barker preamble (BSSID=00:0f:66:d5:99:cc)
wlan0: RX deauthentication from 00:0f:66:d5:99:cc (reason=7)
wlan0: deauthenticated
wlan0: authenticate with AP 00:0f:66:d5:99:cc
wlan0: RX authentication from 00:0f:66:d5:99:cc (alg=0 transaction=2 status=0)
wlan0: authenticated
wlan0: associate with AP 00:0f:66:d5:99:cc
wlan0: RX ReassocResp from 00:0f:66:d5:99:cc (capab=0x401 status=0 aid=3)
wlan0: associated
wlan0: switched to short barker preamble (BSSID=00:0f:66:d5:99:cc)
wlan0: RX deauthentication from 00:0f:66:d5:99:cc (reason=7)
wlan0: deauthenticated
wlan0: authenticate with AP 00:0f:66:d5:99:cc
wlan0: RX authentication from 00:0f:66:d5:99:cc (alg=0 transaction=2 status=0)
wlan0: authenticated

This is pretty constant in dmesg, but doesn't hit /var/log/messages, so I can't
tell you the total frequency... there seems to be a new stanza once an hour or so.
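
Since these lines only show up in dmesg, one way to put a number on them is to
count the deauthentication events in a captured copy of that output. The
following is a rough sketch in C under that assumption; count_deauths,
reason_text and deauth_self_check are hypothetical helper names, and the
matching relies only on the line format shown in the log above. The two reason
strings are the IEEE 802.11 meanings of codes 6 and 7.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count "RX deauthentication" lines in a captured log buffer.
 * If last_reason is non-NULL, it receives the reason code of the
 * last matching line that carries a "(reason=N)" field.
 */
static int count_deauths(const char *log, int *last_reason)
{
    static const char needle[] = "RX deauthentication";
    const char *p = log;
    int count = 0;

    while ((p = strstr(p, needle)) != NULL) {
        const char *eol = strchr(p, '\n');
        const char *r = strstr(p, "(reason=");
        int reason;

        /* Only accept a reason field that sits on the same line. */
        if (r != NULL && (eol == NULL || r < eol) &&
            sscanf(r, "(reason=%d)", &reason) == 1 &&
            last_reason != NULL)
            *last_reason = reason;

        count++;
        p += sizeof(needle) - 1; /* continue after this match */
    }
    return count;
}

/* Text for the two reason codes most relevant here. */
static const char *reason_text(int reason)
{
    switch (reason) {
    case 6:
        return "Class 2 frame received from nonauthenticated station";
    case 7:
        return "Class 3 frame received from nonassociated station";
    default:
        return "other (see the IEEE 802.11 reason code table)";
    }
}

/* Small self-check on a two-line sample in the format shown above. */
static void deauth_self_check(void)
{
    const char *sample =
        "wlan0: RX deauthentication from 00:0f:66:d5:99:cc (reason=7)\n"
        "wlan0: deauthenticated\n";
    int reason = -1;

    assert(count_deauths(sample, &reason) == 1);
    assert(reason == 7);
}
```

Reason 7 means the access point received a data-class frame from a station it
did not consider associated, which would fit the AP and the card disagreeing
about association state. That reading is a guess, not a diagnosis.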

Comment 9 Norm Murray 2007-10-01 22:14:22 UTC
wlan0: switched to short barker preamble (BSSID=00:0f:66:d5:99:cc)
wlan0: No ProbeResp from current AP 00:0f:66:d5:99:cc - assume out of range
wlan0: No STA entry for own AP 00:0f:66:d5:99:cc
wlan0: No STA entry for own AP 00:0f:66:d5:99:cc
wlan0: No STA entry for own AP 00:0f:66:d5:99:cc
... (for several hundred lines)
wlan0: No STA entry for own AP 00:0f:66:d5:99:cc
wlan0: No STA entry for own AP 00:0f:66:d5:99:cc
ADDRCONF(NETDEV_UP): wlan0: link is not ready


This is what's seen just after it disconnects.

ifconfig information after it's back up:
wlan0     Link encap:Ethernet  HWaddr 00:1B:77:1C:6B:E0  
          inet addr:192.168.69.139  Bcast:192.168.69.255  Mask:255.255.255.0
          inet6 addr: fe80::21b:77ff:fe1c:6be0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2374802 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2697871 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1037656218 (989.5 MiB)  TX bytes:2054026950 (1.9 GiB)

wmaster0  Link encap:UNSPEC  HWaddr 00-1B-77-1C-6B-E0-00-00-00-00-00-00-00-00-00-00  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)


This is still running without NetworkManager at this point as well.

Comment 10 Norm Murray 2007-10-04 01:46:45 UTC
wlan0: switched to short barker preamble (BSSID=00:0f:66:d5:99:cc)
wlan0: RX deauthentication from 00:0f:66:d5:99:cc (reason=7)
wlan0: deauthenticated
wlan0: authenticate with AP 00:0f:66:d5:99:cc
wlan0: RX authentication from 00:0f:66:d5:99:cc (alg=0 transaction=2 status=0)
wlan0: authenticated
wlan0: associate with AP 00:0f:66:d5:99:cc
wlan0: RX ReassocResp from 00:0f:66:d5:99:cc (capab=0x401 status=0 aid=3)
wlan0: associated
wlan0: switched to short barker preamble (BSSID=00:0f:66:d5:99:cc)
iwl3945: Microcode SW error detected.  Restarting 0x82000008.
iwl3945: Error Reply type 0x00000005 cmd REPLY_TX (0x1C) seq 0x0220 ser 0x0000004B
iwl3945: ipw going down 
iwl3945: Can't stop Rx DMA.

This was in the log this morning, with the network disconnected.

Comment 11 John W. Linville 2007-10-16 12:25:38 UTC
Is that without running either NM or wpa_supplicant?  After a disconnect in 
that situation, does an "iwconfig wlan0 essid <SSID>" restore the connection?

The mac80211-based drivers lack a robust MLME implementation for roaming, 
reconnection, etc...

Comment 12 Christopher Brown 2007-12-12 15:32:47 UTC
It's been a while. Norm, any improvement?

Comment 13 Norm Murray 2007-12-12 16:13:53 UTC
Yeah, with latest f7 kernel 2.6.23.8-34.fc7 and a change in my WAP things are
definitely much more stable than when I opened this originally. There are some
issues where speed seems to fall, but when/if I can quantify that better I'll
open that as a new bug. 

Of course I don't have NetworkManager stressing the system with its
probing/reset behaviour either.