Bug 419871

Summary: Wireless connection drops out with wlan0: No STA entry for own AP
Product: [Fedora] Fedora
Reporter: Bevis King <brwk>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 8
CC: cebbert, davej, mb, wk
Hardware: ia64   
OS: Linux   
Fixed In Version: 2.6.23.13-105.fc8
Doc Type: Bug Fix
Last Closed: 2008-01-22 20:05:56 UTC
Attachments:
Description: do not disassoc on probe failure
Flags: none

Description Bevis King 2007-12-11 15:20:10 UTC
Description of problem:
After a period of normal operation, the wireless LAN connection drops out with
a number of errors in dmesg.  The drop occurs after anywhere from 15 minutes to
a couple of hours of successful operation; a full cold reboot with power off
fixes it.  The machine is about four weeks old.

The machine is a Dell Latitude D430 laptop with a Broadcom BCM4312 wireless
chipset, using the standard kernel b43 driver and the firmware from the
openwrt.org project.
Fedora 8 (x86_64) with kernel 2.6.23.8-63.fc8 #1 SMP

dmesg log (copied by hand):
wlan0: authenticated
wlan0: associate with AP 00:18:39:0a:db:cd
wlan0: authentication frame received from 00:18:39:0a:db:cd, but not in authenticate state - ignored
wlan0: RX ReassocResp from 00:18:39:0a:db:cd (capab=0x1 status=0 aid=14)
wlan0: associated
wlan0: CTS protection enabled (BSSID=00:18:39:0a:db:cd)
wlan0: association frame received from 00:18:39:0a:db:cd, but not in associate state - ignored
wlan0: association frame received from 00:18:39:0a:db:cd, but not in associate state - ignored
wlan0: association frame received from 00:18:39:0a:db:cd, but not in associate state - ignored
wlan0: association frame received from 00:18:39:0a:db:cd, but not in associate state - ignored
wlan0: No ProbeResp from current AP 00:18:39:0a:db:cd - assume out of range
wlan0: No STA entry for own AP 00:18:39:0a:db:cd
wlan0: No STA entry for own AP 00:18:39:0a:db:cd
wlan0: No STA entry for own AP 00:18:39:0a:db:cd
wlan0: No STA entry for own AP 00:18:39:0a:db:cd
[... continues ...]

The access point is still up; the fault occurs with multiple different access
points.
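
To check a saved dmesg capture for the failure signature shown above, a small script along these lines may help. It is only a sketch: the two message formats are taken from the excerpt in this report, while the function name `find_drop` and the idea of counting the follow-up messages are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Patterns based on the two messages in the hand-copied dmesg excerpt.
# search() is used so an optional printk timestamp prefix does not matter.
PROBE_LOST = re.compile(
    r"(\w+): No ProbeResp from current AP ([0-9a-f:]{17})", re.IGNORECASE)
NO_STA = re.compile(
    r"(\w+): No STA entry for own AP ([0-9a-f:]{17})", re.IGNORECASE)


def find_drop(log_text):
    """Return (interface, ap_mac, no_sta_count) for the first drop, or None.

    find_drop is a hypothetical helper name used for illustration only.
    """
    drop = None   # (interface, ap_mac) of the first "No ProbeResp" line
    count = 0     # "No STA entry" lines seen for that same AP afterwards
    for line in log_text.splitlines():
        lost = PROBE_LOST.search(line)
        if lost and drop is None:
            drop = (lost.group(1), lost.group(2))
            continue
        missing = NO_STA.search(line)
        if missing and drop == (missing.group(1), missing.group(2)):
            count += 1
    if drop is None:
        return None
    return (drop[0], drop[1], count)
```

Feeding it the text of `dmesg` (for example saved with `dmesg > dmesg.txt`) returns `None` when the signature is absent.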

Version-Release number of selected component (if applicable):
kernel 2.6.23.8-63.fc8 (x86_64)

How reproducible:
Happened three times now in 24 hours in multiple locations (work & home).

Steps to Reproduce:
1.  boot system
2.  use wireless successfully
3.  sometime later connection fails
  
Actual results:
wireless fails after a period of time.

Expected results:
wireless works throughout session.

Additional info:
iwconfig of failed device:

% iwconfig
lo        no wireless extensions.

eth0      no wireless extensions.

wmaster0  no wireless extensions.

wlan0     IEEE 802.11g  ESSID:"SCSECM01"
          Mode:Managed  Frequency:2.462 GHz  Access Point: 00:18:39:0A:DB:CD
          Tx-Power=27 dBm
          Retry min limit:7   RTS thr:off   Fragment thr=2352 B
          Link Quality:0   Signal level:0   Noise level:0
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

%

Comment 1 John W. Linville 2007-12-11 15:47:48 UTC
Instead of a reboot, does an 'iwlist wlan0 scan' bring it back?  If not, how 
about an 'iwconfig wlan0 essid SCSECM01'?

Does this occur during active use?  Or after at least some period of network 
inactivity?

Comment 2 Bevis King 2007-12-11 16:51:04 UTC
>Does this occur during active use?  Or after at least some period of network 
>inactivity?

So far it's been during relative inactivity, i.e. more than 2 minutes.

I'll get you answers on the top two questions next time it falls over.

Regards, Bevis.


Comment 3 Alexei Podtelezhnikov 2007-12-12 18:15:59 UTC
It is b43. It is another duplicate of bug 412861.


Comment 4 John W. Linville 2007-12-12 18:36:09 UTC
No, it isn't.

Comment 5 Alexei Podtelezhnikov 2007-12-12 19:40:22 UTC
Yes, it is!

bug 412861 is about dropped connections, as this bug is.
bug 413291 is about slow connections

Stop pretending you are dealing with irrelevant rare user-specific issues.
This is all in the same damn untested wireless updates that you delivered.
Oh well, this one is on x86_64.. big deal. Now we know that it is not only on i386.




Comment 6 Michael Buesch 2007-12-12 23:47:05 UTC
I believe that mac80211 drops the connection too early.
If I read the code correctly, it drops the connection immediately if it fails
to receive a single probe response (after two seconds of idle).
I think it should send 4 or 5 probes and drop the connection only if all of
them fail.
One probe response might easily get lost in a bad or busy network (for example,
due to another STA violating the NAV).
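
The retry policy suggested in this comment can be sketched as follows. This is an illustration of the idea only, not mac80211 code and not the attached patch (which disables disassociation entirely); the names `link_lost`, `send_probe` and `max_probes` are hypothetical.

```python
def link_lost(send_probe, max_probes=5):
    """Declare the link lost only if every probe goes unanswered.

    send_probe is a hypothetical callable that sends one probe request
    and returns True when a probe response arrives in time.
    """
    for _ in range(max_probes):
        if send_probe():
            # A single answered probe is enough to keep the association.
            return False
    return True
```

With `max_probes=1` this reduces to the behaviour described above, where one lost probe response is enough to drop the connection.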

Comment 7 Michael Buesch 2007-12-12 23:50:08 UTC
Created attachment 286371 [details]
do not disassoc on probe failure

Comment 8 Michael Buesch 2007-12-12 23:53:17 UTC
Can you test the patch I just attached?
It will completely disable disassociation after failed probes.
Please check if the device keeps working if you apply this and don't move it around.

Note that this patch is just a hack for testing. It will disable automatic STA
timeout.

Comment 9 John W. Linville 2007-12-13 00:35:23 UTC
Re: comment 5

You are merely confusing things with all your attempts to call every bug a 
duplicate of every other.  Please pipe down.

Comment 10 Bevis King 2007-12-14 20:49:24 UTC
Hi guys - not quite following the thread of the discussion here - however here
are the results of the tests you asked for:

iwlist wlan0 scan

This produces a scan list of the available access points including the one it
was talking to.  It *APPEARS* to be continuing to report the lost access point
into dmesg after this point (not certain about that but pretty sure).

I also tried an ifconfig wlan0 down immediately followed by an ifconfig wlan0
up - that did not recover it.

Then I tried the:

iwconfig wlan0 essid bevteccom

(bevteccom is my own domain/ESSID at home)

That brought the link back up and I could ping the local subnet ONLY.  The loss
of connection had dropped the default route out of the kernel routing table and
it had to be added back in manually before normal connection was resumed.
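
The recovery sequence described above could be scripted roughly as below. This is a sketch under assumptions: the report does not say which command was used to restore the default route, so the net-tools `route add default gw` form is assumed here, and the helper name `recovery_commands` and the gateway address in the usage note are hypothetical.

```python
def recovery_commands(iface, essid, gateway):
    """Build the two commands for re-associating and restoring the route.

    recovery_commands is a hypothetical helper; the route command is an
    assumption, since the report only says the route was re-added manually.
    """
    return [
        # Re-associate by setting the ESSID again, as tried in this comment.
        ["iwconfig", iface, "essid", essid],
        # Restore the default route that was lost with the connection.
        ["route", "add", "default", "gw", gateway, "dev", iface],
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` when run as root, for example with `recovery_commands("wlan0", "bevteccom", "192.168.1.1")`, where the gateway address is a placeholder.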

Hope that info helps.

Regards, Bevis.


Comment 11 Bevis King 2007-12-14 21:00:29 UTC
OK, I've looked at the patch - It certainly makes sense to me.  When I get a
moment or three I'll download the kernel sources, apply the patch and let you
know how it responds with the patch in place.  Thanks for that.

Regards, Bevis.


Comment 12 Wojciech Kazubski 2007-12-14 22:06:33 UTC
I have a problem with two wireless cards losing connection when left idle for 
several minutes or so. The cards are based on RTL8185 and ZD1211; both use 
mac80211.
Usually, cycling the interface down and then back up restores the connection.


Comment 13 Alexei Podtelezhnikov 2007-12-17 02:26:10 UTC
I have Broadcom 4312. I have the same connection drops with the same messages
(never mind my posts to the other bugs). For me, it is usually relatively weak
connections that are dropped. Strong connections seem never to be dropped.

The strength of the link is, however, set pretty randomly around my house, which
is strange. I seem to have noticed a pattern: 

- After fresh F8 boot, the connection is usually weaker.
- After reboot from Win XP, the connection is usually stronger.

I hope this helps.

Comment 14 Michael Buesch 2007-12-17 09:52:32 UTC
(In reply to comment #13)
> I have Broadcom 4312. I have the same connection drops with the same messages
> (never mind my posts to the other bugs). For me, it is usually relatively weak
> connections that are dropped. Strong connections seem never to be dropped.
> 
> The strength of the link is, however, set pretty randomly around my house, which
> is strange. I seem to have noticed a pattern: 
> 
> - After fresh F8 boot, the connection is usually weaker.
> - After reboot from Win XP, the connection is usually stronger.
> 
> I hope this helps.

Did you try the latest wireless-2.6#everything git tree? It contains an
important RX SSI fix.

Comment 15 John W. Linville 2007-12-17 15:22:52 UTC
I'll get an F8 kernel built w/ the latest stuff from wireless-2.6 either today 
or tomorrow...  However, I don't think there is anything in there that 
resolves the general mac80211 problem for this bug.

Comment 16 Wojciech Kazubski 2007-12-22 23:11:02 UTC
With kernel 2.6.23.9-85.fc8 the connection still gets lost from time to time 
for the RTL8185-based card. No test for ZD1211 due to a missing module.

Comment 17 Alexei Podtelezhnikov 2007-12-23 03:54:43 UTC
2.6.23.12-99 looks really good so far. No dropped connections yet.

Comment 18 Wojciech Kazubski 2007-12-29 21:03:18 UTC
With kernel 2.6.23.12-99 I still get dropped connections with the 8185 when 
the interface is left idle for several minutes. I get this single message in 
dmesg on disconnect:
No ProbeResp from current AP 00:06:4f:42:6f:e5 - assume out of range



Comment 19 Michael Buesch 2007-12-29 21:10:57 UTC
Can someone try my patch, please?

Comment 20 John W. Linville 2008-01-15 19:21:26 UTC
A patch was merged upstream to at least partially address this issue.  It is 
available in these kernels:

   http://koji.fedoraproject.org/koji/buildinfo?buildID=31090

Can you recreate the issue with those kernels?

Comment 21 Bevis King 2008-01-16 10:15:16 UTC
John - thanks.  I'll take a look at this as soon as I can - probably tomorrow as
I didn't bring the laptop in today - however a recent kernel update has
completely hosed the wireless support and suspend/resume, so I need to get back
the status quo before I can move forward.

Regards, Bevis.


Comment 22 Wojciech Kazubski 2008-01-20 00:28:14 UTC
The RTL8185 wireless card is stable for me with the 2.6.23.14_111 kernel.  
The connection dropped only once in 2 days of testing.

Also ZD1211 appears to work much better with this kernel.



Comment 23 Bevis King 2008-01-22 17:01:14 UTC
OK, so far, so good.  This 2.6.23.13-105.fc8.x86_64 kernel fixes the issues I
was previously seeing with the 2.6.23.9-85.fc8.x86_64 kernel which was not
functional at all with the Broadcom BCM4312 chipset I have in my Dell Latitude
D430 laptop.

With this -105 kernel, it seems to be working normally again.

I'll continue to test over the next few days and report back.

Regards, Bevis.

Comment 24 Michael Buesch 2008-01-22 17:07:04 UTC
(In reply to comment #23)
> OK, so far, so good.  This 2.6.23.13-105.fc8.x86_64 kernel fixes the issues I
> was previously seeing with the 2.6.23.9-85.fc8.x86_64 kernel which was not
> functional at all with the Broadcom BCM4312 chipset I have in my Dell Latitude
> D430 laptop.
> 
> With this -105 kernel, it seems to be working normally again.

Great. Thanks a lot for testing.