Bug 250913

Summary: [PATCH] iwl3945 improperly combines 802.11a and 802.11bg networks and fails to find the a net
Product: [Fedora] Fedora
Reporter: Derek Atkins <warlord>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: medium
Version: 7
CC: cebbert, chris.brown, davej
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 2.6.22.9-91.fc7
Doc Type: Bug Fix
Last Closed: 2007-10-04 00:14:17 UTC
Attachments:
Description | Flags
0001-mac80211-store-freq-info-in-sta_bss_list.patch | none
Scan results | none
0001-mac80211-store-channel-info-in-sta_bss_list.patch | none
J. Linville's patch updated to apply against 2.6.22.5 | none

Description Derek Atkins 2007-08-05 05:31:13 UTC
Description of problem:

I've got a D-link a/b/g access point for my network.  This device uses the same
MAC address for all three interfaces (wired, 802.11a, and 802.11b/g).  I have
different network names (SSIDs) configured for the two wireless networks. 
Unfortunately the driver seems to combine the two networks into one (because
both the a and b/g nets have the same BSSID) and never shows me the 802.11a
network.

Running iwlist eth1 scan will never show the 802.11a network.
Network Manager never sees the 802.11a network.
Even asking NM to connect to the 802.11a network fails when using the iwl3945
driver.  It never associates.

I can't even get it to associate MANUALLY..  If I stop NM, unload the driver,
and reload the driver, I STILL can't get the driver to associate to the 802.11a
network, only the 802.11b/g.

I know the card (and driver!) work fine in an 802.11a network when the network
has a unique BSSID.  I also know that my home configuration works with the
ipw3945 driver, but not the iwl3945 driver that comes with Fedora.

Version-Release number of selected component (if applicable):

kernel-2.6.22.1-29.fc7
iwlwifi-firmware-2.14.4-1
NetworkManager-0.6.5-7.fc7

How reproducible:

This is completely reproducible in my environment.

Steps to Reproduce:
1. Get an a/b/g access point that uses the same BSSID for both a and b/g
2. Configure both networks.  I have WEP configured on mine.
3. Try to connect to the "a" network.  Watch it fail.
  
Actual results:

It will only connect to the b/g network and never shows scan results for the
802.11a network.  I know there are two networks because I can log into the AP,
and if I use the ipw driver instead of the iwl driver it works fine.

Expected results:

The driver shouldn't combine 802.11a and 802.11b/g scan results, even if the
BSSID is the same.  The driver also should be able to associate to an 802.11a
network even when there's an 802.11b/g network with the same BSSID -- even if
the 802.11b/g network has a different network name.

Additional info:

Comment 1 John W. Linville 2007-08-07 18:36:39 UTC
Just to clarify, the problem is that you can't associate to "SSID-a", but you 
can associate to "SSID-g".  The theory you are offering (which seems plausible 
enough) is that because "SSID-a" and "SSID-g" both use the same BSSID they 
must be conflicting with one another in the kernel's internal data structures 
used for storing scan results.  Is this a correct characterization?

I'll have to look at the scanning code to see if I can find such a problem.  I 
just want to make sure I understand your report.

Comment 2 Derek Atkins 2007-08-07 18:41:15 UTC
John,

Yes, that is the correct characterization.  The scan results presented via
"iwlist eth1 scan" don't even show the SSID-a network in the list;
network-manager never sees it, either, and I can't seem to get the driver even
to manually swap over to the SSID-a.

At the IETF two weeks ago, however, I WAS able to connect to ietf-a because they
were different BSSIDs than "ietf-g".

I'm only guessing at the cause, but it seems reasonable.

I'm happy to test kernel RPMs for you.


Comment 3 John W. Linville 2007-08-15 21:15:19 UTC
As suspected, mac80211 collects scan results based solely on BSSID.  So, I 
surmise that your scan results cover both your .11a network and your .11g 
network, with your .11g results overwriting the .11a stuff almost immediately.

I'm not sure what (if anything) IEEE802.11 says about using the same BSSID on 
different channels or modulations.  But it may not matter anyway if there is 
equipment in the field doing it already.

I'll look at a patch and see about getting you a test kernel...
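
The overwrite described in this comment can be illustrated with a small model. This is only a sketch, not the mac80211 code: the dictionaries, the `record_beacon` helper, and the sample BSSID are hypothetical, and Python is used purely for brevity. Keying the table on BSSID alone leaves a single entry, while keying on BSSID plus channel keeps both networks.

```python
# Illustrative model only; this is not the mac80211 implementation.
# All names and the sample BSSID below are hypothetical.

def record_beacon(table, key, ssid, channel):
    """Store (or overwrite) the scan entry found under `key`."""
    table[key] = {"ssid": ssid, "channel": channel}

BSSID = "00:11:22:33:44:55"  # made-up address shared by both radios

# Beacons as (ssid, channel): the 802.11a network first, then the b/g one.
beacons = [("SSID-a", 36), ("SSID-g", 6)]

# Table keyed on BSSID only: the second beacon replaces the first.
by_bssid = {}
for ssid, channel in beacons:
    record_beacon(by_bssid, BSSID, ssid, channel)

# Table keyed on (BSSID, channel): both networks survive.
by_bssid_channel = {}
for ssid, channel in beacons:
    record_beacon(by_bssid_channel, (BSSID, channel), ssid, channel)

print(len(by_bssid), len(by_bssid_channel))
```

In this model the order of arrival decides which network survives in the BSSID-only table, which would match the symptom of only one of the two networks ever appearing.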

Comment 5 Derek Atkins 2007-08-15 21:22:45 UTC
Thanks.   I don't know specifically what the 802.11 docs say about this, but
I'll just add that the win32 driver, the ipw3945 driver, and previous a/b/g
drivers have all worked just fine in this particular environment.   I look
forward to your patch and a test kernel.

FWIW I've since upgraded to 2.6.22.1-41.fc7 but obviously it still has this
issue.   Thank you for looking at it!

Comment 6 John W. Linville 2007-08-16 20:18:03 UTC
Created attachment 161683 [details]
0001-mac80211-store-freq-info-in-sta_bss_list.patch

Comment 7 John W. Linville 2007-08-16 20:21:32 UTC
I'm building RPMs now (may be a while before they're done):

   http://koji.fedoraproject.org/koji/taskinfo?taskID=105621

Just click on the build for your arch, then look for the right RPM in 
the "output" section at the bottom of the page.

Do these kernels show both the a and b/g BSSIDs?

Also, what is the AP you are using?

Comment 8 Derek Atkins 2007-08-16 20:30:30 UTC
It's a D-Link DWL-7100AP
I'll try the new kernel once it finishes.

Comment 9 Derek Atkins 2007-08-17 13:58:28 UTC
Created attachment 161736 [details]
Scan results

I'm running with the new kernel now.  The good news:  the scan results show
both my 802.11a and 802.11b/g networks.  The bad news:  it's showing too many
entries.  For example, there are multiple entries for the same AP on the same
BSSID/Frequency pairs.

So, I would say that this patch isn't quite right, yet.
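
One way to quantify the duplication is to count scan cells per BSSID/frequency pair. The following is a rough sketch that assumes the usual wireless-tools `iwlist` text layout (`Address:` and `Frequency:` lines per cell); the sample text is made up for illustration and is not the attached scan output, and the function names are hypothetical.

```python
import re
from collections import Counter

def count_cells(scan_text):
    """Count scan cells per (BSSID, frequency) pair in iwlist-style text."""
    counts = Counter()
    bssid = None
    for line in scan_text.splitlines():
        addr = re.search(r"Address:\s*([0-9A-Fa-f:]{17})", line)
        if addr:
            bssid = addr.group(1).upper()
            continue
        freq = re.search(r"Frequency:\s*([0-9.]+)\s*GHz", line)
        if freq and bssid is not None:
            counts[(bssid, freq.group(1))] += 1
            bssid = None  # wait for the next cell's Address line
    return counts

def duplicates(counts):
    """Keep only the pairs that were reported more than once."""
    return {key: n for key, n in counts.items() if n > 1}

# Made-up sample in the assumed iwlist layout.
SAMPLE = """\
Cell 01 - Address: 00:11:22:33:44:55
          ESSID:"example-a"
          Frequency:5.18 GHz (Channel 36)
Cell 02 - Address: 00:11:22:33:44:55
          ESSID:"example-g"
          Frequency:2.437 GHz (Channel 6)
Cell 03 - Address: 00:11:22:33:44:55
          ESSID:"example-g"
          Frequency:2.437 GHz (Channel 6)
"""

print(duplicates(count_cells(SAMPLE)))
```

On a real system the text would come from saving the output of "iwlist eth1 scan" to a file and passing its contents to `count_cells`.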

I'm happy to test your next attempt!  Thanks!

Comment 10 John W. Linville 2007-08-17 14:12:06 UTC
Yeah, something definitely looks busted there... :-)  I'll look at it again.

BTW, I think the meaning of ON_DEV is non-obvious (and it may not actually be 
used anymore in our process).  Also, it makes this bug drop off my default 
list!  Let's stick to ASSIGNED or maybe NEEDINFO with the field set to 
Assignee...FWIW, I think it will put it back to ASSIGNED automatically when 
you are the NEEDINFO person and you add a comment.

Comment 11 Derek Atkins 2007-08-17 14:25:21 UTC
Yeah, I just didn't add a comment directly; I commented by adding an attachment.
That didn't automatically convert it from NEEDINFO to ASSIGNED, and I didn't
see "ASSIGNED" in the list when I tried to convert it directly.

I'll be more careful next time around.  :)


Comment 12 Derek Atkins 2007-08-17 17:18:46 UTC
Oh, something else I noticed..  Even with this new kernel in place,
networkmanager still didn't list both devices, but I think that's a NM issue,
not a driver issue.  Once we get the driver working then I'll file a bug with NM.

Comment 13 John W. Linville 2007-08-17 18:24:47 UTC
(Mumbling to myself...)

Can't replicate here, might be related to actually having same BSSIDs on 
multiple freqs.  Maybe that situation screws up the hash table so that it is 
never right afterwards (causing later BSSID lookups to fail)?

Comment 14 Derek Atkins 2007-08-17 18:38:37 UTC
Hmm..  Yeah, it might be making an assumption that there's only one entry per
BSSID and as a result screwing up the hash table..  And if you never actually
HAVE that situation in your environment then you'll never see it.  Hmm..  So how
would you like me to help you debug this?

Comment 15 John W. Linville 2007-08-17 19:12:04 UTC
Hash table looks pretty simple -- still, I observe that your scan results look 
fine until the first not-quite-duplicate bssid.  After that every entry is 
either a duplicate or the first in a series of duplicates.  Still, might be 
coincidental since only one bssid is un-duplicated in that list anyway (two if 
you count "ihtfp-a" as distinct from "ihtfp").

Does the scan result list grow longer over time?  Any idea of the rate at 
which it gets longer?  I'm guessing it relates to beacon intervals, but that 
should make it grow pretty darned fast up to limit of 256 or so...

Comment 16 John W. Linville 2007-08-17 19:17:42 UTC
Hmmm...scratch that -- it probably has one entry for each beacon it sees 
_during_the_scan_ (after list entry duplication starts).  So, it might be 
naturally limited by the channel dwell time during the scan.  You might try 
manipulating the beacon interval on any AP you control, to see if you can make 
it show more duplicates?  It probably doesn't help find the problem, but it 
might be fun... :-)
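
The dwell-time argument can be put into rough numbers. Beacon intervals are expressed in 802.11 time units (1 TU = 1024 microseconds); the 250 ms dwell time below is purely an assumed value for illustration, not something taken from the driver, and `beacons_per_dwell` is a hypothetical helper.

```python
TU_US = 1024  # one 802.11 time unit is 1024 microseconds

def beacons_per_dwell(dwell_ms, beacon_interval_tu):
    """Upper bound on beacons from one AP heard during one channel dwell."""
    interval_ms = beacon_interval_tu * TU_US / 1000.0
    # A dwell can catch one more beacon than the whole intervals it spans.
    return int(dwell_ms // interval_ms) + 1

ASSUMED_DWELL_MS = 250  # hypothetical dwell time, for illustration only

for interval_tu in (100, 50):
    print(interval_tu, beacons_per_dwell(ASSUMED_DWELL_MS, interval_tu))
```

Under these assumptions, halving the beacon interval on an AP would be expected to raise the number of duplicate entries for that AP, which is the experiment suggested above.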

Comment 17 John W. Linville 2007-08-17 19:19:25 UTC
Along those lines, do "ihtfp-a" and "ihtfp" have different beacon intervals?  
If so, that might account for why "ihtfp-a" doesn't get duplicated...

Comment 18 Derek Atkins 2007-08-17 20:16:46 UTC
802.11a (ihtfp-a) is configured with beacon interval 100, DTIM 1, Fragment
Length 2346, RTS Length 2346.  802.11g (ihtfp) is configured exactly the same.

So maybe what I'm seeing is the sum of all beacons, where it's never actually
detecting any duplicates so it never culls the list?  Or when a beacon comes in
it doesn't check if it's in the list before adding it?

Comment 19 John W. Linville 2007-08-17 20:40:49 UTC
Check for an existing entry seems to be failing in some cases.  It looks like 
there is some funny business with the freq values in the bss list entries.  
I'd like to try using channel instead -- still not sure if I really should 
check mode either in addition or instead.  But, let's see if this works as a 
start...
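
A simplified model of why a failing existing-entry check produces one entry per beacon, and why matching on channel could avoid it. This is a sketch under a loudly stated assumption: the exact "funny business" with the freq values is not described here, so the model simply pretends the stored freq never equals the freq used for the lookup. The function names and the sample BSSID are hypothetical, and this is not the patch itself.

```python
def find_entry(entries, bssid, field, value):
    """Return the first stored entry matching bssid and the given field."""
    for entry in entries:
        if entry["bssid"] == bssid and entry[field] == value:
            return entry
    return None

def on_beacon(entries, bssid, channel, freq, match_on):
    """Add an entry unless one already matches on `match_on`."""
    wanted = channel if match_on == "channel" else freq
    if find_entry(entries, bssid, match_on, wanted) is None:
        # Assumed fault, for illustration only: freq is stored as 0
        # instead of the value that later lookups will ask for.
        entries.append({"bssid": bssid, "channel": channel, "freq": 0})

BSSID = "00:11:22:33:44:55"  # made-up address shared by both radios

# Three beacons on channel 6 (2437 MHz), then three on channel 36 (5180 MHz).
beacons = [(6, 2437)] * 3 + [(36, 5180)] * 3

by_freq, by_channel = [], []
for channel, freq in beacons:
    on_beacon(by_freq, BSSID, channel, freq, "freq")
    on_beacon(by_channel, BSSID, channel, freq, "channel")

print(len(by_freq), len(by_channel))
```

Because 802.11a and 802.11b/g channel numbers do not overlap, the channel is enough in this model to keep the two networks apart without also comparing the mode.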

Comment 20 John W. Linville 2007-08-17 20:42:09 UTC
Created attachment 161770 [details]
0001-mac80211-store-channel-info-in-sta_bss_list.patch

http://koji.fedoraproject.org/koji/taskinfo?taskID=107553

Comment 21 John W. Linville 2007-08-17 20:42:59 UTC
Build in comment 20 will likely take a while -- anyway, let me know if it 
works any differently (hopefully better)...thanks!

Comment 22 Derek Atkins 2007-08-19 16:31:42 UTC
I just updated to 2.6.22.2-57.bz250913.1.fc7 and it does indeed fix the problem.
Thanks!   Now I get to file a bug report against NM.

Nice work.

Comment 23 Derek Atkins 2007-08-24 13:21:36 UTC
I take it this patch didn't make it into 2.6.22.4-65.fc7 that was just released?


Comment 24 Derek Atkins 2007-09-17 17:36:25 UTC
Any chance this patch will make it into a new kernel anytime soon?  If not, any
chance you could build me a new kernel based on kernel-2.6.22.5-76.fc7?

Thanks

Comment 25 Christopher Brown 2007-09-24 00:05:54 UTC
Hello Derek,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

(In reply to comment #24)
> Any chance this patch will make it into a new kernel anytime soon?  If not, any
> chance you could build me a new kernel based on kernel-2.6.22.5-76.fc7?

I'm building one now and will run one in Koji if it goes okay. Attaching updated
patch - this should be in 2.6.23 - can you test a kernel from the development
repository to see whether this issue is resolved for you?

Cheers
Chris

Comment 26 Christopher Brown 2007-09-24 00:08:01 UTC
Created attachment 203631 [details]
J. Linville's patch updated to apply against 2.6.22.5

Comment 27 Christopher Brown 2007-09-24 10:20:42 UTC
(In reply to comment #24)
> Any chance this patch will make it into a new kernel anytime soon?  If not, any
> chance you could build me a new kernel based on kernel-2.6.22.5-76.fc7?
> 
> Thanks

http://koji.fedoraproject.org/koji/taskinfo?taskID=171488


Comment 28 Derek Atkins 2007-09-25 14:52:39 UTC
I can verify that kernel-2.6.22.5-76.bz250913.fc7.i686.rpm works for me.

Comment 29 Derek Atkins 2007-09-28 15:23:15 UTC
FYI, I'm also tracking another iwl3945 bug #293701 but I cannot test both
simultaneously.  See my latest comment in that bug for an explanation. 
Hopefully both fixes will make it into an updated kernel.

Thanks.

Comment 30 Chuck Ebbert 2007-09-28 15:48:50 UTC
Patch is in kernel 2.6.22.9-91