Bug 431894

Summary: iwl4965 unreliable connecting to LEAP
Product: Red Hat Enterprise Linux 5 Reporter: Traxtopel <traxtopel>
Component: kernel Assignee: John W. Linville <linville>
Status: CLOSED DUPLICATE QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: low    
Version: 5.2 CC: dcbw, dzickus, jrb, jvillalo, ltroan, syeghiay, walicki
Target Milestone: rc Keywords: OtherQA
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-11-20 19:39:15 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 432382    
Attachments:
Description Flags
Laptop C : IPW2200 - Connecting to LEAP
none
Laptop B : Madwifi connecting to LEAP
none
Laptop A : iwl4965 struggling at connecting to LEAP
none
Example of iwl4965 finally authenticating on LEAP
none
T61 iwl4965 working failed
none
Compat Wireless 2008 06 03 Leap connection iwl4965 fc9
none

Description Traxtopel 2008-02-07 17:58:41 UTC
Description of problem:
Trying to connect to LEAP at IBM using an Intel 4965 wireless card. The card
has trouble connecting.

Version-Release number of selected component (if applicable):
el5.1 & test 2.6.18-77.el5 kernel.

In our lab we have set up a test network. 3 laptops. All within inches of each
other.
Laptop A : t61 & 4965
Laptop B : t60 & madwifi
Laptop C : t43 & ipw2200

Laptop B & Laptop C connect within seconds to the LEAP AP.
I have included log files and configs, attached to this bug report.

In all cases "wpa_supplicant -Dwext -iwlan0 -cleap.conf -ddd" was used to get
this data, with the exception of madwifi where "wpa_supplicant -Dmadwifi -iwlan0
-cleap.conf -ddd" was used.
The reason I supply the additional data is so a comparison can be drawn.

Comment 1 Traxtopel 2008-02-07 17:58:41 UTC
Created attachment 294236 [details]
Laptop C : IPW2200 - Connecting to LEAP

Comment 2 Traxtopel 2008-02-07 17:59:38 UTC
Created attachment 294237 [details]
Laptop B : Madwifi connecting to LEAP

Comment 3 Traxtopel 2008-02-07 18:00:18 UTC
Created attachment 294238 [details]
Laptop A : iwl4965 struggling at connecting to LEAP

Comment 4 Traxtopel 2008-02-07 18:01:09 UTC
I should also mention that on rare occasions the 4965 does connect, but these
are few and far between.

Comment 5 Dan Williams 2008-02-07 20:17:06 UTC
Can you try to play around with the fragment_size network block option and see
if that makes a difference in the results?  Try fragment_size=1300 and work down
from there.

Comment 6 Traxtopel 2008-02-07 20:56:23 UTC
Dan,
I tried 1300 down to 1000 in steps of 10, then 1000 down to 0 in steps of 100.
It made little if any difference.
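
The sweep described in this comment can be sketched as follows. This is a minimal illustration, not the configuration that was actually used: the network block is modeled on the LEAP example in wpa_supplicant's sample configuration, and the ssid, identity, and password values are hypothetical placeholders rather than the contents of the attached leap.conf. The sketch reproduces the value range as described and does not check whether wpa_supplicant accepts the smallest values.

```python
def fragment_sizes():
    """Values tried: 1300 down to 1010 in steps of 10, then 1000 down to 0 in steps of 100."""
    fine = list(range(1300, 1000, -10))    # 1300, 1290, ..., 1010
    coarse = list(range(1000, -1, -100))   # 1000, 900, ..., 0
    return fine + coarse


def leap_network_block(fragment_size, ssid="leap-example",
                       identity="user", password="secret"):
    """Render a wpa_supplicant network block for LEAP (placeholder credentials)."""
    return "\n".join([
        "network={",
        f'    ssid="{ssid}"',
        "    key_mgmt=IEEE8021X",
        "    eap=LEAP",
        f'    identity="{identity}"',
        f'    password="{password}"',
        f"    fragment_size={fragment_size}",
        "}",
    ]) + "\n"


if __name__ == "__main__":
    sizes = fragment_sizes()
    print(len(sizes), "values from", sizes[0], "down to", sizes[-1])
    print(leap_network_block(sizes[0]))
```

Each rendered block could be written to a config file and passed to wpa_supplicant with -c, one run per value, to compare results.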

Comment 7 Traxtopel 2008-02-08 14:22:57 UTC
Created attachment 294356 [details]
Example of iwl4965 finally authenticating on LEAP

Perhaps this is also useful: in this example, after about 10 attempts, it
finally authenticates with LEAP. This can take anywhere between 30 seconds and
5 minutes.

Comment 8 John W. Linville 2008-02-25 20:37:51 UTC
http://bughost.org/bugzilla/show_bug.cgi?id=1581

Comment 9 John Walicki 2008-03-20 19:29:24 UTC
The above bughost bug report suggests that the new iwl4965 driver version
snapshot02202008 (kernel-2.6.25rc2 with iwlwifi-1.2.26 built in) might improve
EAP-TLS (which we also use extensively).  Can Red Hat consider backporting these
recent patches to RHEL 5.2?

Comment 10 John W. Linville 2008-03-24 15:12:22 UTC
There is no specific patch identified as fixing that issue.  Unless such a 
patch can be pinpointed, I don't think anything can be done until 5.3.  Even 
if there was such a patch, it would probably have to be a z-stream fix.

Comment 11 John W. Linville 2008-05-20 19:07:49 UTC
Can you recreate this issue with the rhel5 test kernels here?

   http://people.redhat.com/linville/kernels/rhel5

Comment 12 Traxtopel 2008-05-27 13:41:32 UTC
Created attachment 306772 [details]
T61 iwl4965 working failed

Using 2.6.18-93.el5.jwltest.54 iwl4965 kernel.

Comment 13 Traxtopel 2008-06-03 21:05:03 UTC
Created attachment 308289 [details]
Compat Wireless 2008 06 03 Leap connection iwl4965 fc9

Comment 14 John W. Linville 2008-06-04 18:49:03 UTC
Does comment 13 represent success?  Does the current fc9 kernel work as well? 
Or does it require the compat-wireless bits?

Comment 15 John W. Linville 2008-06-04 19:09:13 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=438813#c2

Comment 16 John W. Linville 2008-06-04 19:13:01 UTC
I'm not going to commit taking drivers from an unreleased upstream kernel into
rhel5 at this time.  I would much prefer to figure out what patch actually fixed
the issue in question.

Comment 17 Traxtopel 2008-06-16 14:14:26 UTC
Issue is fixed now on fc9.

See update to https://bugzilla.redhat.com/show_bug.cgi?id=438813

Does this mean it will be fixed for 5.2 at some point?

Comment 18 Traxtopel 2008-07-25 15:39:40 UTC
Interestingly enough, in the latest F9 kernel
(kernel-2.6.25.10-86.fc9.i686.rpm) it's broken again. LEAP will almost connect;
I wonder if this helps any.


Comment 19 Traxtopel 2008-07-28 13:27:29 UTC
On F9 kernel-2.6.25.11-97.fc9.i686.rpm LEAP works again.

Comment 20 Traxtopel 2008-08-13 18:45:28 UTC
John,
are there plans to add the 2.6.26 driver updates to el5.3?
Thanks.

Comment 21 John W. Linville 2008-08-14 20:07:48 UTC
Test kernels w/ drivers from 2.6.26 available here:

   http://people.redhat.com/linville/kernels/rhel5/

Please give them a try and post the results here...thanks!

Comment 22 Traxtopel 2008-09-16 07:28:22 UTC
John, sorry for the delay. The new drivers seem to allow us to connect to LEAP; however, when a dhclient request is made, the drivers lock up and cause a kernel panic. If I try connecting here at home using WPA-PSK, it works fine.

We are using the latest firmware. I can try to get you more information if you can give me an idea of what is useful.

Comment 23 John W. Linville 2008-09-16 13:50:35 UTC
Hmmm...well I need to see whatever messages the kernel outputs from the crash -- netconsole might be helpful for you to capture that.
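
One possible way to set up netconsole for this is sketched below. The helper only builds a modprobe command line using the parameter format documented for the netconsole module (netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]). The IP addresses, MAC address, and interface name are hypothetical placeholders, not values from this report.

```python
def netconsole_command(src_ip, dev, tgt_ip, tgt_mac="",
                       src_port=6665, tgt_port=6666):
    """Build a modprobe command that loads netconsole with the given endpoints.

    6665 and 6666 are the module's documented default source and target ports.
    """
    param = f"{src_port}@{src_ip}/{dev},{tgt_port}@{tgt_ip}/{tgt_mac}"
    return f"modprobe netconsole netconsole={param}"


if __name__ == "__main__":
    # Hypothetical addresses: crashing laptop 192.168.1.10, log receiver 192.168.1.20.
    print(netconsole_command("192.168.1.10", "eth0",
                             "192.168.1.20", "00:11:22:33:44:55"))
```

The receiving machine needs something listening on the target UDP port to collect the messages. Sending over a wired interface is probably the safer choice while the wireless driver is the component that crashes.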

Comment 26 John W. Linville 2008-11-06 15:18:24 UTC
Ping?  What is the latest kernel you have tried?

Comment 29 Traxtopel 2008-11-07 06:50:00 UTC
John,
we are still seeing kernel panics on some machines which have the iwl4965. I myself no longer see the issue with the later kernels (118-121).
On a positive note, we have been able to connect to LEAP using the iwl4965.
I will try and get my colleague to capture output using netconsole.

Comment 30 Traxtopel 2008-11-18 06:21:02 UTC
John,
a colleague of mine managed to get the following on his T61p while trying to connect using the 2.6.18-122.el5 x86_64 kernel from el 5.3 snap 2.

Nov 17 21:05:07 strongbad.local  hsfserial(U)
Nov 17 21:05:32 strongbad.local Unable to handle kernel paging request
Nov 17 21:05:32 strongbad.local  at fffffffffffffea4 RIP:  
Nov 17 21:05:32 strongbad.local  [<ffffffff882640c5>] :mac80211:ieee80211_get_tkip_key+0x5a/0xe9 
Nov 17 21:05:32 strongbad.local PGD 203067 
Nov 17 21:05:32 strongbad.local PUD 2371067 
Nov 17 21:05:32 strongbad.local PMD 0 
Nov 17 21:05:32 strongbad.local  
Nov 17 21:05:32 strongbad.local Oops: 0000 [1] 
Nov 17 21:05:32 strongbad.local SMP 
Nov 17 21:05:32 strongbad.local  
Nov 17 21:05:32 strongbad.local last sysfs file: /class/net/eth0/broadcast 
Nov 17 21:05:32 strongbad.local CPU 0 
Nov 17 21:05:32 strongbad.local  
Nov 17 21:05:32 strongbad.local Modules linked in:
Nov 17 21:05:32 strongbad.local  netconsole
Nov 17 21:05:32 strongbad.local  lt_hotswap(U)
Nov 17 21:05:32 strongbad.local  tun
Nov 17 21:05:32 strongbad.local  bridge
Nov 17 21:05:32 strongbad.local  bnep
Nov 17 21:05:32 strongbad.local  rfcomm
Nov 17 21:05:32 strongbad.local  l2cap
Nov 17 21:05:33 strongbad.local  autofs4
Nov 17 21:05:33 strongbad.local  vmnet(U)
Nov 17 21:05:33 strongbad.local  vsock(FU)
Nov 17 21:05:33 strongbad.local  vmci(U)
Nov 17 21:05:33 strongbad.local  vmmon(U)
Nov 17 21:05:33 strongbad.local  ip_conntrack_irc
Nov 17 21:05:33 strongbad.local  ip_conntrack_ftp
Nov 17 21:05:33 strongbad.local  iptable_nat
Nov 17 21:05:33 strongbad.local  ip_nat
Nov 17 21:05:33 strongbad.local  ipt_LOG
Nov 17 21:05:33 strongbad.local  xt_limit
Nov 17 21:05:33 strongbad.local  ipt_REJECT
Nov 17 21:05:33 strongbad.local  xt_tcpudp
Nov 17 21:05:33 strongbad.local  xt_state
Nov 17 21:05:33 strongbad.local  ip_conntrack
Nov 17 21:05:33 strongbad.local  nfnetlink
Nov 17 21:05:33 strongbad.local  iptable_filter
Nov 17 21:05:33 strongbad.local  ip_tables
Nov 17 21:05:33 strongbad.local  x_tables
Nov 17 21:05:33 strongbad.local  cpufreq_ondemand
Nov 17 21:05:33 strongbad.local  acpi_cpufreq
Nov 17 21:05:33 strongbad.local  freq_table
Nov 17 21:05:33 strongbad.local  dm_mirror
Nov 17 21:05:33 strongbad.local  dm_log
Nov 17 21:05:33 strongbad.local  dm_multipath
Nov 17 21:05:33 strongbad.local  scsi_dh
Nov 17 21:05:33 strongbad.local  dm_mod
Nov 17 21:05:33 strongbad.local  video
Nov 17 21:05:33 strongbad.local  sbs
Nov 17 21:05:33 strongbad.local  i2c_ec
Nov 17 21:05:33 strongbad.local  button
Nov 17 21:05:33 strongbad.local  battery
Nov 17 21:05:33 strongbad.local  asus_acpi
Nov 17 21:05:33 strongbad.local  acpi_memhotplug
Nov 17 21:05:33 strongbad.local  ac
Nov 17 21:05:33 strongbad.local  parport_pc
Nov 17 21:05:33 strongbad.local  lp
Nov 17 21:05:33 strongbad.local  parport
Nov 17 21:05:34 strongbad.local  thinkpad_acpi
Nov 17 21:05:34 strongbad.local  hwmon
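
Because netconsole delivers the oops one fragment per line, a short script can make the capture easier to read. The following is a rough sketch that assumes the syslog-style prefix seen above (month, day, time, host). It pulls out the faulting module and symbol from the RIP line and collects the module list; the function name and regular expressions are illustrative, not part of any existing tool.

```python
import re

# "Nov 17 21:05:32 strongbad.local " style prefix added by the log receiver.
PREFIX = re.compile(r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s*")
# "[<addr>] :module:symbol+0xoff/0xsize" as printed on the RIP line.
RIP = re.compile(r"\[<([0-9a-f]+)>\]\s+:(\w+):(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")
# Module name with optional taint flags such as "(U)" or "(FU)".
MODULE = re.compile(r"^([\w-]+)(?:\([A-Z]+\))?$")
HEADER = "Modules linked in:"


def summarize_oops(lines):
    """Return (fault, modules) extracted from netconsole oops lines."""
    fault = None
    modules = []
    in_modules = False
    for raw in lines:
        text = PREFIX.sub("", raw).strip()
        match = RIP.search(text)
        if match and fault is None:
            fault = {
                "module": match.group(2),
                "symbol": match.group(3),
                "offset": int(match.group(4), 16),
            }
        if text.startswith(HEADER):
            in_modules = True
            text = text[len(HEADER):].strip()
        if in_modules:
            for token in text.split():
                found = MODULE.match(token)
                if not found:
                    in_modules = False  # stop at the first non-module token
                    break
                modules.append(found.group(1))
    return fault, modules
```

Run against the capture above, this would report mac80211 and ieee80211_get_tkip_key as the faulting module and symbol, which is consistent with a TKIP code path being involved.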

Comment 31 John W. Linville 2008-11-18 15:57:36 UTC
How often does this occur?  Is there a way to reliably recreate it?

Comment 32 Traxtopel 2008-11-18 16:10:50 UTC
From what I understand, it seems to be a TKIP issue that does not appear when connecting using AES, i.e. WPA crashes, WPA2 does not.

Comment 33 John W. Linville 2008-11-18 16:54:03 UTC
It's just that it looks a bit like a TKIP bug that is supposed to already be fixed in -122.el5.  Are you sure that is the kernel being run?

Comment 34 Traxtopel 2008-11-18 17:53:53 UTC
Checking, John. I understood from colleagues that they were on -122.el5; however, it looks like they were still on -121.el5. I have asked them to update and retest.
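
A quick check like the one below could help confirm which kernel a test machine is actually running before retesting. It is only a sketch; the function name is hypothetical, and the expected build string is taken from this thread.

```python
import platform


def is_expected_kernel(expected, running=None):
    """Return True if the running kernel release starts with the expected build."""
    if running is None:
        running = platform.release()  # same string that `uname -r` prints
    return running.startswith(expected)


if __name__ == "__main__":
    expected = "2.6.18-122.el5"
    print("running:", platform.release())
    print("matches", expected + ":", is_expected_kernel(expected))
```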

Comment 35 Traxtopel 2008-11-20 15:41:13 UTC
John,
looks like the crashes were indeed resolved with the -122.el5 kernel.

Comment 36 John W. Linville 2008-11-20 16:36:13 UTC
Excellent!