Bug 438813 - iwl4965 unreliable connecting to LEAP
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 8
Hardware: i386 Linux
Priority: low
Severity: high
Assigned To: John W. Linville
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-03-25 09:03 EDT by Traxtopel
Modified: 2008-06-23 15:24 EDT
Fixed In Version: 2.6.25.6-55.fc9
Doc Type: Bug Fix
Last Closed: 2008-06-23 15:24:05 EDT

Attachments:
wireless-2_6_25_4-26_fc9.shortlog (16.03 KB, text/plain), 2008-06-05 14:03 EDT, John W. Linville

Description Traxtopel 2008-03-25 09:03:46 EDT
Following John Linville's advice, I am opening a bug against the Fedora 8 tree.
I tested the same setup as below on FC8 using John Linville's latest test kernel,
2.6.24.3-38.fc8.
I am seeing the same issues with the 4965 on FC8 with firmware 4.44.1.18/20.
Using a 3945 card and the iwl3945 driver, it connects with few issues.

Could this be an iwl4965 firmware issue?

Please let me know what additional information/testing is required.

+++ This bug was initially created as a clone of Bug #431894 +++

Description of problem:
Trying to connect to a LEAP network at IBM using an Intel 4965 wireless card.
It has issues connecting.

Version-Release number of selected component (if applicable):
el5.1 & test 2.6.18-77.el5 kernel.

In our lab we have set up a test network. 3 laptops. All within inches of each
other.
Laptop A : t61 & 4965
Laptop B : t60 & madwifi
Laptop C : t43 & ipw2200

Laptop B & Laptop C connect within seconds to the LEAP AP.
I have included log files and configs, attached to this bug report.

In all cases "wpa_supplicant -Dwext -iwlan0 -cleap.conf -ddd" was used to get
this data, with the exception of madwifi where "wpa_supplicant -Dmadwifi -iwlan0
-cleap.conf -ddd" was used.
The reason I supply the additional data is so a comparison can be drawn.

-- Additional comment from traxtopel@fastmail.fm on 2008-02-07 12:58 EST --
Created an attachment (id=294236)
Laptop C : IPW2200 - Connecting to LEAP


-- Additional comment from traxtopel@fastmail.fm on 2008-02-07 12:59 EST --
Created an attachment (id=294237)
Laptop B : Madwifi connecting to LEAP


-- Additional comment from traxtopel@fastmail.fm on 2008-02-07 13:00 EST --
Created an attachment (id=294238)
Laptop A : iwl4965 struggling at connecting to LEAP


-- Additional comment from traxtopel@fastmail.fm on 2008-02-07 13:01 EST --
I should also mention that on rare occasions the 4965 does connect, but these
are few and far between.

-- Additional comment from dcbw@redhat.com on 2008-02-07 15:17 EST --
Can you try to play around with the fragment_size network block option and see
if that makes a difference in the results?  Try fragment_size=1300 and work down
from there.

-- Additional comment from traxtopel@fastmail.fm on 2008-02-07 15:56 EST --
Dan,
I tried 1300 down to 1000 in steps of 10, then 1000 down to 0 in steps of 100.
It makes little if any difference.
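
The sweep described in this comment can be written out explicitly. This is only a sketch under the step sizes stated above; the function name fragment_sizes is hypothetical, and writing each value into the network block of leap.conf and rerunning wpa_supplicant is left out. Whether wpa_supplicant accepts the smallest values is not checked here.

```python
def fragment_sizes():
    """Return the fragment_size values in the order described in the comment.

    Assumption: 1300 down to 1000 in steps of 10, then 1000 down to 0
    in steps of 100, with 1000 listed only once.
    """
    fine = list(range(1300, 1000, -10))    # 1300, 1290, ..., 1010
    coarse = list(range(1000, -1, -100))   # 1000, 900, ..., 0
    return fine + coarse


if __name__ == "__main__":
    for size in fragment_sizes():
        # Each value would be placed in the network block as fragment_size=<size>.
        print(f"fragment_size={size}")
```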

-- Additional comment from traxtopel@fastmail.fm on 2008-02-08 09:22 EST --
Created an attachment (id=294356)
Example of iwl4965 finally authenticating on LEAP

Perhaps this is also useful: in this example it finally authenticates with LEAP
after about 10 attempts. This can take anywhere between 30 seconds and
5 minutes.

-- Additional comment from linville@redhat.com on 2008-02-25 15:37 EST --
http://bughost.org/bugzilla/show_bug.cgi?id=1581

-- Additional comment from walicki@us.ibm.com on 2008-03-20 15:29 EST --
The above bughost bug report suggests that the new iwl4965 driver version
snapshot02202008 (kernel-2.6.25rc2 with iwlwifi-1.2.26 built in) might improve
EAP-TLS (which we also use extensively).  Can Red Hat consider backporting these
recent patches to RHEL 5.2?

-- Additional comment from linville@redhat.com on 2008-03-24 11:12 EST --
There is no specific patch identified as fixing that issue.  Unless such a 
patch can be pinpointed, I don't think anything can be done until 5.3.  Even 
if there was such a patch, it would probably have to be a z-stream fix.
Comment 1 John W. Linville 2008-05-20 15:06:55 EDT
Can you recreate this issue with the F9 kernels here?

   http://koji.fedoraproject.org/koji/buildinfo?buildID=49743
Comment 2 Traxtopel 2008-06-04 14:57:30 EDT
Latest compat-wireless-20080603 on fc9 works.
See log in https://bugzilla.redhat.com/show_bug.cgi?id=431894
Can these updated drivers be added to your el5 jwl kernel, please?
Comment 3 John W. Linville 2008-06-04 15:10:50 EDT
Did you try the kernels from comment 1?  It normally doesn't make a lot of sense
to run the compat-wireless stuff on Fedora, because the latest Koji kernels will
generally have the same code as in compat-wireless...

Also, it would probably be best to keep discussion of el5 to bug 431894 as this
is a fedora bug.
Comment 4 Traxtopel 2008-06-04 15:24:39 EDT
John, I will download the fc9 Koji kernels and test. I'll see if I can identify
between which kernels the code changed; does that sound OK?
Comment 5 John W. Linville 2008-06-04 15:36:04 EDT
Yes, that sounds very helpful...thanks!
Comment 6 Traxtopel 2008-06-05 08:56:04 EDT
John,
tested 2.6.25-10.fc9: it associates but hangs there and almost never authenticates.
tested 2.6.25.3-18.fc9: it associates but hangs there and almost never
authenticates.
tested 2.6.25.4-29.fc9: this works. It associates and I can authenticate.
(I noticed there were changes made to the wireless code in 2.6.25.4-26.fc9;
however, that kernel is not available, hence 2.6.25.4-29.fc9.)
Also, the latest kernel, 2.6.25.4-40.fc9, works every time.

I do not see any other kernels out there to try, does this narrow it down enough
for you? If not point me to a specific fc9 kernel version and I will test.
Thanks.
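
The test results in this comment can be reduced to the pair of builds where behaviour changes. This is a rough sketch only: the version strings are written in the form used in comment 7, the results are assumed to be listed in ascending build order, and the helper name last_bad_first_good is hypothetical.

```python
# (kernel build, authenticated reliably?) in ascending build order,
# taken from the results reported in this comment.
RESULTS = [
    ("2.6.25-10.fc9", False),
    ("2.6.25.3-18.fc9", False),
    ("2.6.25.4-29.fc9", True),
    ("2.6.25.4-40.fc9", True),
]


def last_bad_first_good(results):
    """Return (last failing build, first working build), or None if no such pair."""
    for (prev_name, prev_ok), (name, ok) in zip(results, results[1:]):
        if not prev_ok and ok:
            return prev_name, name
    return None


if __name__ == "__main__":
    print(last_bad_first_good(RESULTS))
```

The change that matters would then lie somewhere between those two builds, which is the range a shortlog would need to cover.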
Comment 7 John W. Linville 2008-06-05 14:03:32 EDT
Created attachment 308466 [details]
wireless-2_6_25_4-26_fc9.shortlog

Shortlog of wireless changes between 2.6.25.3-18.fc9 and
2.6.25.4-26.fc9...unfortunately, not short enough to be immediately helpful...
Comment 8 Traxtopel 2008-06-10 10:56:45 EDT
John,
took a different approach, went back to compat-wireless tarballs. 
Something changed in April ... 

2008-04-07 - does not connect
2008-04-26 - connects

There are no tarballs available in between these dates. In those 19 days there
are more than 4000 line changes to the iwlwifi directory alone.
The snapshots do not seem to download from gitweb. 
Do you have any other pointers on how I can check out archives between those
dates? I am just trying to narrow it down.
Comment 9 Traxtopel 2008-06-12 14:55:03 EDT
Ignore the compat-wireless tarballs test.
Do you have the individual patches mentioned in the
wireless-2_6_25_4-26_fc9.shortlog?

I can try to apply/compile them individually (if possible).
Comment 10 John W. Linville 2008-06-12 15:14:06 EDT
You sure you want to do that? :-)

The individual patches are all in wireless-testing tree (available through 
git):

   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-testing.git

You'll have to use git-log to find the patch from its entry in the short log, 
then git-show based on each patch's commit ID.

Good luck!  You must be really motivated... :-)
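
The lookup described above (find a commit by its shortlog subject, then show the patch) could be scripted roughly as follows. This is a sketch, not a tested workflow: it assumes a local clone of the wireless-testing tree, the helper names are hypothetical, and the subject passed in must be copied from the shortlog by hand. The git options used (--fixed-strings, --grep, --format=%H) are standard git log options.

```python
import subprocess


def find_commit_cmd(subject):
    """Build a git log command that prints the IDs of commits whose message contains subject."""
    return ["git", "log", "--fixed-strings", f"--grep={subject}", "--format=%H"]


def show_commit_cmd(commit_id):
    """Build a git show command that prints one commit as a patch."""
    return ["git", "show", commit_id]


def patches_for(subject, repo_dir):
    """Run both steps inside repo_dir (assumed to be a clone of wireless-testing)."""
    found = subprocess.run(find_commit_cmd(subject), cwd=repo_dir, check=True,
                           capture_output=True, text=True).stdout
    return [
        subprocess.run(show_commit_cmd(commit_id), cwd=repo_dir, check=True,
                       capture_output=True, text=True).stdout
        for commit_id in found.split()
    ]
```

Each returned string is the text of one patch, which could be saved to a file and applied individually.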
Comment 11 Traxtopel 2008-06-12 15:27:48 EDT
To be honest I do not want to, but if the patch is something trivial (big doubt),
then I can use it to create an updated module for my el5.2 t61 wireless users.

It's fixed upstream; when can I hope to see the fix in your jwl test el5.2 kernels?
Comment 12 John W. Linville 2008-06-12 15:53:40 EDT
Those kernels already have wireless bits up to date with 2.6.25.  Since 2.6.26 
is not released, I currently have no plans to move them forward from there 
unless I identify specific fixes.
Comment 13 Traxtopel 2008-06-13 09:34:54 EDT
One curious thing: using the same version of wpa_supplicant, I do see one
difference.
In the failing case, it sends my "identity" before "EAPOL: startWhen --> 0"
starts. Could this be the problem? I repeated this multiple times, and every time I
see the same results.

I will call it bad code/good code (bad naming, but it makes it clear).

i.e.

#Bad Code#
Associates
Sends Identity
starts "EAPOL: startWhen --> 0"

#Good Code#
Associates
starts "EAPOL: startWhen --> 0"
Sends Identity

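The ordering difference listed above could be checked mechanically across many wpa_supplicant -ddd traces. A minimal sketch, assuming the "EAP: using real identity" line marks the point where the identity is sent; the function name and the returned labels are hypothetical.

```python
START_MARKER = "EAPOL: startWhen --> 0"
IDENTITY_MARKER = "EAP: using real identity"


def classify_trace(lines):
    """Report which of the two events appears first in a debug trace.

    Returns "identity-first" (the failing pattern described above),
    "start-first" (the working pattern), or "unknown" if either
    marker is missing from the trace.
    """
    start_at = None
    identity_at = None
    for index, line in enumerate(lines):
        text = line.strip()
        if start_at is None and text == START_MARKER:
            start_at = index
        if identity_at is None and text.startswith(IDENTITY_MARKER):
            identity_at = index
    if start_at is None or identity_at is None:
        return "unknown"
    return "identity-first" if identity_at < start_at else "start-first"
```

Only the first occurrence of each marker is compared, so a trace containing several authentication attempts would need to be split per attempt first.
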
################################################################
#Bad code#
Setting authentication timeout: 70 sec 0 usec
EAPOL: Received EAP-Packet frame
EAPOL: SUPP_PAE entering state RESTART
EAP: EAP entering state INITIALIZE
EAP: EAP entering state IDLE
EAPOL: SUPP_PAE entering state AUTHENTICATING
EAPOL: SUPP_BE entering state REQUEST
EAPOL: getSuppRsp
EAP: EAP entering state RECEIVED
EAP: Received EAP-Request id=1 method=1 vendor=0 vendorMethod=0
EAP: EAP entering state IDENTITY
CTRL-EVENT-EAP-STARTED EAP authentication started
EAP: EAP-Request Identity data - hexdump_ascii(len=42):
     00 6e 65 74 77 6f 72 6b 69 64 3d 49 42 4d 2c 6e   _networkid=IBM,n
     61 73 69 64 3d 72 63 78 2d 61 70 2d 30 35 31 2d   asid=rcx-ap-051-
     31 2c 70 6f 72 74 69 64 3d 30                     1,portid=0
EAP: using real identity - hexdump_ascii(len=27):
     67 72 61 6e 74 5f 77 69 6c 6c 69 61 6d 73 6f 6e   grant_williamson
     40 6e 6c 2e 69 62 6d 2e 63 6f 6d                  @nl.ibm.com
EAP: EAP entering state SEND_RESPONSE
EAP: EAP entering state IDLE
EAPOL: SUPP_BE entering state RESPONSE
EAPOL: txSuppRsp
TX EAPOL: dst=00:0e:83:60:04:70
TX EAPOL - hexdump(len=36): 01 00 00 20 02 01 00 20 01 67 72 61 6e 74 5f 77 69
6c 6c 69 61 6d 73 6f 6e 40 6e 6c 2e 69 62 6d 2e 63 6f 6d
EAPOL: SUPP_BE entering state RECEIVE
EAPOL: startWhen --> 0
################################################################

Whereas, using the newer modules:
################################################################
# Good Code #
Setting authentication timeout: 70 sec 0 usec
EAPOL: Received EAP-Packet frame
EAPOL: SUPP_PAE entering state RESTART
EAP: EAP entering state INITIALIZE
EAP: EAP entering state IDLE
EAPOL: SUPP_PAE entering state AUTHENTICATING
EAPOL: SUPP_BE entering state REQUEST
EAPOL: getSuppRsp
EAP: EAP entering state RECEIVED
EAP: Received EAP-Request id=2 method=1 vendor=0 vendorMethod=0
EAP: EAP entering state IDENTITY
CTRL-EVENT-EAP-STARTED EAP authentication started
EAP: EAP-Request Identity data - hexdump_ascii(len=42):
     00 6e 65 74 77 6f 72 6b 69 64 3d 49 42 4d 2c 6e   _networkid=IBM,n
     61 73 69 64 3d 72 63 78 2d 61 70 2d 30 34 74 2d   asid=rcx-ap-04t-
     31 2c 70 6f 72 74 69 64 3d 30                     1,portid=0      
EAP: using real identity - hexdump_ascii(len=27):
     67 72 61 6e 74 5f 77 69 6c 6c 69 61 6d 73 6f 6e   grant_williamson
     40 6e 6c 2e 69 62 6d 2e 63 6f 6d                  @nl.ibm.com     
EAP: EAP entering state SEND_RESPONSE
EAP: EAP entering state IDLE
EAPOL: SUPP_BE entering state RESPONSE
EAPOL: txSuppRsp
TX EAPOL: dst=00:0e:83:39:a6:80
################################################################

Comment 14 Traxtopel 2008-06-13 09:41:48 EDT
Wrong paste, these are the correct logs

#Bad Modules#
Setting authentication timeout: 70 sec 0 usec
EAPOL: Received EAP-Packet frame
EAPOL: SUPP_PAE entering state RESTART
EAP: EAP entering state INITIALIZE
EAP: EAP entering state IDLE
EAPOL: SUPP_PAE entering state AUTHENTICATING
EAPOL: SUPP_BE entering state REQUEST
EAPOL: getSuppRsp
EAP: EAP entering state RECEIVED
EAP: Received EAP-Request id=1 method=1 vendor=0 vendorMethod=0
EAP: EAP entering state IDENTITY
CTRL-EVENT-EAP-STARTED EAP authentication started
EAP: EAP-Request Identity data - hexdump_ascii(len=42):
     00 6e 65 74 77 6f 72 6b 69 64 3d 49 42 4d 2c 6e   _networkid=IBM,n
     61 73 69 64 3d 72 63 78 2d 61 70 2d 30 35 31 2d   asid=rcx-ap-051-
     31 2c 70 6f 72 74 69 64 3d 30                     1,portid=0
EAP: using real identity - hexdump_ascii(len=27):
     67 72 61 6e 74 5f 77 69 6c 6c 69 61 6d 73 6f 6e   grant_williamson
     40 6e 6c 2e 69 62 6d 2e 63 6f 6d                  @nl.ibm.com
EAP: EAP entering state SEND_RESPONSE
EAP: EAP entering state IDLE
EAPOL: SUPP_BE entering state RESPONSE
EAPOL: txSuppRsp
TX EAPOL: dst=00:0e:83:60:04:70
TX EAPOL - hexdump(len=36): 01 00 00 20 02 01 00 20 01 67 72 61 6e 74 5f 77 69
6c 6c 69 61 6d 73 6f 6e 40 6e 6c 2e 69 62 6d 2e 63 6f 6d
EAPOL: SUPP_BE entering state RECEIVE
EAPOL: startWhen --> 0

#Good Modules#
EAPOL: startWhen --> 0
EAPOL: SUPP_PAE entering state CONNECTING
EAPOL: txStart
TX EAPOL: dst=00:0e:83:39:a6:80
TX EAPOL - hexdump(len=4): 01 01 00 00
RX EAPOL from 00:0e:83:39:a6:80
RX EAPOL - hexdump(len=51): 01 00 00 2f 01 02 00 2f 01 00 6e 65 74 77 6f 72 6b
69 64 3d 49 42 4d 2c 6e 61 73 69 64 3d 72 63 78 2d 61 70 2d 30 34 74 2d 31 2c 70
6f 72 74 69 64 3d 30
Setting authentication timeout: 70 sec 0 usec
EAPOL: Received EAP-Packet frame
EAPOL: SUPP_PAE entering state RESTART
EAP: EAP entering state INITIALIZE
EAP: EAP entering state IDLE
EAPOL: SUPP_PAE entering state AUTHENTICATING
EAPOL: SUPP_BE entering state REQUEST
EAPOL: getSuppRsp
EAP: EAP entering state RECEIVED
EAP: Received EAP-Request id=2 method=1 vendor=0 vendorMethod=0
EAP: EAP entering state IDENTITY
CTRL-EVENT-EAP-STARTED EAP authentication started
EAP: EAP-Request Identity data - hexdump_ascii(len=42):
     00 6e 65 74 77 6f 72 6b 69 64 3d 49 42 4d 2c 6e   _networkid=IBM,n
     61 73 69 64 3d 72 63 78 2d 61 70 2d 30 34 74 2d   asid=rcx-ap-04t-
     31 2c 70 6f 72 74 69 64 3d 30                     1,portid=0
EAP: using real identity - hexdump_ascii(len=27):
     67 72 61 6e 74 5f 77 69 6c 6c 69 61 6d 73 6f 6e   grant_williamson
     40 6e 6c 2e 69 62 6d 2e 63 6f 6d                  @nl.ibm.com
EAP: EAP entering state SEND_RESPONSE
EAP: EAP entering state IDLE
EAPOL: SUPP_BE entering state RESPONSE
EAPOL: txSuppRsp
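
The EAPOL hexdumps in these traces can be decoded to confirm what each frame is. The sketch below parses the 4-byte EAPOL header (version, packet type, body length) and, for EAP-Packet frames, the EAP header (code, identifier, length, type). The function name parse_eapol and the dictionary keys are hypothetical; the two sample frames are copied from the "Good Modules" trace above.

```python
import struct

EAPOL_TYPES = {0: "EAP-Packet", 1: "EAPOL-Start", 2: "EAPOL-Logoff", 3: "EAPOL-Key"}
EAP_CODES = {1: "Request", 2: "Response", 3: "Success", 4: "Failure"}

# TX frame from the trace: hexdump(len=4)
START_FRAME = "01 01 00 00"
# RX frame from the trace: hexdump(len=51), rejoined from its wrapped lines
REQUEST_FRAME = (
    "01 00 00 2f 01 02 00 2f 01 00 6e 65 74 77 6f 72 6b "
    "69 64 3d 49 42 4d 2c 6e 61 73 69 64 3d 72 63 78 2d 61 70 2d 30 34 74 2d 31 2c 70 "
    "6f 72 74 69 64 3d 30"
)


def parse_eapol(hexdump):
    """Decode an EAPOL frame given as space-separated hex bytes."""
    data = bytes.fromhex(hexdump)
    version, ptype, body_len = struct.unpack("!BBH", data[:4])
    frame = {
        "version": version,
        "type": EAPOL_TYPES.get(ptype, ptype),
        "body_len": body_len,
    }
    if ptype == 0 and body_len >= 4:
        code, ident, eap_len = struct.unpack("!BBH", data[4:8])
        frame["eap_code"] = EAP_CODES.get(code, code)
        frame["eap_id"] = ident
        frame["eap_len"] = eap_len
        if code in (1, 2) and eap_len > 4:
            # Requests and responses carry a type byte (1 = Identity) plus data.
            frame["eap_type"] = data[8]
            frame["eap_data"] = data[9:4 + eap_len]
    return frame


if __name__ == "__main__":
    print(parse_eapol(START_FRAME))
    print(parse_eapol(REQUEST_FRAME))
```

Decoded this way, the 4-byte TX frame is an EAPOL-Start and the 51-byte RX frame is an EAP Request of type Identity with id=2, which matches the "EAP: Received EAP-Request id=2 method=1" line in the same trace.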
Comment 15 John W. Linville 2008-06-13 10:07:09 EDT
Dan, you are much better at interpreting wpa_supplicant traces than I am...?
Comment 16 Traxtopel 2008-06-16 10:12:55 EDT
The latest kernel 2.6.25.6-55.fc9 works. Connects every time.
This bug for fc9 can be closed, will continue on el5 bug report.
