Bug 489039 - iwlagn fails to scan networks
Summary: iwlagn fails to scan networks
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 11
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 513462
 
Reported: 2009-03-06 21:21 UTC by Bryan O'Sullivan
Modified: 2010-01-21 13:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-01-21 13:47:52 UTC
Type: ---
Embargoed:


Attachments
output of "lspci -n" (717 bytes, text/plain)
2009-05-13 05:22 UTC, Bryan O'Sullivan

Description Bryan O'Sullivan 2009-03-06 21:21:31 UTC
Description of problem:

About 70% of the time, the iwlagn driver fails to notice nearby networks when it scans. In these situations, it finds exactly one network (always the same one), and then never sees any of the other 12+ networks in range, including my own.

How reproducible:

This problem occurs 70% of the time. It doesn't seem to matter whether I've done a cold boot or a suspend/resume cycle.

Usually, after six to ten driver rmmod/modprobe cycles, the driver will notice the nearby networks properly, and I can then associate to my AP successfully.

However, there's about a 1/3 likelihood that the rmmod/modprobe cycles will cause the kernel to hang and require a hard (hold down the power button) reboot.

I *suspect* that the AP that always gets detected when this problem occurs has had its transmit power boosted by its owner, but I've no idea how to verify that. Regardless, the fact that sometimes all other networks are seen and sometimes none are seen points to a driver bug.

The end result of this is that because of the driver flakiness, constant fiddling, and forced reboots, it's more trouble than it's worth to me to ever suspend my laptop, which is both a shame and a waste of electricity.

Steps to Reproduce:
1. Boot.
2. Log into Gnome.
3. Notice just one wireless network in the network picklist.

Comment 1 Edouard Bourguignon 2009-03-22 16:45:24 UTC
I've got a similar problem, except that I see no wireless networks at all. I generally remove and reload iwlagn once, and then it's OK until the next reboot.

Comment 2 Bryan O'Sullivan 2009-03-22 16:59:09 UTC
I definitely see the "no networks" problem pretty often, but it's about as common for me to see just one of the dozen around me.

Comment 3 Edouard Bourguignon 2009-05-02 09:28:51 UTC
This problem also occurs on RawHide, except that we can see networks but can't authenticate...

wlan0 direct probe responded
wlan0: authenticate with AP 00:17:33:b0:3b:80
wlan0: authenticated
wlan0: associate with AP 00:17:33:b0:3b:80
wlan0: RX AssocResp from 00:17:33:b0:3b:80 (capab=0x401 status=0 aid=1)
wlan0: associated
wlan0: disassociating by local choice (reason=3)
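
For readers of the log above: the number in "reason=3" is an IEEE 802.11 reason code. The lookup below is only a sketch covering the two codes that show up in this report (3 here, 6 in a later comment); the function name reason_text is made up for illustration, and the wording of each description is paraphrased from the standard.

```shell
#!/bin/sh
# Hypothetical helper: map an 802.11 reason code to a short description.
# Only the two codes seen in this bug report are covered.
reason_text() {
    case $1 in
        3) echo 'Deauthenticated because the sending station is leaving (or has left) the IBSS or ESS' ;;
        6) echo 'Class 2 frame received from a nonauthenticated station' ;;
        *) echo "reason code $1 is not covered by this sketch" ;;
    esac
}

reason_text 3
```

Code 3 matches the "by local choice" wording in the log: the local station itself chose to leave.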

Comment 4 John W. Linville 2009-05-04 13:31:45 UTC
What are the kernel versions in use?

Comment 5 Edouard Bourguignon 2009-05-04 17:00:54 UTC
2.6.29.1-111.fc11.x86_64 for me.

Comment 6 Edouard Bourguignon 2009-05-05 16:58:56 UTC
After a couple of hours, nm-applet crashed and the errors below appeared in dmesg:

E: hpet increasing min_delta_ns to 33750 nsec
CE: hpet increasing min_delta_ns to 50624 nsec
iwlagn: Read index for DMA queue txq_id (2) index 234 is out of range [0-256] 238 237
iwlagn: Read index for DMA queue txq_id (2) index 231 is out of range [0-256] 238 237
iwlagn: Read index for DMA queue txq_id (2) index 233 is out of range [0-256] 238 237
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
wlan0: No ProbeResp from current AP a6:b5:fa:f7:ac:c8 - assume out of range
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: No space for Tx
iwlagn: Error sending REPLY_TX_POWER_DBM_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_TX_POWER_DBM_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_TX_POWER_DBM_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_TX_POWER_DBM_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_TX_POWER_DBM_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_TX_POWER_DBM_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
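
A log this repetitive is easier to read as counts per distinct message. The sketch below is one possible way to do that with standard tools; the function name summarize_iwlagn_errors is made up for illustration, and the sample is a shortened copy of the lines above. On Linux, errno 28 is ENOSPC ("No space left on device"), which is consistent with the "No space for Tx" lines that accompany the -28 failures.

```shell
#!/bin/sh
# Hypothetical helper: count each distinct iwlagn message in kernel log
# text read from stdin. Anything before "iwlagn: " (for example a
# timestamp) is stripped so identical messages group together.
summarize_iwlagn_errors() {
    sed -n 's/^.*\(iwlagn: .*\)$/\1/p' | sort | uniq -c | sort -rn
}

# Shortened sample based on the log in this comment.
sample='iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: Error sending REPLY_SCAN_CMD: time out after 500ms.
iwlagn: No space for Tx
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28'

printf '%s\n' "$sample" | summarize_iwlagn_errors
```

On a live system the input would presumably come from `dmesg | summarize_iwlagn_errors` instead of the sample.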

Comment 7 Bryan O'Sullivan 2009-05-05 17:05:50 UTC
For me, it's happened with every F-10 kernel. I'm currently running kernel-2.6.27.21-170.2.56.fc10.x86_64.

Comment 8 Edouard Bourguignon 2009-05-12 07:49:54 UTC
I'm still having the "disassociating by local choice (reason=3)" problem on kernel 2.6.29.2-126.fc11.x86_64.

Comment 9 John W. Linville 2009-05-12 12:35:49 UTC
Please attach the output of 'lspci -n'.  Also, the output of
'rpm -q iwl4965-firmware' and 'rpm -q iwl5000-firmware'...thanks!

Comment 10 Bryan O'Sullivan 2009-05-13 05:22:53 UTC
Created attachment 343703
output of "lspci -n"

I've attached the lspci output.

Firmware versions:
iwl4965-firmware-228.57.2.23-2.noarch
iwl5000-firmware-5.4.A.11-3.noarch

Comment 11 John W. Linville 2009-05-13 13:14:44 UTC
FWIW, this driver defaults to letting the device firmware handle scanning.  There is an option to switch it back to doing scanning at the CPU (like most other mac80211-based drivers).  Please drop a file into /etc/modprobe.d (e.g. /etc/modprobe.d/iwlagn.options) with a line in it like this:

   options iwlagn disable_hw_scan=1

After that, either execute 'modprobe -r iwlagn ; modprobe iwlagn' or simply reboot.  Does this change/improve scanning behavior?
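
The steps above can be sketched as a small script. This is an illustration rather than a tested procedure: the function name write_iwlagn_option and the CONF_DIR variable are made up, the file name iwlagn.conf follows the one used in a later comment, and the script writes to a scratch directory by default so it can be tried without root. On a real system CONF_DIR would be /etc/modprobe.d.

```shell
#!/bin/sh
# Hypothetical helper: write the option line into a modprobe.d-style
# directory given as the first argument.
write_iwlagn_option() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    printf 'options iwlagn disable_hw_scan=1\n' > "$conf_dir/iwlagn.conf"
}

# Default to a throwaway directory; set CONF_DIR=/etc/modprobe.d (as root)
# to apply the change for real.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}
write_iwlagn_option "$CONF_DIR"
cat "$CONF_DIR/iwlagn.conf"

# Reload step from the comment above, left commented out because it
# needs root and drops the wireless connection:
# modprobe -r iwlagn ; modprobe iwlagn
```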

Comment 12 Edouard Bourguignon 2009-05-13 19:57:34 UTC
I now have "wlan0: deauthenticated (Reason: 6)" in dmesg with newly updated kernel 2.6.29.3-140.fc11.x86_64.

I'm trying /etc/modprobe.d/iwlagn.conf with hw scan disabled, and it seems a little bit better. I hope it will also be more stable.

Here is my pci/vendor id: 
03:00.0 0280: 8086:4236

Firmware versions:
iwl5000-firmware-5.4.A.11-4.noarch
iwl4965-firmware-228.57.2.23-5.fc11.noarch
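
For anyone reading the `lspci -n` line above, its fields are slot, class code, and vendor:device ID. The sketch below splits them apart; the function name parse_lspci_n is made up for illustration. Vendor 8086 is Intel, and class 0280 falls under "network controller, other" in the PCI class codes.

```shell
#!/bin/sh
# Hypothetical helper: split one line of `lspci -n` output into fields.
# Expected input shape: "<slot> <class>: <vendor>:<device> [...]"
parse_lspci_n() {
    # Word-split the line into positional parameters.
    set -- $1
    slot=$1
    class=${2%:}        # drop the trailing colon after the class code
    vendor=${3%%:*}     # part before the colon
    device=${3##*:}     # part after the colon
    printf 'slot=%s class=%s vendor=%s device=%s\n' \
        "$slot" "$class" "$vendor" "$device"
}

parse_lspci_n '03:00.0 0280: 8086:4236'
# prints: slot=03:00.0 class=0280 vendor=8086 device=4236
```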

Comment 13 Bryan O'Sullivan 2009-05-20 03:15:07 UTC
I've successfully associated three times in a row with hw scanning disabled, in the environment that's normally so problematic, and both scanning and associating seem to be much faster now too.

Comment 14 John W. Linville 2009-07-21 19:32:37 UTC
Firmware updates are available:

   yum --enablerepo=updates-testing update iwl4965-firmware iwl5000-firmware

Do these updates improve the situation?

Comment 15 Edouard Bourguignon 2009-07-27 11:46:49 UTC
I just had the same problem on F11 with kernel-2.6.29.6-213. Do I have to try with updates-testing?

Comment 16 Edouard Bourguignon 2009-07-27 11:49:40 UTC
It doesn't seem to happen every time I boot Fedora. It was working with a fresh F11 install, but after updating and rebooting a few times, NetworkManager failed to connect. After a modprobe -r iwlagn && modprobe iwlagn, I'm able to connect to wireless networks.

Comment 17 John W. Linville 2009-07-27 12:54:01 UTC
Honestly, that sounds like a different issue...

Comment 18 Edouard Bourguignon 2009-07-27 14:19:12 UTC
I had the "fails to scan networks" problem before, but now I can always scan the networks. So the scanning part of the problem seems to be OK for me.
I should open a new bug for iwlagn failing to associate with networks.

Comment 19 Bryan O'Sullivan 2009-07-28 16:59:19 UTC
John, I'll take a look at the updates tonight.

Comment 20 Bryan O'Sullivan 2009-08-03 22:36:30 UTC
I've got the updated firmware running, but no news to report yet. (i.e. no discernible difference, due to insufficient thrashing around)

Comment 21 Bryan O'Sullivan 2009-08-08 17:40:26 UTC
It's worked reliably for a few days in my environment where it usually has problems. I wouldn't exactly call this "data" yet, but it's something.

Comment 22 Edouard Bourguignon 2009-08-09 08:41:57 UTC
Same for me: I can now transfer big files, which was impossible a few days ago. But this morning it failed to associate again, and I had to reload the iwlagn driver :(

Comment 23 Bryan O'Sullivan 2009-08-10 03:50:36 UTC
Edouard: I think you and I have different problems. I never lose associations once they're made. I just wasn't able to see any more than 1 AP (always the same one) about 80% of the time.

Comment 24 Stanislaw Gruszka 2010-01-18 09:59:04 UTC
Bryan,

Is scanning OK with recent F11 kernels/firmware? I would like to close this bug.


Edouard,

This bug: 
iwlagn: Error sending REPLY_SCAN_CMD: enqueue_hcmd failed: -28
iwlagn: No space for Tx
is not solved yet, even upstream. Fedora bug report is here:
https://bugzilla.redhat.com/show_bug.cgi?id=493018
and Intel bugzilla entry here:
http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2037
Please CC yourself and comment to help solve this problem.

Comment 25 Stanislaw Gruszka 2010-01-21 13:47:52 UTC
OK, according to the previous comment this bug is fixed, so I'm closing it.

