| Summary: | intel 4965 wireless drops connections | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Mike Iglesias <iglesias> | ||||||||
| Component: | kernel | Assignee: | Stanislaw Gruszka <sgruszka> | ||||||||
| Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||
| Severity: | unspecified | Docs Contact: | |||||||||
| Priority: | unspecified | ||||||||||
| Version: | 14 | CC: | gansalmon, itamar, jonathan, kernel-maint, linville, madhu.chinakonda | ||||||||
| Target Milestone: | --- | ||||||||||
| Target Release: | --- | ||||||||||
| Hardware: | Unspecified | ||||||||||
| OS: | Unspecified | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | Environment: | ||||||||||
| Last Closed: | 2011-10-24 11:41:02 UTC | Type: | --- | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Description
Mike Iglesias
2011-10-03 17:56:29 UTC
The compat-wireless package needs some patching to work with the current F14 kernels. Lucky for you, I've been working on a related project... :-)

First, you will need to edit /etc/depmod.d/dist.conf to replace the last line with a line like this:

search updates extra backports built-in weak-updates

Then, you need to install _both_ the kernel and kernel-backports packages available from here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=3395348

That will let you test the compat-wireless version of the iwl4965 driver.

Created attachment 526118 [details]
/var/log/messages for intel 4965 failure
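The dist.conf edit described above (replacing the last line with the new search line) could be scripted roughly as follows. This is only a sketch: `replace_last_line` is a hypothetical helper, not part of any Fedora tool, and the sample file contents used with it are an assumption. It works on text rather than writing to /etc/depmod.d/dist.conf, so the result can be reviewed before it is copied into place.

```python
# Search line quoted in the instructions above.
SEARCH_LINE = "search updates extra backports built-in weak-updates"


def replace_last_line(conf_text: str, new_line: str) -> str:
    """Return conf_text with its last non-blank line replaced by new_line.

    Hypothetical helper for illustration; it does not touch any file.
    """
    lines = conf_text.splitlines()
    # Walk back past trailing blank lines to find the last real line.
    idx = len(lines) - 1
    while idx >= 0 and not lines[idx].strip():
        idx -= 1
    if idx < 0:
        # Empty (or all-blank) input: the new line becomes the whole file.
        lines = [new_line]
    else:
        lines[idx] = new_line
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed stand-in for the real file; the actual dist.conf may differ.
    sample = "# example dist.conf\nsearch updates extra built-in weak-updates\n"
    print(replace_last_line(sample, SEARCH_LINE), end="")
```

After changing the real file, the module dependency data would normally need regenerating (for example with depmod) before the backports modules are preferred; that step is not shown here.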
About 25 minutes after booting the new kernel+backports, it successfully renewed the lease from the dhcp server. About 4 minutes later, it noted that it was calling CRDA to update world regulatory domain, updated the domain to US, and about a minute after that it was no longer able to talk to the dhcp server. Maybe that has something to do with the problem?

Are you from a country other than the US? If so, setting the proper regdomain might help; see https://bugzilla.redhat.com/show_bug.cgi?id=709803#c13

I'm from the US. I looked through the messages files on the laptop, and while it didn't happen every time it updated the regulatory domain, there are several occurrences where the wireless stopped working shortly after that took place.

I've updated the compat-wireless packages; they should work now.

Mike, you checked the latest code and 2.6.35. Did you also check some other version (older, or somewhere between 2.6.35 and the latest)? Is there any version on which the driver works for you?

My feeling is that it's been a problem since I upgraded from Fedora 13 to Fedora 14. I'm pretty sure 13 was much better than 14 has been.

I did some testing this morning and it appears that something tries to set the domain to US about 30 minutes after the system is booted, and at that point the wireless stops working. I set up a script that did a "ping -c 5 ip-addr" and then slept for 60 seconds, and repeated that. The set of pings just before the domain was set to US worked, and the pings after, as well as the dhcp lease renewal, did not go through.

I did this testing with the iwlagn debug flags set to 0x43fff, and I have extracted the information from /var/log/messages from about 2 minutes before the domain setting to some time after the wireless was not working. I will upload that shortly in case you want to look at it.

Why would it need to set the domain to US after the system is up? It did it at least once while it was booting, so it seems to me it shouldn't need to do it again.
If it makes any difference, I'm not using NetworkManager.

Created attachment 527749 [details]
/var/log/messages entries showing wireless failure
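The ping test described above ("ping -c 5 ip-addr", sleep 60 seconds, repeat) was not attached to this report. A rough equivalent, offered only as a sketch, might look like the following. The function names and the 192.0.2.1 address are placeholders, not the reporter's actual script or network.

```python
import subprocess
import time
from datetime import datetime
from typing import Optional


def ping_once(host: str, count: int = 5) -> bool:
    """Run `ping -c <count> <host>` and return True if ping exited with 0."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def format_result(when: datetime, host: str, ok: bool) -> str:
    """Build one timestamped log line so failures can be matched to syslog."""
    status = "ok" if ok else "FAILED"
    return f"{when:%Y-%m-%d %H:%M:%S} ping {host}: {status}"


def monitor(host: str, interval: float = 60.0, rounds: Optional[int] = None) -> None:
    """Ping, log the outcome, sleep, and repeat (forever if rounds is None)."""
    done = 0
    while rounds is None or done < rounds:
        ok = ping_once(host)
        print(format_result(datetime.now(), host, ok), flush=True)
        done += 1
        time.sleep(interval)


if __name__ == "__main__":
    # Placeholder address; substitute a host that is reachable over wlan1.
    monitor("192.0.2.1")
```

Timestamping each round makes it easier to line up the first failed ping with the corresponding entries in /var/log/messages.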
(In reply to comment #7)
> My feeling is that it's been a problem since I upgraded from Fedora 13 to
> Fedora 14. I'm pretty sure 13 was much better than 14 has been.

That probably means a regression introduced between 2.6.34 and 2.6.35. 2.6.35 is well known for various iwlwifi regressions. Some of them are fixed now, some are not. Some time ago I prepared 5 patches which downgrade the 2.6.35 iwlwifi driver to 2.6.34, to help narrow down the regression. They are attached to a bug report in bugzilla.kernel.org, but unfortunately that service is down, and I'm not sure when it will be restarted. So for now we can only debug this problem based on logs.

> set to 0x43fff, and I have extracted the information from /var/log/messages
> from about 2 minutes before the domain setting to some time after the wireless
> was not working. I will upload that shortly in case you want to look at it.

I'm not sure why, but there is a station removal

> Oct 12 10:02:57 dhcp-v041-206 kernel: [ 1835.887404] ieee80211 phy0: U iwl_mac_sta_remove received request to remove station 00:1e:79:d6:7d:02

before the US regulatory domain is set. That is most likely caused by a user space application which for some reason wants to disconnect. However, user space may want to disconnect because of a previous driver malfunction.

> If it makes any difference, I'm not using NetworkManager.

But you still use wpa_supplicant? ;-) Please provide logs from wpa_supplicant and the kernel as described in "Configure syslog to debug kernel and wpa_supplicant" at https://fedoraproject.org/wiki/DebugWireless

For the iwlagn module use debug=0x7fffffff, which shows much more detailed information. If the log file is big, compress it. Thanks.

There were no user-space applications running other than me being logged in on the laptop and the ping script I had running, so I don't know what might have triggered the station removal. I'm not using wpa_supplicant, so there's no log for that. I will try the iwlagn debugging setting next week when I'm back in the office.
How do you configure the connection? Could you show that? I'll check whether I can reproduce the problem locally.

I'm just using a standard ifcfg script in /etc/sysconfig/network-scripts.

DEVICE=wlan1
TYPE=Wireless
USERCTL=yes
IPV6INIT=no
BOOTPROTO=dhcp
ONBOOT=yes
ESSID="UCInet Mobile Access"

I'm going to upload an extract from /var/log/messages with the debug flags set to 0x7fffffff as you requested.

Created attachment 528579 [details]
/var/log/messages entries with full debug flags set
It really looks like the request to disconnect comes from user space. Is NetworkManager not installed at all? If not, could you check whether adding NM_CONTROLLED="no" to network-scripts helps, and also this command: "chkconfig NetworkManager off". Also "chkconfig wpa_supplicant off".

Is ESSID="UCInet Mobile Access" an unprotected network, or is a password provided somehow? I'm sorry for the stupid question, but this wireless interface configuration method is totally unknown to me.

NetworkManager and wpa_supplicant are not running, and adding NM_CONTROLLED="no" did not help. I booted the laptop and did not log in, and almost exactly 30 minutes after it booted, it lost the wireless connection. That would rule out anything in user space that is started when I log in as being the culprit.

I found this in the dmesg output:

[ 1836.069733] wlan1: deauthenticated from 00:1e:79:d6:7d:02 (Reason: 1)

What does a reason code of 1 mean?

One other interesting thing is that when the wireless connection dies, if I do

/sbin/ifdown wlan1
/sbin/ifup wlan1

it dies again 30 minutes after the /sbin/ifup, so it appears to be something related to starting the wireless connection. The only thing I can think of that is started is dhclient. I changed the way dhclient started so it was doing verbose logging (-v option) but that didn't reveal anything either. Is there a way to determine what user space process might be doing this?

(In reply to comment #16)
> [ 1836.069733] wlan1: deauthenticated from 00:1e:79:d6:7d:02 (Reason: 1)
>
> What does a reason code of 1 mean?

It means UNSPECIFIED :-( So it is not a user space disconnect but one from the AP, and we don't know why.

> it dies again 30 minutes after the /sbin/ifup, so it appears to be something
> related to starting the wireless connection.

I'm not sure what could cause such behaviour.
Perhaps we do something wrong that makes the AP want to deauthenticate us, or maybe this is the configured behaviour of the AP - a default deauth after 30 minutes. (If so, using wpa_supplicant could help with that issue, as it should automatically reauthenticate after deauthentication.)

What encryption is used on that wireless network?

(In reply to comment #17)
> It means UNSPECIFIED :-( So it is not a user space disconnect but one from the
> AP, and we don't know why.

I'll talk to our network engineer about this and see if he has any idea why this might be happening.

> I'm not sure what could cause such behaviour. Perhaps we do something wrong
> that makes the AP want to deauthenticate us, or maybe this is the configured
> behaviour of the AP - a default deauth after 30 minutes. (If so, using
> wpa_supplicant could help with that issue, as it should automatically
> reauthenticate after deauthentication.)
>
> What encryption is used on that wireless network?

We use MAC address authorization (people need to register the wireless MAC address to be allowed on the network), and we're not using encryption at this time.

According to our network engineer, the APs are set up with a session timeout that requires the clients to reauthenticate every 30 minutes. My laptop has Vista on it and it seems to work ok when Vista is running. The network engineer did tell me that some Mac OS-X systems seem to have trouble with this too, but for the most part clients are working ok with that setting on the APs.

Ok, I see. We should handle such a situation in user space. I think wpa_supplicant is able to reassociate automatically once the AP disassociates. It can be configured for an unencrypted network like this:
ctrl_interface=/var/run/wpa_supplicant
network={
ssid="UCInet Mobile Access"
key_mgmt=NONE
}
Otherwise NetworkManager should handle that. If that does not work, reopen the bug and change it to the proper component.
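For anyone checking their own logs for the same pattern, the deauthentication line quoted earlier in this report can be picked apart with a short script. The sketch below is illustrative only: `parse_deauth` and the regular expression are hypothetical, the pattern is written against the single dmesg line shown in this report, and the table lists only a few IEEE 802.11 reason codes with abbreviated descriptions.

```python
import re

# A few IEEE 802.11 reason codes, descriptions abbreviated.
# Codes not listed here are reported as "unknown" by this sketch.
REASON_CODES = {
    1: "Unspecified reason",
    2: "Previous authentication no longer valid",
    3: "Deauthenticated because sending station is leaving",
    4: "Disassociated due to inactivity",
}

# Assumed line shape, taken from the dmesg output quoted in this report:
# [ 1836.069733] wlan1: deauthenticated from 00:1e:79:d6:7d:02 (Reason: 1)
DEAUTH_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<iface>\S+): deauthenticated from "
    r"(?P<bssid>[0-9a-fA-F:]{17}) \(Reason: (?P<reason>\d+)\)"
)


def parse_deauth(line):
    """Return details of a deauthentication log line, or None if no match."""
    match = DEAUTH_RE.search(line)
    if match is None:
        return None
    reason = int(match.group("reason"))
    return {
        "iface": match.group("iface"),
        "bssid": match.group("bssid"),
        "reason": reason,
        "reason_text": REASON_CODES.get(reason, "unknown"),
        # Kernel log timestamps count seconds since boot.
        "minutes_since_boot": float(match.group("uptime")) / 60.0,
    }


if __name__ == "__main__":
    sample = "[ 1836.069733] wlan1: deauthenticated from 00:1e:79:d6:7d:02 (Reason: 1)"
    print(parse_deauth(sample))
```

Converting the kernel timestamp to minutes makes a fixed-interval drop, such as the 30-minute AP session timeout found here, easier to spot across several occurrences.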