Bug 473507 - iwl3945 wireless driver causes Kernel Panic
Summary: iwl3945 wireless driver causes Kernel Panic
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-11-28 23:23 UTC by Zoltan Balogh
Modified: 2008-12-16 16:37 UTC (History)
CC List: 7 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2008-12-16 16:37:17 UTC
Type: ---
Embargoed:


Attachments
lspci -v; ifconfig wlan0; uname -a; cat /proc/cpuinfo; dmesg (36.77 KB, text/plain)
2008-11-28 23:23 UTC, Zoltan Balogh
/var/log/messages (78.08 KB, text/plain)
2008-11-28 23:29 UTC, Zoltan Balogh
/var/log/messages of an entire session which ended with Kernel Panic (1.34 MB, text/plain)
2008-11-28 23:38 UTC, Zoltan Balogh
HOWTO of University on setting up NetworkManager. (42 bytes, text/plain)
2008-11-28 23:45 UTC, Zoltan Balogh
iwlist wlan0 scan from library (6.15 KB, text/plain)
2008-12-01 16:07 UTC, Zoltan Balogh
iwlist wlan0 scan with Access Points that DO cause my Kernel Panics (13.66 KB, text/plain)
2008-12-01 16:11 UTC, Zoltan Balogh
ifconfig wlan0 when connected to Access Point that causes Kernel Panic (481 bytes, text/plain)
2008-12-02 10:52 UTC, Zoltan Balogh
Captured Kernel Oops Message via Netconsole (2.60 KB, text/plain)
2008-12-04 12:45 UTC, Zoltan Balogh
Upstream kernel solved the problem, the issue is Fedora-related. (64.09 KB, text/plain)
2008-12-11 13:08 UTC, Zoltan Balogh

Description Zoltan Balogh 2008-11-28 23:23:02 UTC
Created attachment 325049 [details]
lspci -v; ifconfig wlan0; uname -a; cat /proc/cpuinfo; dmesg

Description of problem:

Whilst using NetworkManager to connect to my university's PPTP VPN, I experience frequent Kernel Panics.

For in-depth details (including my university's VPN setup instructions), please see my attachments. Here is a short description:

The connection is set up via open (non-protected) WiFi access points. (I'll include the output of an "iwlist scan" once I'm there.) NetworkManager successfully associates with the access point. Then I initiate the PPTP connection. This is also a success.

<see attachment of /var/log/messages>

However, after a random amount of time, I experience a hard kernel panic that leaves no trace in the logs.

Here is what's left in /var/log/messages during an actual Kernel Panic:

..
Nov 28 14:35:02 zolinux pptp[6631]: nm-pptp-service-6618 log[decaps_gre:pptp_gre.c:414]: buffering packet 238326 (expecting 238291, lost or reordered)
Nov 28 14:35:02 zolinux pptp[6631]: nm-pptp-service-6618 log[decaps_gre:pptp_gre.c:414]: buffering packet 238327 (expecting 238291, lost or reordered)
Nov 28 14:35:02 zolinux pptp[6631]: nm-pptp-service-6618 log[decaps_gre:pptp_gre.c:414]: buffering packet 238328 (expecting 238291, lost or reordered)
Nov 28 14:35:02 zolinux pptp[6631]: nm-pptp-service-6618 log[decaps_gre:pptp_gre.c:414]: buffering packet 238329 (expecting 238291, lost or reordered)
Nov 28 14:36:25 zolinux vmnetBridge: RTM_NEWLINK: name:wlan0 index:4 flags:0x00011043
Nov 28 14:36:25 zolinux vmnetBridge: Can't add interface wlan0 4 (does exist).

Then I restarted the machine after I got back to my PC:

Nov 28 14:55:41 zolinux kernel: imklog 3.21.3, log source = /proc/kmsg started.
Nov 28 14:55:41 zolinux kernel: Initializing cgroup subsys cpuset
Nov 28 14:55:41 zolinux kernel: Initializing cgroup subsys cpu
Nov 28 14:55:41 zolinux kernel: Linux version 2.6.27.5-117.fc10.i686 (mockbuild.phx.redhat.com) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #1 SMP Tue Nov 18 12:19:59 EST 2008
...

As far as I can recall, the last message in /var/log/messages is always

Nov 28 14:36:25 zolinux vmnetBridge: RTM_NEWLINK: name:wlan0 index:4 flags:0x00011043
Nov 28 14:36:25 zolinux vmnetBridge: Can't add interface wlan0 4 (does exist).

or similar, before the kernel panics.

My /var/log/messages is swamped with messages like

Nov 28 20:19:47 zolinux pptp[6199]: nm-pptp-service-6187 log[decaps_gre:pptp_gre.c:414]: buffering packet 5121 (expecting 5113, lost or reordered)

which is not good in itself, but should never cause a kernel panic. Why does vmnetBridge try to re-add my existing interface? Is it related to NetworkManager-pptp?
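
To quantify how far out of order the tunnel packets arrive, the "buffering packet" lines can be parsed from the log. The following is only a minimal sketch, assuming the message format stays exactly as quoted above; the function name `sequence_gaps` and the sample list are illustrative, not part of pptp or NetworkManager.

```python
import re

# Matches the pptp "buffering packet" messages as quoted in this report.
BUFFERING = re.compile(
    r"buffering packet (\d+) \(expecting (\d+), lost or reordered\)"
)


def sequence_gaps(lines):
    """Return (received, expected, gap) for every matching log line."""
    gaps = []
    for line in lines:
        match = BUFFERING.search(line)
        if match:
            received = int(match.group(1))
            expected = int(match.group(2))
            gaps.append((received, expected, received - expected))
    return gaps


# Two pptp lines copied from this report, plus one unrelated line
# that the parser should skip.
sample = [
    "Nov 28 14:35:02 zolinux pptp[6631]: nm-pptp-service-6618 "
    "log[decaps_gre:pptp_gre.c:414]: buffering packet 238326 "
    "(expecting 238291, lost or reordered)",
    "Nov 28 20:19:47 zolinux pptp[6199]: nm-pptp-service-6187 "
    "log[decaps_gre:pptp_gre.c:414]: buffering packet 5121 "
    "(expecting 5113, lost or reordered)",
    "Nov 28 14:36:25 zolinux vmnetBridge: Can't add interface wlan0 4 (does exist).",
]

for received, expected, gap in sequence_gaps(sample):
    print(f"received {received}, expected {expected}, ahead by {gap}")
```

Running the same function over a whole /var/log/messages file would show whether the gap grows under load, which is the pattern described under "How reproducible" below.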



Version-Release number of selected component (if applicable):

zed@zolinux:/usr/bin$ yum info NetworkManager NetworkManager-pptp
Loaded plugins: fastestmirror, fedorakmod, kernel-module, refresh-packagekit
Installed Packages
Name       : NetworkManager
Arch       : i386
Epoch      : 1
Version    : 0.7.0
Release    : 0.12.svn4326.fc10
Size       : 2.9 M
Repo       : installed
Summary    : Network connection manager and user applications
URL        : http://www.gnome.org/projects/NetworkManager/
License    : GPLv2+
Description: NetworkManager attempts to keep an active network connection available at all times.  It is intended only for the desktop use-case, and is not
           : intended for usage on servers.   The point of NetworkManager is to make networking configuration and setup as painless and automatic as
           : possible.  If using DHCP, NetworkManager is _intended_ to replace default routes, obtain IP addresses from a DHCP server, and change
           : nameservers whenever it sees fit.

Name       : NetworkManager-pptp
Arch       : i386
Epoch      : 1
Version    : 0.7.0
Release    : 0.12.svn4326.fc10
Size       : 403 k
Repo       : installed
Summary    : NetworkManager VPN plugin for pptp
URL        : http://www.gnome.org/projects/NetworkManager/
License    : GPLv2+
Description: This package contains software for integrating PPTP VPN support with the NetworkManager and the GNOME desktop.


How reproducible:

Always; however, the panic occurs after a random amount of time on the network. So far I haven't found a relation between the panics and network load, although the packet sequence numbers are more disordered under high load (such as copying files over the network).

Steps to Reproduce:

1. Associate with wifi access point via GUI.
2. Connect to PPTP VPN via GUI.
3. Start some traffic and wait for Kernel Panic.
  
Actual results:

When using the PPTP VPN tunnel, the Kernel Panics.

Expected results:

Normal PPTP VPN communication, without errors. Successful termination of tunnel upon user's request (disconnect).

Additional info:

Another piece of information that may interest you: when I tried to export my VPN settings (*.conf file) using the GUI, it failed, saying:


Cannot export VPN connection. The VPN connection 'Newcastle University Internal VPN' could not be exported to Newcastle University Internal VPN (pptp).conf.

Error: VPN setting invalid.


I tried creating random PPTP VPNs, but failed to export any of them. This might be a separate, unrelated bug though.

VPN Settings (compare with attached university instructions):

To set up the VPN in the GUI, I entered the name of the connection. (Left the "always connect" checkbox empty.) Under the VPN tab, I filled in the gateway, username and password (saved in keyring). I haven't added the NT Domain because I fail to connect when it is set. (Possible bug...) Under advanced, I allow all authentication methods. I use MPPE encryption. I allow BSD, Deflate and TCP compression, but do not send PPP echo packets, nor do I allow stateful encryption.


Files attached:

Comment 1 Zoltan Balogh 2008-11-28 23:29:08 UTC
Created attachment 325050 [details]
/var/log/messages

Connecting from HOME to the EXTERNAL VPN, which requires identical settings to the Internal one.

Comment 2 Zoltan Balogh 2008-11-28 23:38:44 UTC
Created attachment 325053 [details]
/var/log/messages of an entire session which ended with Kernel Panic

This is from the university, taken today, ended in a sad Kernel Panic.

Comment 3 Zoltan Balogh 2008-11-28 23:45:14 UTC
Created attachment 325054 [details]
HOWTO of University on setting up NetworkManager.

See also: http://docking.ncl.ac.uk/vpn/linux/

As a final note, I only experience kernel panics when using my notebook on the university VPN.

What has not been tested yet:

Whether the kernel panics even if I ONLY associate with the WiFi access point. Whether it works without errors in different buildings, using other access points. (I am usually in one building, on one floor.)

I'm planning to revert to prior versions of NetworkManager and NetworkManager-pptp in order to be able to work at the University.

Thanks for your time and efforts.

Zoltan

Comment 4 Dan Williams 2008-11-29 21:27:12 UTC
kernel panics are kernel issues; the kernel should never panic.

Comment 5 Zoltan Balogh 2008-11-29 22:41:38 UTC
(In reply to comment #4)
> kernel panics are kernel issues; the kernel should never panic.

Fair enough. Please let me know if I can help you by any means to determine the kernel bug.

Also, I think several of the issues mentioned in my bug report are incorrect or unexpected behaviour. Do you agree that I should file them as separate bugs against NetworkManager? (For example: the sequence numbers, exporting VPN conf files, or re-adding the wlan0 interface.)

I'm testing the External VPN from home right now, routing all of my traffic through it. I've had no panics so far, so I suspect an iwl3945 bug may be the root of all the problems. (There are about 10 WiFi access points in range in the room I usually work in, and this might mess things up.)

Comment 6 Zoltan Balogh 2008-12-01 16:07:11 UTC
Created attachment 325244 [details]
iwlist wlan0 scan from library

I have done some further research into the nature of this defect. I found out that I never get a kernel panic when connecting to the External VPN from home (I use WiFi at home).

In addition, I went to the library (see access point information) to see whether all Internal VPN connections make my kernel panic, or just particular ones.

I did a 5-hour session in the library without a single kernel panic. I therefore concluded that there is something unusual about the wireless access points of the building where I usually work, which somehow makes my kernel panic.

I *guess* it's down to an iwl3945 bug and/or a local misconfiguration of the routers.

I'll keep you updated. I'll also send a file with the iwlist wlan0 scan results from the exact location where I get the panics.

Comment 7 Zoltan Balogh 2008-12-01 16:11:55 UTC
Created attachment 325246 [details]
iwlist wlan0 scan with Access Points that DO cause my Kernel Panics

Here are the wireless access points of the location, where my laptop's kernel panics.

I forgot to mention that the network I use throughout the campus has the ESSID "magpie".


As a next step, I'm contacting the relevant authorities at my university to see if they can help me track down the problem. I'll keep you updated.

Comment 8 Zoltan Balogh 2008-12-02 10:52:14 UTC
Created attachment 325355 [details]
ifconfig wlan0 when connected to Access Point that causes Kernel Panic

Forgot to add this yesterday. My kernel panicked a few minutes after I saved this file. I was NOT connected to the VPN, so the defect lies at the level of wireless communication between the access point and my laptop.

Will raise a defect against iwl3945 as well to help problem resolution.

So far I have received no reply from my university's IT support services; however, they have assigned the issue to an engineer.

Comment 9 John W. Linville 2008-12-02 14:25:39 UTC
Can you capture the Oops using netconsole over a wired connection to another box?  Or perhaps you can recreate the crash using a text console?  It is difficult to debug without more information about where the crash is occurring.
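
For the receiving side of such a netconsole setup, something along these lines could run on the second box. This is a rough sketch, not an official tool: netconsole sends kernel messages as plain UDP datagrams, and 6666 is its documented default target port; the helper names `open_listener` and `receive_messages` are made up for this example, and the loopback demo at the end only stands in for the crashing machine.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; 6666 is netconsole's default target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, count, timeout=None):
    """Collect up to `count` datagrams and return them as decoded strings."""
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break  # nothing more arrived within the timeout
        messages.append(data.decode("utf-8", errors="replace"))
    return messages


# Loopback demo: port 0 lets the OS pick a free port, and a second
# socket plays the role of the machine that is sending kernel messages.
listener = open_listener("127.0.0.1", 0)
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"demo oops line\n", listener.getsockname())
demo_messages = receive_messages(listener, 1, timeout=2.0)
print(demo_messages)
sender.close()
listener.close()
```

In real use the listener would be opened on the wired interface with the default port and the messages appended to a file, so the Oops text survives even though the crashing machine cannot write it to disk.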

Comment 10 Zoltan Balogh 2008-12-04 12:45:18 UTC
Created attachment 325670 [details]
Captured Kernel Oops Message via Netconsole

Hello John,

I hope this helps. I'll let the iwl3945 people know as well that I managed to capture the Oops.

Also according to dmesg, my iwl3945 version is the following:

iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, 1.2.26kds

The driver is the default that is packaged with Fedora 10.

Comment 11 Chuyee 2008-12-05 01:22:48 UTC
The first oops looks more like a mac80211 problem (ieee80211_invoke_rx_handlers) to me. The second WARN() looks like netconsole related, you can ignore it. Does the problem happen on upstream?

Comment 12 Zoltan Balogh 2008-12-06 12:43:51 UTC
As I'm not entirely sure what Mr. Zhu means by "upstream", could anyone explain it to me?

I'm going on a holiday next Friday (12th of Dec.) and hence we should get all the trace beforehand, especially as the university might modify the network over the holidays as I reported my problems to them as well.

Comment 13 John W. Linville 2008-12-08 15:14:50 UTC
"Upstream" means an official, unpatched kernel from Linus.  (We push patches "upstream" to him, and they flow from him "downstream" to us.)

Comment 15 Zoltan Balogh 2008-12-11 13:08:29 UTC
Created attachment 326612 [details]
Upstream kernel solved the problem, the issue is Fedora-related.

Hello,

With the kind help of John, I managed to check out the new wireless testing kernel yesterday. Compilation was based on the default Fedora config (make oldconfig) and I left every new configuration setting as default (as John suggested).

Good news is: it works without any problems. :D (iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, 1.2.26kds)

I've attached a text file containing the following information:

uname -a, extract from /var/log/messages, iwlist wlan0 scan, ifconfig, dmesg


More good news: with the help of our Information Systems Services team, we figured out why I get messages about PPTP packets arriving out of sequence.

Basically, you can't set the default MTU for VPN connections from the new NetworkManager, so it uses 1500, whereas the university prefers 1416. This then messes things up, as the connection constantly needs to recover. I'll find out later how to set the MTU manually. I found this bug on Ubuntu's Launchpad already, so I'm not reporting it.
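
The arithmetic behind the two MTU values can be sketched as follows. This is only an illustration under stated assumptions: it assumes the underlying link MTU is also 1500, and `ASSUMED_OVERHEAD` is a hypothetical figure for the encapsulation headers, not a measured value for this PPTP setup. Whether oversized packets are really what triggers the reordering messages is the conclusion reported above, not something this sketch verifies.

```python
LINK_MTU = 1500       # assumed MTU of the underlying link (also NetworkManager's fallback)
PREFERRED_MTU = 1416  # tunnel MTU the university prefers, per the comment above

# Hypothetical encapsulation cost in bytes, for illustration only.
ASSUMED_OVERHEAD = 40


def headroom(tunnel_mtu, link_mtu=LINK_MTU):
    """Bytes left for encapsulation headers when a full-size tunnel packet is sent."""
    return link_mtu - tunnel_mtu


def needs_fragmentation(tunnel_mtu, overhead, link_mtu=LINK_MTU):
    """True if a full-size tunnel packet plus its headers exceeds the link MTU."""
    return tunnel_mtu + overhead > link_mtu


for mtu in (LINK_MTU, PREFERRED_MTU):
    print(mtu, headroom(mtu), needs_fragmentation(mtu, ASSUMED_OVERHEAD))
```

With a tunnel MTU of 1500 there is no room at all for the extra headers, while 1416 leaves 84 bytes, so any overhead up to that size fits into a single packet on the link.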

Thank you very much for all your co-operation and please try to fix this issue in Fedora 10's (or a next release's) default kernel.

Wishing you a Merry Christmas,

Zoltan
PS: Marking this defect as needinfo, so that you'd get back to me quicker. Then I could still extract some info out from the network setup before I go on a holiday tomorrow.

