Description of problem:
frequent kernel panics
Version-Release number of selected component (if applicable):
03:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02)
Steps to Reproduce:
Unfortunately, the kernel doesn't have a chance to log the panic. All I have is a photo of the screen that I took with my mobile.
What I noticed is that it is definitely related to the Wifi. If I switch the Wifi off with the hardware switch, the kernel doesn't panic. If it is switched on, I frequently get a panic within 30 minutes.
I also noticed that the panic only happens when using the Wifi at work, which is PEAP protected. I don't get the panic on an ordinary WPA2 network at home.
Even a photo of the panic would be helpful, as there's not much else to go on.
Created attachment 518364 [details]
Created attachment 518709 [details]
I'm testing PEAP v1 with GTC inner authentication, but have not hit this bug yet; perhaps it only happens under some special radio conditions.
The attached patch should prevent the kernel panic and print a warning and some additional data when the bad condition occurs. Please apply it to the compat-wireless package and provide the dmesg output when it generates a call trace. Instructions hint:
$ wget http://wireless.kernel.org/download/compat-wireless-2.6/compat-wireless-2011-08-08.tar.bz2
$ tar xfj compat-wireless-2011-08-08.tar.bz2
$ cd compat-wireless-2011-08-08
$ patch -p1 < ~/iwl3945-rs-debug.patch
and install the modules according to the README file.
./scripts/gen-compat-autoconf.sh config.mk > include/linux/compat_autoconf.h
make -C /lib/modules/2.6.40-4.fc15.i686.PAE/build M=/home/jan/Downloads/compat-wireless-2011-08-08 modules
make: Entering directory `/usr/src/kernels/2.6.40-4.fc15.i686.PAE'
CC [M] /home/jan/Downloads/compat-wireless-2011-08-08/compat/main.o
LD [M] /home/jan/Downloads/compat-wireless-2011-08-08/compat/compat.o
CC [M] /home/jan/Downloads/compat-wireless-2011-08-08/drivers/bcma/main.o
In file included from include/linux/pci.h:43:0,
include/linux/mod_devicetable.h:386:8: error: redefinition of ‘struct bcma_device_id’
/home/jan/Downloads/compat-wireless-2011-08-08/include/linux/compat-3.0.h:32:8: note: originally defined here
make: *** [/home/jan/Downloads/compat-wireless-2011-08-08/drivers/bcma/main.o] Error 1
make: *** [/home/jan/Downloads/compat-wireless-2011-08-08/drivers/bcma] Error 2
make: *** [_module_/home/jan/Downloads/compat-wireless-2011-08-08] Error 2
make: Leaving directory `/usr/src/kernels/2.6.40-4.fc15.i686.PAE'
make: *** [modules] Error 2
Hmm, I'm sorry, I will look at this error in more detail on Monday. In the meantime, could you try to compile on an older F-15 kernel?
I used the compat-wireless 3.0 stable release and no kernel panic occurs. I can connect successfully to the PEAP wifi and it has been stable for hours. I will try to compile on an older F-15 kernel over the weekend. I'll keep you posted.
Created attachment 519273 [details]
Compilation fix for compat-wireless-2011-08-08 on 2.6.40
(In reply to comment #6)
> I used the compat-wireless 3.0 stable release and no kernel panic occurs.
Did you use unpatched compat-wireless-3.0? If so, that would be confusing, since 2.6.40 and 3.0 use the same code (Fedora renamed 3.0 to 2.6.40 so as not to break applications that expect a kernel version of the form 2.x.y). If you patched compat-wireless-3.0, it should generate warnings and some other prints in dmesg; please attach them here.
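The renaming described above can be illustrated with a small sketch. It assumes only the one mapping stated in this thread (Fedora's 2.6.40 is upstream 3.0); the function name fedora_to_upstream is hypothetical and not part of any Fedora tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is an assumption): print the upstream base version
# for a kernel release string. Only the 2.6.40 -> 3.0 renaming mentioned in
# this thread is encoded; any other release is printed without its package suffix.
fedora_to_upstream() {
    local release="$1"
    case "$release" in
        2.6.40|2.6.40[.-]*) echo "3.0" ;;          # Fedora's renamed 3.0 kernel
        *)                  echo "${release%%-*}" ;; # strip "-<pkgrelease>.<dist>.<arch>"
    esac
}
```

For example, `fedora_to_upstream "$(uname -r)"` on the kernel from the build log would report the 3.0 code base, which is why a panic seen on 2.6.40 would also be expected with an unpatched compat-wireless-3.0.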
Created attachment 519288 [details]
output of dmesg after applying the patch
I applied the provided patch to the compat-wireless 3.0-2 stable release.
I'm not seeing the direct reason for the oops yet. I suspect the mac layer may be giving an invalid band to the iwl3945 rate scaling code, but I'm not sure. I think I will prepare a modified patch which prints more data. But first, could you show the output of "iwconfig wlan0" while associated? It will print the frequency in use.
wlan0 IEEE 802.11abg ESSID:"eduroam"
Mode:Managed Frequency:2.462 GHz Access Point: 00:0B:85:84:42:DD
Bit Rate=48 Mb/s Tx-Power=15 dBm
Retry long limit:7 RTS thr:off Fragment thr:off
Link Quality=59/70 Signal level=-51 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0
I won't be able to get you more data after Tuesday, 30.08., as I'm changing jobs. I won't have access to this computer and network any longer. Please provide me with any patches for getting more data before then.
Created attachment 519631 [details]
Thanks for the info, Jan. I hope we will find a proper fix before the end of the month. If not, we will just apply the workaround we already have.
This patch prints some more data; hopefully that will be enough to find the true reason for the oops.
Created attachment 519648 [details]
log of the kernel panic
after applying the second patch to compat-wireless-3.0-2
Created attachment 520059 [details]
I'm not sure what the detailed reason for the iwl3945 rate scaling code malfunction is. To figure this out I need more detailed knowledge of the 3945 rate scaling algorithm. Let's fix the kernel panic for now. This patch, like the previous one, should fix the panic and print a warning instead when the bad condition occurs. After your confirmation, I'll post it officially and it will be applied in the kernel. Thanks.
I used the latest patch. The system is stable and reports warnings.
You can close this bug.
Thanks for the fast fix!
Posted for now; it will be applied to the Fedora kernel through the upstream -stable tree.
I have been suffering from this bug since this morning; everything was working fine until last night, even when using the PAE kernel.
It seems that only the PAE kernel is affected.
It is not clear to me which kernel the fix will be included in...
The patch is currently in the wireless-testing tree. It should be available in the Fedora kernel in a week or two.
I also hit the same bug with the standard kernel.
It will be applied to the upstream kernel even before the patch lands in Fedora. Patches propagate through different kernel trees: wireless-testing -> net -> linux -> stable -> fedora.
That's the process; I wish it could be faster for kernel panic fixes. But well, you have a patch, so you can always apply it yourself on a vanilla kernel or on compat-wireless.
Just for information, it also happened this morning after four days of operation:
Can someone explain to this newbie why this bug appears randomly, and generally after a short time of operation after start-up?
Thanks for the help
I don't know why; bugs have different natures and generally there are no rules with them. Antonio, are you sure you are hitting this particular bug, i.e. is there rate_control_get_rate in the call trace?
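One way to answer that question from a saved log is to search it for the symbol. This is only a sketch: the function name trace_mentions is hypothetical, and it assumes the call trace was captured as text (for example dmesg output or a typed-up copy of the panic photo).

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is an assumption): read log text on stdin and
# succeed if it mentions the given symbol anywhere.
trace_mentions() {
    local symbol="$1"
    # -F: treat the symbol as a fixed string, -q: no output, exit status only
    grep -q -F -- "$symbol"
}
```

Usage could look like `dmesg | trace_mentions rate_control_get_rate && echo "looks like this bug"`. A match only shows that the function appears in the log; it does not prove the panic has the same cause.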
I am quite sure I am hitting this bug; in particular, I get messages similar to what is shown in the panic screenshot in Jan's attachment.
To fully confirm it, I am waiting for a new kernel panic to compare with it.
I hope to stress my system and cause such a panic ;-) but now it seems to be stable
Perhaps you should rather try to compile compat-wireless with the patch from comment 15, install it, and see if the kernel panics. See comment 3 for how to get compat-wireless; you will also need the patch from comment 7.
yum groupinstall "Development tools"
yum install kernel-devel
should be enough to install all tools needed for compilation.
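Before compiling, it may help to confirm that kernel-devel matches the running kernel, since the build log earlier in this report compiles against /lib/modules/<release>/build. A minimal sketch, with hypothetical function names:

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are assumptions). The path layout follows the
# "make -C /lib/modules/<release>/build" line in the build log above.

# Print the build directory for a kernel release (default: the running kernel).
kernel_build_dir() {
    local release="${1:-$(uname -r)}"
    printf '/lib/modules/%s/build\n' "$release"
}

# Succeed if that build directory exists, i.e. a matching kernel-devel
# package appears to be installed.
have_kernel_headers() {
    [ -d "$(kernel_build_dir "$1")" ]
}
```

For example, `have_kernel_headers || echo "no build tree for $(uname -r)"` would point to a kernel-devel package that does not match the booted kernel, which commonly happens after a kernel update without a reboot.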
I installed the updated drivers according to the previous comments and I have to report a crash during shut-down of my laptop.
I had no camera to get a copy of the screen as I was at work, but I remember some lines related to iwl3945, legacy or the like.
Hope a new crash can be reported shortly
Do you have any news about updated kernel???
Created attachment 523411 [details]
My Dmesg file after applying patches to wireless-compat
(In reply to comment #27)
> Hope a new crash can be reported shortly
If you do not find another way to capture the logs, you may try kdump.
> Do you have any news about updated kernel???
kernel.org is down, so I don't know whether the fix was applied or not.
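Regarding the kdump suggestion above: kdump can only capture a panic if a crash kernel has been loaded beforehand. A rough sketch of a check, assuming the kernel exposes /sys/kernel/kexec_crash_loaded (present on kexec-enabled kernels); the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is an assumption): succeed if the flag file says a
# crash kernel is loaded. The optional path argument exists only to make the
# function easy to try against a test file.
crash_kernel_loaded() {
    local flag_file="${1:-/sys/kernel/kexec_crash_loaded}"
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}
```

Running `crash_kernel_loaded || echo "kdump is not armed"` before trying to reproduce the panic avoids waiting for a crash that would not be recorded.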
using 220.127.116.11-0.fc15.i686.PAE : since then not experiencing any oops (no patches applied)
waiting for final confirmation