Bug 688252 - iwl3945-related kernel crash
Summary: iwl3945-related kernel crash
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 691585 698815 699447
Depends On:
Blocks:
Reported: 2011-03-16 16:38 UTC by Dan Winship
Modified: 2011-05-25 14:16 UTC (History)
19 users (show)

Fixed In Version: kernel-2.6.38.4-20.fc15
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-05-25 14:16:09 UTC
Type: ---


Attachments
register dump from crash (405.97 KB, image/png)
2011-03-16 16:39 UTC, Dan Winship
backtrace from crash (411.09 KB, image/png)
2011-03-16 16:39 UTC, Dan Winship
Screenshot from the crash (781.89 KB, image/jpeg)
2011-04-22 21:38 UTC, Christian Jann

Description Dan Winship 2011-03-16 16:38:23 UTC
Every now and then (maybe every 5-10 hours of use?) my laptop crashes. See attached. This is probably related to bug 683571.

Comment 1 Dan Winship 2011-03-16 16:39:09 UTC
Created attachment 485788
register dump from crash

Comment 2 Dan Winship 2011-03-16 16:39:33 UTC
Created attachment 485789
backtrace from crash

Comment 3 Dan Winship 2011-03-16 16:40:14 UTC
Forgot to mention, this is kernel-2.6.38-0.rc8.git0.1.fc15

Comment 4 Stanislaw Gruszka 2011-03-16 20:09:04 UTC
Does the disable_hw_scan=1 module option help?

Comment 5 Dan Winship 2011-03-16 20:17:37 UTC
I know very little about kernel stuff. Where do I put that option?

Comment 6 Stanislaw Gruszka 2011-03-17 07:36:08 UTC
The following should do the job (run as root):
# echo "options iwl3945 disable_hw_scan=1" >> /etc/modprobe.d/iwlwifi.conf
# rmmod iwl3945 iwlcore
# modprobe iwl3945

You can check that the option has the correct value with:
$ cat /sys/module/iwl3945/parameters/disable_hw_scan
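
Because `>>` appends on every run, repeating the echo command leaves duplicate lines in the file. A minimal sketch of an idempotent variant follows. The function name `add_module_option` is hypothetical and not part of the advice above, and the conf file path is a parameter so the helper can be tried on a scratch file first.

```shell
# Hypothetical helper (an assumption, not from the comment above):
# append an "options <module> <setting>" line to a modprobe.d file
# only if that exact line is not already present.
add_module_option() {
    conf=$1
    module=$2
    setting=$3
    line="options $module $setting"
    # -q: quiet, -x: match the whole line, -F: fixed string, not a regex
    if ! grep -qxF -- "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}
```

Run as root, `add_module_option /etc/modprobe.d/iwlwifi.conf iwl3945 disable_hw_scan=1` would then replace the echo line, followed by the same rmmod and modprobe steps.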

Comment 7 Stanislaw Gruszka 2011-03-18 07:34:33 UTC
Does the option stop the microcode error from bug 683571?

Comment 8 Dan Winship 2011-03-18 13:08:35 UTC
It definitely stops the error from bug 683571. I haven't gotten another crash yet either, so it probably fixes/works around that too.

(I am still seeing lots of:

    [36356.287070] iwl3945 0000:03:00.0: Failed to get channel info for channel 165 [0]

for various channel numbers, which also seem to be tied to NM scans. Not sure if that indicates a driver bug, an NM bug, or just overzealous logging.)
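
To see which channels these messages mention and how often, the kernel log can be summarised. This is only a sketch: the function name `channel_failures` is made up for illustration, and it assumes the message format shown above.

```shell
# Hypothetical helper: read kernel log text on stdin and print
# "<count> <channel>" for every channel named in a
# "Failed to get channel info" message, most frequent first.
channel_failures() {
    sed -n 's/.*Failed to get channel info for channel \([0-9][0-9]*\).*/\1/p' |
        sort -n | uniq -c | sort -k1,1nr -k2,2n | awk '{print $1, $2}'
}
```

Typical use would be `dmesg | channel_failures`, or the same with `/var/log/messages` as input.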

Comment 9 Stanislaw Gruszka 2011-03-18 13:38:25 UTC
"Failed to get channel info" is a known software scanning problem, already fixed upstream by:

http://git.kernel.org/linus/f844a709a7d8f8be61a571afc31dfaca9e779621

I need to post it to stable ...

I think you are not able to patch the kernel yourself. To test the driver with the patch, you can alternatively install bleeding-edge compat-wireless. For released F-14 and F-13 kernels, I have compat-wireless-next binary packages at:
http://people.redhat.com/sgruszka/compact_wireless.html
On other kernels you need to build from source:
http://wireless.kernel.org/en/users/Download
http://wireless.kernel.org/download/compat-wireless-2.6/

Comment 10 Zdenek Kabelac 2011-03-27 20:41:55 UTC
I'm seeing the very same problem - and same oops on my T61.

Some updated package must be behind this. I even tried a very old 2.6.35 kernel that I keep around, and it now gives the same error.

I tried the 'disable_hw_scan' hint, but I had to compile Linus's vanilla 2.6.39-pre-rc kernel to get my wifi usable.

So for now, the only way to have usable wifi is to use the quite unstable 2.6.39 kernel.

Comment 11 Zdenek Kabelac 2011-03-27 23:57:25 UTC
In fact, I've run a simple valgrind session, and the reason seems quite obvious and IMHO easy to fix. As I don't have a devel tree for compilation, I'm leaving it to the developers.

Here is the important part of the valgrind log for this BZ:

Invalid read of size 4
   at 0x80389CB: DrawableGone (glxext.c:131)
   by 0x44C564: FreeClientResources (resource.c:854)
   by 0x42DFFD: CloseDownClient (dispatch.c:3461)
   by 0x45EA14: EstablishNewConnections (connection.c:838)
   by 0x432DA1: ProcessWorkQueue (dixutils.c:527)
   by 0x45B5D1: WaitForSomething (WaitFor.c:173)
   by 0x42E8D9: Dispatch (dispatch.c:367)
   by 0x422DC9: main (main.c:287)
 Address 0xcbac2d4 is 4 bytes inside a block of size 96 free'd
   at 0x4C2756E: free (vg_replace_malloc.c:366)
   by 0x910E1C5: fbDestroyPixmap (fbpixmap.c:104)
   by 0x8AA287E: intel_uxa_destroy_pixmap (intel_uxa.c:1092)
   by 0x4DD9C0: damageDestroyPixmap (damage.c:1696)
   by 0x79CB094: XvDestroyPixmap (xvmain.c:389)
   by 0x4B8738: ShmDestroyPixmap (shm.c:276)
   by 0x4A6250: compCheckRedirect (compwindow.c:168)
   by 0x4A64D0: compUnrealizeWindow (compwindow.c:274)
   by 0x452527: UnrealizeTree (window.c:2807)
   by 0x455356: UnmapWindow (window.c:2865)
   by 0x45540C: DeleteWindow (window.c:909)
   by 0x44BEB7: FreeResource (resource.c:596)

Invalid read of size 8
   at 0x80389C4: DrawableGone (glxext.c:131)
   by 0x44C564: FreeClientResources (resource.c:854)
   by 0x44C617: FreeAllResources (resource.c:871)
   by 0x422DFB: main (main.c:301)
 Address 0xf6d6438 is 40 bytes inside a block of size 208 free'd
   at 0x4C2756E: free (vg_replace_malloc.c:366)
   by 0x8038A80: DrawableGone (glxext.c:171)
   by 0x44C564: FreeClientResources (resource.c:854)
   by 0x42DFFD: CloseDownClient (dispatch.c:3461)
   by 0x45EA14: EstablishNewConnections (connection.c:838)
   by 0x432DA1: ProcessWorkQueue (dixutils.c:527)
   by 0x45B5D1: WaitForSomething (WaitFor.c:173)
   by 0x42E8D9: Dispatch (dispatch.c:367)
   by 0x422DC9: main (main.c:287)

Invalid read of size 4
   at 0x80389C8: DrawableGone (glxext.c:131)
   by 0x44C564: FreeClientResources (resource.c:854)
   by 0x44C617: FreeAllResources (resource.c:871)
   by 0x422DFB: main (main.c:301)
 Address 0xf6d6440 is 48 bytes inside a block of size 208 free'd
   at 0x4C2756E: free (vg_replace_malloc.c:366)
   by 0x8038A80: DrawableGone (glxext.c:171)
   by 0x44C564: FreeClientResources (resource.c:854)
   by 0x42DFFD: CloseDownClient (dispatch.c:3461)
   by 0x45EA14: EstablishNewConnections (connection.c:838)
   by 0x432DA1: ProcessWorkQueue (dixutils.c:527)
   by 0x45B5D1: WaitForSomething (WaitFor.c:173)
   by 0x42E8D9: Dispatch (dispatch.c:367)
   by 0x422DC9: main (main.c:287)


Process terminating with default action of signal 11 (SIGSEGV): dumping core
 General Protection Fault
   at 0x80389CB: DrawableGone (glxext.c:131)
   by 0x44C564: FreeClientResources (resource.c:854)
   by 0x44C617: FreeAllResources (resource.c:871)
   by 0x422DFB: main (main.c:301)



It looks like the code tries to destroy already-freed objects multiple times,
and several fixes seem obvious:

In glxext.c, line 131:

    if (glxPriv->drawId != glxPriv->pDraw->id) {

needs to check that 'pDraw' != NULL before dereferencing it.

Doing 'free(this)' in dix/resource.c:855 on object release
as well as glxPriv->destroy(glxPriv) in DrawableGone():171
cannot really work well. One of them should probably be removed
(presumably the one in resource.c, though it looks tricky).

As I do not have inside knowledge, I am leaving this to the developers...

Comment 12 Zdenek Kabelac 2011-03-27 23:58:43 UTC
Oops, sorry, wrong bugzilla. This should be in bug 674464.

Comment 13 Stanislaw Gruszka 2011-03-28 07:11:31 UTC
(In reply to comment #10)
> I'm seeing the very same problem - and same oops on my T61.
> 
> Some updated package must be behind this. I've tried even very old kernel
> 2.6.35 I keep on my this - and it gives now same error.
NetworkManager
https://bugzilla.gnome.org/show_bug.cgi?id=644551

> I  tried  'disable_hw_scan' hint - however I had to compile linus vanilla
> kernel 2.6.39-pre-rc kernel to get my wifi usable.

So both disable_hw_scan and the update to 2.6.39 are needed to make the problem go away, right? Did you try to apply f844a709a7d8f8be61a571afc31dfaca9e779621 on 2.6.38? I wonder if that fix is enough or whether we need some more patches from 2.6.39.

Comment 14 Stanislaw Gruszka 2011-04-22 12:13:41 UTC
*** Bug 691585 has been marked as a duplicate of this bug. ***

Comment 15 Stanislaw Gruszka 2011-04-22 12:13:50 UTC
*** Bug 698815 has been marked as a duplicate of this bug. ***

Comment 16 Fedora Update System 2011-04-22 15:55:47 UTC
kernel-2.6.38.3-18.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.38.3-18.fc15

Comment 17 Christian Jann 2011-04-22 21:38:37 UTC
Created attachment 494317
Screenshot from the crash

I've installed these new kernel packages (https://admin.fedoraproject.org/updates/kernel-2.6.38.3-18.fc15) but I still can't get a wlan connection via NetworkManager.

[chris@linuxbox ~]$ uname -r
2.6.38.3-18.fc15.x86_64
[chris@linuxbox ~]$ cat /var/log/messages  |grep iwl |tail -n 20
Apr 22 22:30:59 linuxbox kernel: [  958.671084] iwl3945 0000:0b:00.0: Failed to get channel info for channel 140 [0]
Apr 22 22:31:59 linuxbox kernel: [ 1018.669055] iwl3945 0000:0b:00.0: Failed to get channel info for channel 140 [0]
Apr 22 22:32:59 linuxbox kernel: [ 1078.676182] iwl3945 0000:0b:00.0: Failed to get channel info for channel 140 [0]
Apr 22 22:33:59 linuxbox kernel: [ 1138.680049] iwl3945 0000:0b:00.0: Failed to get channel info for channel 140 [0]
Apr 22 22:35:02 linuxbox kernel: [   22.078580] iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, in-tree:ds
Apr 22 22:35:02 linuxbox kernel: [   22.078584] iwl3945: Copyright(c) 2003-2010 Intel Corporation
Apr 22 22:35:02 linuxbox kernel: [   22.078741] iwl3945 0000:0b:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Apr 22 22:35:02 linuxbox kernel: [   22.132250] iwl3945 0000:0b:00.0: Tunable channels: 13 802.11bg, 23 802.11a channels
Apr 22 22:35:02 linuxbox kernel: [   22.132255] iwl3945 0000:0b:00.0: Detected Intel Wireless WiFi Link 3945ABG
Apr 22 22:35:05 linuxbox NetworkManager[882]: <info> (wlan0): new 802.11 WiFi device (driver: 'iwl3945' ifindex: 3)
Apr 22 22:35:05 linuxbox kernel: [   29.883851] iwl3945 0000:0b:00.0: loaded firmware version 15.32.2.9
Apr 22 22:35:05 linuxbox kernel: [   29.951270] iwl3945 0000:0b:00.0: Error setting Tx power (-5).
Apr 22 22:52:36 linuxbox kernel: [   22.353413] iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, in-tree:ds
Apr 22 22:52:36 linuxbox kernel: [   22.353417] iwl3945: Copyright(c) 2003-2010 Intel Corporation
Apr 22 22:52:36 linuxbox kernel: [   22.353568] iwl3945 0000:0b:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Apr 22 22:52:36 linuxbox kernel: [   22.407136] iwl3945 0000:0b:00.0: Tunable channels: 13 802.11bg, 23 802.11a channels
Apr 22 22:52:36 linuxbox kernel: [   22.407141] iwl3945 0000:0b:00.0: Detected Intel Wireless WiFi Link 3945ABG
Apr 22 22:52:40 linuxbox NetworkManager[928]: <info> (wlan0): new 802.11 WiFi device (driver: 'iwl3945' ifindex: 4)
Apr 22 22:52:40 linuxbox kernel: [   30.472493] iwl3945 0000:0b:00.0: loaded firmware version 15.32.2.9
Apr 22 22:52:40 linuxbox kernel: [   30.539113] iwl3945 0000:0b:00.0: Error setting Tx power (-5).

And I get a kernel crash after running these commands (Screenshot from the crash attached).
ifconfig wlan0 down
iwconfig wlan0 channel 11
iwconfig wlan0 mode ad-hoc
iwconfig wlan0 essid some_network
iwconfig wlan0 key 01234567890123456789012345   
ifconfig wlan0 up
ifconfig wlan0 192.168.0.88
route add -net default gw 192.168.0.1 wlan0

Comment 18 Fedora Update System 2011-04-23 01:14:49 UTC
Package kernel-2.6.38.3-18.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-2.6.38.3-18.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/kernel-2.6.38.3-18.fc15
then log in and leave karma (feedback).

Comment 19 Stan King 2011-04-24 23:09:47 UTC
Although I'm not the originator of this bug report, I'm getting a similar iwl3945 error, closer to what is reported in https://bugzilla.redhat.com/show_bug.cgi?id=683571. After updating to kernel-2.6.38.3-18.fc15, I still get the following, for whatever it's worth. I've trimmed the left side to help it fit here.

Should I report this in that other report as well?


 [   28.872724] iwl3945 0000:0c:00.0: Microcode SW error detected. Restarting 0x82000008.
 [   28.876310] iwl3945 0000:0c:00.0: Loaded firmware version: 15.32.2.9
 [   28.879943] iwl3945 0000:0c:00.0: Start IWL Error Log Dump:
 [   28.883524] iwl3945 0000:0c:00.0: Status: 0x0002A2E4, count: 1
 [   28.887099] iwl3945 0000:0c:00.0: Desc       Time       asrtPC  blink2 ilink1  nmiPC   Line
 [   28.890932] iwl3945 0000:0c:00.0: SYSASSERT     (0x5) 0000000552 0x008B6 0x13756 0x00320 0x00000 764
 [   28.890934] 
 [   28.898305] iwl3945 0000:0c:00.0: Start IWL Event Log Dump: display last 20 count
 [   28.902062] iwl3945 0000:0c:00.0: 0001364725       0x00000001      0462
 [   28.903011] iwl3945 0000:0c:00.0: 0001364863       0x04170010      0401
 [   28.903011] iwl3945 0000:0c:00.0: 0001364920       0x00000175      1005
 [   28.903011] iwl3945 0000:0c:00.0: 0001364921       0x00000000      1001
 [   28.903011] iwl3945 0000:0c:00.0: 0001364922       0x00000000      1002
 [   28.903011] iwl3945 0000:0c:00.0: 0000000012       0x000001b8      1005
 [   28.903011] iwl3945 0000:0c:00.0: 0000000040       0x000000d1      0103
 [   28.903011] iwl3945 0000:0c:00.0: 0000000125       0x04180018      0401
 [   28.903011] iwl3945 0000:0c:00.0: 0000000214       0x04190097      0401
 [   28.903011] iwl3945 0000:0c:00.0: 0000000217       0x00000001      0451
 [   28.903011] iwl3945 0000:0c:00.0: 0000000237       0x00000000      0451
 [   28.903011] iwl3945 0000:0c:00.0: 0000000314       0x041a0047      0401
 [   28.903011] iwl3945 0000:0c:00.0: 0000000390       0x041b0047      0401
 [   28.903011] iwl3945 0000:0c:00.0: 0000000513       0x441c0080      0401
 [   28.903011] iwl3945 0000:0c:00.0: 0000000526       0x0000000d      0452
 [   28.903011] iwl3945 0000:0c:00.0: 0000000531       0x001f3b6e      0450
 [   28.903011] iwl3945 0000:0c:00.0: 0000000537       0x001f3b6e      0450
 [   28.903011] iwl3945 0000:0c:00.0: 0000000542       0x001f3b6e      0450
 [   28.903011] iwl3945 0000:0c:00.0: 0000000547       0x001f3b6e      0450
 [   28.903011] iwl3945 0000:0c:00.0: 0000000553       0x00000100      0125
 [   28.962249] iwl3945 0000:0c:00.0: Error Reply type 0x000002FC cmd REPLY_SCAN_CMD (0x80) seq 0x441C ser 0x00340000
 [   28.969271] iwl3945 0000:0c:00.0: Can't stop Rx DMA.
 [   29.037305] iwl3945 0000:0c:00.0: Error setting Tx power (-5).
 [   29.269626] 802.1Q VLAN Support v1.8 Ben Greear <greearb>
 [   29.272376] All bugs added by David S. Miller <davem>
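
A dump like this names the firmware command that failed near its end (here `REPLY_SCAN_CMD`), which fits the scan-related workaround discussed earlier in this report. As a rough sketch for pulling that name out of a longer log (the function name `failed_fw_commands` is hypothetical, and the pattern assumes the "Error Reply" line format shown above):

```shell
# Hypothetical helper: read kernel log text on stdin and print the
# command name from every iwl3945 "Error Reply" line.
failed_fw_commands() {
    sed -n 's/.*Error Reply type [^ ]* cmd \([A-Za-z0-9_]*\) .*/\1/p'
}
```

For example, `dmesg | failed_fw_commands | sort | uniq -c` would show how often each command appears in error replies.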

Comment 20 Stanislaw Gruszka 2011-04-26 05:48:20 UTC
This problem is not fixed yet ...

Comment 21 Fedora Update System 2011-04-27 02:38:55 UTC
kernel-2.6.38.3-18.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 22 Daniel Mach 2011-04-27 19:18:24 UTC
It doesn't seem to be fixed yet, hit this crash with kernel-2.6.38.3-18.fc15.i686.

Comment 23 Stanislaw Gruszka 2011-04-28 12:54:11 UTC
Fixes posted to stable and fedora mailing lists.
http://lists.fedoraproject.org/pipermail/kernel/2011-April/003081.html

Comment 24 Dan Williams 2011-04-28 22:45:23 UTC
The only thing I can think of with NM that's changed since F14/0.8.4 is that we now use the nl80211 wpa_supplicant driver if we can instead of WEXT.  NM still periodically scans just like it always did, but the nl80211 thing and the usage of wpa_supplicant 0.7.3 is the biggest wifi-related change.  So it's got to be something in the supplicant nl80211 code or in mac80211 itself that's causing the issue, since previously only WEXT was used.

Comment 25 Fedora Update System 2011-04-29 00:55:58 UTC
kernel-2.6.38.4-20.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.38.4-20.fc15

Comment 26 Stan King 2011-04-30 06:18:36 UTC
Stanislaw, the kernel-2.6.38.4-20.fc15 is working for me, now.  My iwl3945abg wireless interface no longer gets the errors I copied above, and it seems to function well.  After loading the firmware, it complains about not being able to set the TX power, but that apparently doesn't keep it from working.

Comment 27 Stanislaw Gruszka 2011-04-30 10:51:16 UTC
*** Bug 699447 has been marked as a duplicate of this bug. ***

Comment 28 Fedora Update System 2011-05-01 03:31:56 UTC
kernel-2.6.38.4-20.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.
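
Before reporting that the problem persists, it can help to confirm that the running kernel is at least the fixed build. The following is a sketch under stated assumptions: it relies on GNU `sort -V` and `sed -E`, the function names are illustrative, and it compares only the leading version-release digits, ignoring the `.fc15` and architecture suffixes.

```shell
# Keep only the leading "version-release" digits, e.g.
# "2.6.38.4-20.fc15.x86_64" -> "2.6.38.4-20".
kernel_core_version() {
    printf '%s\n' "$1" | sed -E 's/^([0-9.]+-[0-9]+).*/\1/'
}

# Succeed if kernel release $1 is at least as new as release $2.
kernel_at_least() {
    have=$(kernel_core_version "$1")
    want=$(kernel_core_version "$2")
    # Version-sort both; if the wanted version sorts first (or they
    # are equal), the running kernel is new enough.
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}
```

A check against the running system would be `kernel_at_least "$(uname -r)" 2.6.38.4-20.fc15 && echo "fixed kernel or newer"`.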

Comment 29 Christian Jann 2011-05-01 09:32:33 UTC
I'm still getting kernel crashes and no wireless connection (with and without options iwl3945 disable_hw_scan=1).
[chris@linuxbox ~]$ uname -r
2.6.38.4-20.fc15.x86_64

[chris@linuxbox ~]$ sudo service NetworkManager stop
[sudo] password for chris: 
Stopping NetworkManager (via systemctl):                   [  OK  ]
[chris@linuxbox ~]$ su -
Password: 
[root@linuxbox ~]#
iwconfig wlan0 channel 11
iwconfig wlan0 mode ad-hoc
iwconfig wlan0 essid some_network
iwconfig wlan0 key 00001111111111111000000000
ifconfig wlan0 up
-->Crash!

On F14 everything works fine.

Comment 30 Stanislaw Gruszka 2011-05-02 06:47:41 UTC
Christian, this is a different problem, can you open a new bug report for it and assign it to me?

Comment 31 Christian Jann 2011-05-02 12:31:20 UTC
OK, managed mode is working fine:

[chris@linuxbox ~]$ iwconfig wlan0
wlan0     IEEE 802.11abg  ESSID:"eduroam"  
          Mode:Managed  Frequency:2.437 GHz  Access Point: 00:16:9D:F4:ED:51   
          Bit Rate=54 Mb/s   Tx-Power=15 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=57/70  Signal level=-53 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:1  Invalid misc:25   Missed beacon:0

Only ad-hoc mode causes problems. Because managed mode is working, I updated all the other packages, and now the boot hangs after "Started Network Time Service". At the moment I don't have time for further testing. I will reinstall F15 when a new ISO is available (hopefully next week, "Compose 'Final' RC").

