Bug 637612

Summary: iwlagn ucode SW error, SYSASSERT and WARNING: at net/wireless/core.c:633 wdev_cleanup_work
Product: [Fedora] Fedora
Component: kernel
Version: 14
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: low
Reporter: James <james>
Assignee: Stanislaw Gruszka <sgruszka>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: aquini, balay, christopherthe1, dougsland, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mathieu-acct, sgruszka, yaroslav.sapozhnik
Doc Type: Bug Fix
Last Closed: 2011-01-24 14:17:57 UTC
Attachments: dmesg output showing the incident (flags: none)

Description James 2010-09-26 21:14:35 UTC
Created attachment 449768: dmesg output showing the incident

Description of problem:
I was using the notebook normally when my AP failed, so I switched to another one. I also ran a few scans using "iw dev wlan0 scan". After that I had no connection, so I checked dmesg and found:


iwlagn 0000:02:00.0: Aborted scan still in progress after 100ms
wlan0: deauthenticating from 00:1f:9f:42:84:bd by local choice (reason=3)
iwlagn 0000:02:00.0: Microcode SW error detected.  Restarting 0x82000000.
iwlagn 0000:02:00.0: Loaded firmware version: 228.61.2.24
iwlagn 0000:02:00.0: Start IWL Error Log Dump:
iwlagn 0000:02:00.0: Status: 0x0002B3E4, count: 5
iwlagn 0000:02:00.0: Desc                               Time       data1      data2      line
iwlagn 0000:02:00.0: SYSASSERT                    (#05) 3934265523 0x00000000 0x00000000 1121


plus a dump (see attached dmesg output). I then reloaded iwlagn, and this appeared:


------------[ cut here ]------------
WARNING: at net/wireless/core.c:633 wdev_cleanup_work+0x52/0xbb [cfg80211]()
Hardware name: M720R
Modules linked in: ppp_mppe ppp_async crc_ccitt ppp_generic slhc aes_x86_64 aes_generic hidp fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat rfcomm sco bridge stp llc bnep l2cap coretemp cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm uinput snd_hda_codec_si3054 arc4 snd_hda_codec_realtek ecb snd_hda_intel iwlagn(-) snd_hda_codec iwlcore mac80211 snd_hwdep snd_seq uvcvideo iTCO_wdt snd_seq_device snd_pcm sdhci_pci snd_timer snd videodev v4l1_compat btusb sdhci mmc_core wmi r8169 mii v4l2_compat_ioctl32 iTCO_vendor_support cfg80211 serio_raw i2c_i801 soundcore bluetooth snd_page_alloc joydev rfkill microcode firewire_ohci firewire_core crc_itu_t uhci_hcd i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Pid: 783, comm: cfg80211 Not tainted 2.6.35.5-rhapsody.fc14.i915pp.noecd-229 #1
Call Trace:
 [<ffffffff81048d36>] warn_slowpath_common+0x80/0x98
 [<ffffffff81048d63>] warn_slowpath_null+0x15/0x17
 [<ffffffffa01ae58e>] wdev_cleanup_work+0x52/0xbb [cfg80211]
 [<ffffffff8105df4d>] worker_thread+0x1d3/0x27a
 [<ffffffffa01ae53c>] ? wdev_cleanup_work+0x0/0xbb [cfg80211]
 [<ffffffff81061dff>] ? autoremove_wake_function+0x0/0x34
 [<ffffffff8143d085>] ? _raw_spin_unlock_irqrestore+0x3c/0x3e
 [<ffffffff8105dd7a>] ? worker_thread+0x0/0x27a
 [<ffffffff8106192f>] kthread+0x7a/0x82
 [<ffffffff8100a9e4>] kernel_thread_helper+0x4/0x10
 [<ffffffff810618b5>] ? kthread+0x0/0x82
 [<ffffffff8100a9e0>] ? kernel_thread_helper+0x0/0x10
---[ end trace cc43b0bac39c24f6 ]---


This was with a kernel built from Fedora 14 sources, kernel-2.6.35.5-30.fc14.

Version-Release number of selected component (if applicable):
kernel-2.6.35.5-30.fc14
iwl4965-firmware-228.61.2.24-2.fc12.noarch

How reproducible:
Unknown.

Comment 1 Stanislaw Gruszka 2010-09-29 18:05:43 UTC
In the past, using the swcrypto=1 module option has helped with "Microcode SW error detected".

To recover from that situation, the iwlagn and iwlcore modules need to be reloaded.

Regarding the warning in wdev_cleanup_work, some work is currently being done in the upstream kernel to prevent it. I'm going to backport it to the F-14 kernel.

Comment 2 James 2010-09-29 18:19:20 UTC
(In reply to comment #1)
> In the past using swcrypto=1 module option helps with "Microcode SW error
> detected".
> 
> To recover from that situation, reloading iwlagn and iwlcore modules are
> needed.

In the case reported, swcrypto=1 was set in /etc/modprobe.d/local.conf. Recovery was, as you said, reloading iwlagn.
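
For reference, the workaround and recovery steps discussed in comments 1 and 2 could be sketched as the shell snippet below. This is only a sketch under assumptions: the unload order (iwlagn before iwlcore) and the example modprobe.d line shown in the comment are assumed rather than taken from this report, and the run and reload_iwl helper names are hypothetical. By default it only prints the commands; set DRY_RUN=0 and run it as root to actually execute them.

```shell
#!/bin/sh
# Sketch: reload the Intel wireless modules after a "Microcode SW error".
#
# Assumed (not confirmed in this report): the swcrypto workaround would be a
# line such as the following in /etc/modprobe.d/local.conf:
#   options iwlagn swcrypto=1

# Dry-run by default so nothing is unloaded by accident.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: unload the driver and its core module, then load again.
reload_iwl() {
    run modprobe -r iwlagn    # assumed order: driver first
    run modprobe -r iwlcore   # then the shared core module
    run modprobe iwlagn       # modprobe loads dependencies as needed
}

reload_iwl
```

Running the script as-is prints the three modprobe commands without touching the loaded modules.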

Comment 3 Christopher 2011-01-06 06:47:46 UTC
I'm seeing the same error on my laptop (Lenovo W500) using F14 kernel 2.6.35.10-74.fc14.x86_64.

The swcrypto=1 workaround had no effect.

Comment 4 Yaroslav 2011-01-15 19:46:27 UTC
The same error on Lenovo T500 using F14. Kernel is 2.6.35.10-74.fc14.x86_64.

Comment 5 Stanislaw Gruszka 2011-01-24 14:12:11 UTC
The wdev_cleanup_work problem is solved in 2.6.37. You can get the upstream wireless drivers through compat-wireless: http://people.redhat.com/sgruszka/compat_wireless.html

Regarding the microcode error, this is generally a firmware problem (or the driver does something that kills the firmware). It can only be fixed by Intel. If you still see it with the upstream driver, report the problem to http://bugzilla.intellinuxwireless.org/ as described in http://intellinuxwireless.org/?n=fw_error_report
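
When collecting material for such a report, the firmware-error lines can be filtered out of the kernel log. The snippet below is a minimal sketch: the extract_iwl_errors name is hypothetical, and the patterns are taken only from the log excerpt in this bug's description, so the upstream reporting instructions may well ask for more than these lines.

```shell
#!/bin/sh
# Sketch: keep only the iwlagn firmware-error lines from kernel log text.
# The patterns come from the excerpt in this report; other firmware or driver
# versions may word these messages differently.

# Hypothetical helper: reads log text on stdin, prints matching lines.
extract_iwl_errors() {
    grep -E 'iwlagn .*(Microcode SW error|Loaded firmware version|SYSASSERT)'
}

# Typical use on the affected machine:
#   dmesg | extract_iwl_errors > iwl-fw-error.txt
```

The full dmesg output is still worth attaching, since the filtered lines drop the rest of the error log dump.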

Comment 6 Stanislaw Gruszka 2011-01-24 14:17:57 UTC
(In reply to comment #1)
> Regarding warning in wdev_cleanup_work, some work is currently done in upstream
> kernel to prevent that. I'm going to backport it to F-14 kernel.

I will not backport it. There were a lot of changes in the iwlwifi driver between 2.6.35 and 2.6.37, which makes a backport very hard.

You can use compat-wireless. I'm working on providing those packages on a regular basis via a yum repo.

Closing as WONTFIX.