Description of problem:
I'm still seeing a lot of kernel oopses on my Fedora 13 machine with the latest kernel. If I turn on wireless and wait less than 30 minutes, I'm pretty much guaranteed to get a crash.

Linux loso 2.6.33.3-85.fc13.x86_64 #1 SMP Thu May 6 18:09:49 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

WARNING: at drivers/net/wireless/iwlwifi/iwl-scan.c:658 iwl_fill_probe_req+0x75/0x99 [iwlcore]()
Hardware name: VGN-SZ691N
Modules linked in: snd_seq_dummy vboxnetadp vboxnetflt vboxdrv aes_x86_64 aes_generic fuse rfcomm sco bridge stp llc bnep l2cap autofs4 coretemp sunrpc cpufreq_ondemand acpi_cpufreq freq_table nf_conntrack_ipv6 ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables ipv6 uinput nvidia(P) snd_hda_codec_idt snd_hda_intel arc4 snd_hda_codec ecb snd_hwdep uvcvideo snd_seq iwlagn snd_seq_device iwlcore sony_laptop videodev snd_pcm btusb v4l1_compat snd_timer v4l2_compat_ioctl32 bluetooth mac80211 iTCO_wdt tifm_7xx1 snd iTCO_vendor_support tifm_core i2c_i801 joydev cfg80211 soundcore snd_page_alloc rfkill sky2 microcode usb_storage firewire_ohci firewire_core crc_itu_t yenta_socket rsrc_nonstatic nouveau ttm drm_kms_helper drm i2c_algo_bit video output i2c_core [last unloaded: vboxdrv]
Pid: 880, comm: iwlagn Tainted: P W 2.6.33.3-85.fc13.x86_64 #1
Call Trace:
 [<ffffffff8104b558>] warn_slowpath_common+0x77/0x8f
 [<ffffffff8104b57f>] warn_slowpath_null+0xf/0x11
 [<ffffffffa0239690>] iwl_fill_probe_req+0x75/0x99 [iwlcore]
 [<ffffffffa023a721>] iwl_bg_request_scan+0x97a/0x1081 [iwlcore]
 [<ffffffffa02227aa>] ? iwl_set_tx_power+0xe2/0x11d [iwlcore]
 [<ffffffff81060d3d>] worker_thread+0x1a4/0x232
 [<ffffffffa0239da7>] ? iwl_bg_request_scan+0x0/0x1081 [iwlcore]
 [<ffffffff81064817>] ? autoremove_wake_function+0x0/0x34
 [<ffffffff81060b99>] ? worker_thread+0x0/0x232
 [<ffffffff810643c7>] kthread+0x7a/0x82
 [<ffffffff8100a924>] kernel_thread_helper+0x4/0x10
 [<ffffffff8106434d>] ? kthread+0x0/0x82
 [<ffffffff8100a920>] ? kernel_thread_helper+0x0/0x10

Here is another report I got:

general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:38/PNP0C09:00/PNP0C0A:00/power_supply/BAT1/energy_full
CPU 1
Pid: 884, comm: iwlagn Tainted: P 2.6.33.3-85.fc13.x86_64 #1 VAIO /VGN-SZ691N
RIP: 0010:[<ffffffffa0220a12>] [<ffffffffa0220a12>] iwl_bg_request_scan+0xc6b/0x1081 [iwlcore]
RSP: 0018:ffff880139885d30 EFLAGS: 00010293
RAX: ffff8800afed4400 RBX: ffff8801381b92c0 RCX: 00000000000000c0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000c1
RBP: ffff880139885e30 R08: ffff8801386f1b16 R09: 00000000ffffffff
R10: 000000008ce81300 R11: 0000000000000000 R12: ffff8801381b11e0
R13: 0074006500440065 R14: ffff8801386f1b16 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880005900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fef0b531000 CR3: 0000000001a3b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process iwlagn (pid: 884, threadinfo ffff880139884000, task ffff880139970000)
Stack:
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000200 000000000000000a 0000000000000035
 ffff8801386f1800 000000c1000000c0 ffff880139885fd8 0000000000000000
Call Trace:
 [<ffffffff81060d3d>] worker_thread+0x1a4/0x232
 [<ffffffffa021fda7>] ? iwl_bg_request_scan+0x0/0x1081 [iwlcore]
 [<ffffffff81064817>] ? autoremove_wake_function+0x0/0x34
 [<ffffffff81060b99>] ? worker_thread+0x0/0x232
 [<ffffffff810643c7>] kthread+0x7a/0x82
 [<ffffffff8100a924>] kernel_thread_helper+0x4/0x10
 [<ffffffff8106434d>] ? kthread+0x0/0x82
 [<ffffffff8100a920>] ? kernel_thread_helper+0x0/0x10
Code: 65 48 8b 04 25 08 cc 00 00 89 8d 48 ff ff ff 48 89 85 68 ff ff ff 48 89 85 50 ff ff ff e9 36 02 00 00 48 63 55 a0 4c 8b 6c d0 38 <45> 39 7d 00 0f 85 20 02 00 00 41 0f b7 7d 04 e8 a2 fb ef ff 44
RIP [<ffffffffa0220a12>] iwl_bg_request_scan+0xc6b/0x1081 [iwlcore]
 RSP <ffff880139885d30>

Both seem to be in the iwl_bg_request_scan function. I've noticed on some other threads that using a Cisco router seems to be triggering the bug. I am using a Cisco router and haven't had the crash when I've been using other modems.

http://www.gossamer-threads.com/lists/linux/kernel/1221699

Previously this worked perfectly on Fedora 10, 11 and 12.

Version-Release number of selected component (if applicable):
Fedora 13
Kernel: 2.6.33.3-85.fc13.x86_64

How reproducible:
Reboot the machine and wait 10-15 minutes while using a Cisco router. I've never seen the crash on other wireless routers with the same computer.

Steps to Reproduce:
1. Reboot
2. Use internet (light or heavy usage, it doesn't matter)
3. Network stops
4. Disconnect network in NetworkManager
5. Reconnect network in NetworkManager
6. Internet comes back for a while
7. Kernel oops report generated
8. A few minutes later it hangs, or goes through steps 3-6 and then hangs (caps lock and num lock flashing together and completely unresponsive).

Actual results:
Kernel oops / hang

Expected results:
Internet should work

Additional info:
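For anyone who wants to double-check the observation that both reports point at the same function, here is a rough Python sketch that pulls the iwlcore symbols out of each trace and intersects them. The helper name `module_symbols` and the regular expression are my own and not part of any kernel tooling; the sample lines are copied from the two reports above.

```python
import re

# Matches trace entries such as "iwl_bg_request_scan+0x97a/0x1081 [iwlcore]".
# Entries without a trailing "[module]" (core kernel symbols) are ignored.
ENTRY = re.compile(r"(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")


def module_symbols(trace_text, module):
    """Return the set of symbol names in trace_text that belong to module."""
    return {sym for sym, mod in ENTRY.findall(trace_text) if mod == module}


# Excerpt from the WARNING report.
warning_trace = """
[<ffffffffa0239690>] iwl_fill_probe_req+0x75/0x99 [iwlcore]
[<ffffffffa023a721>] iwl_bg_request_scan+0x97a/0x1081 [iwlcore]
[<ffffffffa02227aa>] ? iwl_set_tx_power+0xe2/0x11d [iwlcore]
[<ffffffff81060d3d>] worker_thread+0x1a4/0x232
"""

# Excerpt from the general protection fault report.
gpf_trace = """
RIP: 0010:[<ffffffffa0220a12>] [<ffffffffa0220a12>] iwl_bg_request_scan+0xc6b/0x1081 [iwlcore]
[<ffffffff81060d3d>] worker_thread+0x1a4/0x232
[<ffffffffa021fda7>] ? iwl_bg_request_scan+0x0/0x1081 [iwlcore]
"""

common = module_symbols(warning_trace, "iwlcore") & module_symbols(gpf_trace, "iwlcore")
print(sorted(common))
```

The only iwlcore symbol the two excerpts share is iwl_bg_request_scan, which matches what was noted by eye. Entries marked with "?" are counted too, so treat the result as a hint rather than a precise backtrace.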
iwl_fill_probe_req:

        if (WARN_ON(left < ie_len))
                return len;

But still, it is a WARN_ON -- it shouldn't hang the box. Maybe it is indicative of some other failure? Hopefully the Intel team can shed some light?
What hardware is this? This kernel seems significantly different from 2.6.33.3. Could you please guide me on how to obtain the sources of this kernel? I think I asked for this help before but forgot how to do it, I'm sorry and will make sure to post what you send next somewhere where I will always get it.
(In reply to comment #2)
> Could you please guide
> me on how to obtain the sources of this kernel?

You can get the source RPM from here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=172010

All Fedora kernels are here:
http://koji.fedoraproject.org/koji/packageinfo?packageID=8

PS. Was just passing by and saw your question.
(In reply to comment #2)
> What hardware is this?

It's a Sony Vaio VGN-SZ691N. Here are the iwlagn-specific log entries:

iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 2.6.33.3-85.fc13.x86_64-kds
iwlagn: Copyright(c) 2003-2009 Intel Corporation
iwlagn 0000:06:00.0: power state changed by ACPI to D0
iwlagn 0000:06:00.0: power state changed by ACPI to D0
iwlagn 0000:06:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
iwlagn 0000:06:00.0: setting latency timer to 64
iwlagn 0000:06:00.0: Detected Intel Wireless WiFi Link 4965AGN REV=0x4
iwlagn 0000:06:00.0: Tunable channels: 11 802.11bg, 13 802.11a channels
iwlagn 0000:06:00.0: irq 30 for MSI/MSI-X
iwlagn 0000:06:00.0: firmware: requesting iwlwifi-4965-2.ucode
iwlagn 0000:06:00.0: loaded firmware version 228.61.2.24
iwlagn 0000:06:00.0: iwl_tx_agg_start on ra = 00:22:6b:f8:63:46 tid = 0
iwlagn 0000:06:00.0: iwl_tx_agg_start on ra = 00:22:6b:f8:63:46 tid = 0
iwlagn 0000:06:00.0: power state changed by ACPI to D3
iwlagn 0000:06:00.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
iwlagn 0000:06:00.0: restoring config space at offset 0x4 (was 0x4, writing 0xf8000004)
iwlagn 0000:06:00.0: restoring config space at offset 0x3 (was 0x0, writing 0x10)
iwlagn 0000:06:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100006)
iwlagn 0000:06:00.0: power state changed by ACPI to D0
iwlagn 0000:06:00.0: power state changed by ACPI to D0
iwlagn 0000:06:00.0: power state changed by ACPI to D0
iwlagn 0000:06:00.0: power state changed by ACPI to D0

>
> This kernel seems significantly different from 2.6.33.3. Could you please guide
> me on how to obtain the sources of this kernel? I think I asked for this help
> before but forgot how to do it, I'm sorry and will make sure to post what you
> send next somewhere where I will always get it.
The info from comment 3 is correct. FWIW, this kernel has the patches which were recently discussed on the stable list.
(In reply to comment #5)
> The info from comment 3 is correct. FWIW, this kernel has the patches which
> were recently discussed on the stable list.

Any chance there are instructions out there to get the kernel source if you are not running Fedora?

In the meantime, since this has the RF reset code, I think we are looking at another incarnation of an internal scan race here. If you are supporting internal scanning (RF reset) then you really need the recent scan race fixes that Johannes and I sent upstream. Johannes's patch made it to linux-2.6:

commit 88be026490ed89c2ffead81a52531fbac5507e01
Author: Johannes Berg <johannes.berg>
Date:   Wed Apr 7 00:21:36 2010 -0700

    iwlwifi: fix scan races

Mine didn't. It can be found on iwlwifi-2.6's wireless-2.6 branch and was submitted at:
http://thread.gmane.org/gmane.linux.kernel.wireless.general/50897/focus=50899

Since you need these two anyway, any chance to build a kernel with them and retest?
http://koji.fedoraproject.org/koji/taskinfo?taskID=2194306 Test kernel above has the patches Reinette recommended in comment 6. Please give them a try and post the results here -- thanks!
Reinette, as for getting the sources w/o running Fedora...that could be difficult. Perhaps your local distro has rpm available? If so, then rpmbuild may still be the right tool to use. Otherwise, rpm2cpio (piped to cpio) can extract everything. But you would still need to unpack the tarball and apply any patches in the proper order. Maybe a virtual host running a Fedora image would be easier? :-)
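In case it helps, the rpm2cpio route could look roughly like the Python sketch below. It assumes rpm2cpio and cpio are installed and on the PATH; the function names and the example file name kernel.src.rpm are placeholders of mine, not anything shipped by Fedora.

```python
import subprocess


def srpm_extract_commands(srpm_path):
    """Build the argv lists for the pipeline `rpm2cpio <srpm> | cpio -idmv`."""
    unpack = ["rpm2cpio", srpm_path]
    # cpio flags: -i extract, -d create directories, -m keep mtimes, -v list files
    extract = ["cpio", "-idmv"]
    return unpack, extract


def extract_srpm(srpm_path, dest_dir):
    """Unpack a source RPM into dest_dir, which must already exist."""
    unpack, extract = srpm_extract_commands(srpm_path)
    # rpm2cpio writes a cpio archive to stdout; cpio reads it from stdin.
    producer = subprocess.Popen(unpack, stdout=subprocess.PIPE)
    try:
        subprocess.run(extract, stdin=producer.stdout, cwd=dest_dir, check=True)
    finally:
        producer.stdout.close()
        producer.wait()
    if producer.returncode != 0:
        raise subprocess.CalledProcessError(producer.returncode, unpack)


# Example call (placeholder file name, not run here):
# extract_srpm("kernel.src.rpm", "kernel-src")
```

This only gets the tarball, the patches and the spec file onto disk. Unpacking the tarball and applying the patches in the right order is still manual; the spec file extracted alongside them should be the place to look for that order.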
(In reply to comment #7)
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2194306
>
> Test kernel above has the patches Reinette recommended in comment 6. Please
> give them a try and post the results here -- thanks!

Cheers, I've installed and am running the test kernel now. I'm going to try downloading a large file (Fedora ISO probably) and see how that fares. I'll report back tomorrow morning to say how it's going.

Thanks!
Yes, I can confirm the patch seems to be holding. I've been using it for over 24 hours with no crash. Many thanks!