Created attachment 344272 [details]
dmesg output from dumped kernel

How reproducible:
always

Steps to Reproduce:
1. Get a machine with Intel wifi
2. while true; do rmmod iwl3945; modprobe iwl3945; done
3. oops

Actual results:
oops

Expected results:
something ugly, but not a crash
Created attachment 344284 [details] contents of dmesg
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:932
ffffffff81052505:  4c 63 c0          movslq %eax,%r8
ffffffff81052508:  49 c1 e0 04       shl    $0x4,%r8
ffffffff8105250c:  4f 8b 0c 10       mov    (%r8,%r10,1),%r9
ffffffff81052510:  4f 8d 04 02       lea    (%r10,%r8,1),%r8
ffffffff81052514:  eb 13             jmp    ffffffff81052529 <get_next_timer_interrupt+0x110>
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:934
ffffffff81052516:  49 8b 79 10       mov    0x10(%r9),%rdi
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:932
ffffffff8105251a:  4d 89 d9          mov    %r11,%r9
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:934
ffffffff8105251d:  4c 39 e7          cmp    %r12,%rdi
ffffffff81052520:  4c 0f 48 e7       cmovs  %rdi,%r12
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:932
ffffffff81052524:  bf 01 00 00 00    mov    $0x1,%edi
ffffffff81052529:  4d 8b 19          mov    (%r9),%r11     <====
ffffffff8105252c:  4d 39 c1          cmp    %r8,%r9

        index = slot = timer_jiffies & TVN_MASK;
        do {
===>            list_for_each_entry(nte, varp->vec + slot, entry) {
                        found = 1;
                        if (time_before(nte->expires, expires))
                                expires = nte->expires;
                }
                /*
                 * Do we still search for the first timer or are
                 * we looking up the cascade buckets ?
                 */
                if (found) {
                        /* Look at the cascade bucket(s)? */
                        if (!index || slot < index)
                                break;
                        return expires;
                }
                slot = (slot + 1) & TVN_MASK;
        } while (slot != index);
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
We have a race in the patch:

linux-2.6-iwl3945-report-killswitch-changes-even-if-the-interface-is-down.patch

When the device is removed, iwl3945_pci_remove() first calls:

    cancel_delayed_work_sync(&priv->rfkill_poll);

and then:

    ieee80211_unregister_hw(priv->hw)
      -> iwl3945_mac_stop(struct ieee80211_hw *hw)
        -> queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
                              round_jiffies_relative(2 * HZ));

So after the module is unloaded we can still have an armed timer, which accesses data in the memory region of the unloaded module.

The race is fixed in mainline by commit:

commit d552bfb65241a35d48e44ddb0d27e0454f579ab4
Author: Kolekar, Abhijeet <abhijeet.kolekar>
Date:   Fri Dec 19 10:37:41 2008 +0800

    iwl3945: release resources before shutting down

The commit applies almost cleanly to the Fedora kernel sources. I tested it and can no longer reproduce the oops.
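To make the ordering problem above easier to see, here is a minimal user-space sketch in C. It is only a model, not driver code: every fake_* name is a hypothetical stand-in for the kernel function of similar name, and the delayed work is reduced to a single "armed" flag. The second ordering is just one that leaves no timer armed in this model; it is not claimed to be what commit d552bfb6 actually does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item: tracks only whether its timer is armed. */
struct fake_delayed_work {
    bool armed;
};

/* Hypothetical stand-in for the driver's private data. */
struct fake_priv {
    struct fake_delayed_work rfkill_poll;
};

/* Models queue_delayed_work(): arms the timer. */
static void fake_queue_delayed_work(struct fake_delayed_work *work)
{
    work->armed = true;
}

/* Models cancel_delayed_work_sync(): disarms the timer. */
static void fake_cancel_delayed_work_sync(struct fake_delayed_work *work)
{
    work->armed = false;
}

/* Models iwl3945_mac_stop() with the Fedora patch applied: it re-queues the rfkill poll. */
static void fake_mac_stop(struct fake_priv *priv)
{
    fake_queue_delayed_work(&priv->rfkill_poll);
}

/* Models ieee80211_unregister_hw(), which ends up calling the stop callback. */
static void fake_unregister_hw(struct fake_priv *priv)
{
    fake_mac_stop(priv);
}

/* Order described in the comment above: cancel first, then unregister.
 * Returns true if a timer is still armed when the "module" would be unloaded. */
static bool remove_buggy_order(void)
{
    struct fake_priv priv = { .rfkill_poll = { .armed = true } };

    fake_cancel_delayed_work_sync(&priv.rfkill_poll);
    fake_unregister_hw(&priv);          /* re-arms the timer after the cancel */
    return priv.rfkill_poll.armed;
}

/* Assumed alternative order for this model only: unregister first, cancel last. */
static bool remove_safe_order(void)
{
    struct fake_priv priv = { .rfkill_poll = { .armed = true } };

    fake_unregister_hw(&priv);
    fake_cancel_delayed_work_sync(&priv.rfkill_poll);
    return priv.rfkill_poll.armed;
}

/* Checks both orderings of the model. */
void run_model_checks(void)
{
    assert(remove_buggy_order());       /* timer left armed: the race */
    assert(!remove_safe_order());       /* nothing left armed */
}
```

Calling run_model_checks() from a main() exercises both paths. In the real driver the leftover armed timer later fires (or is walked by the timer code) after the module memory is gone, which would match the crash in get_next_timer_interrupt shown earlier.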
Update. After some more testing I discovered we still have issues with modprobe/rmmod. When NetworkManager is running and I do:

while true; do modprobe iwl3945; rmmod iwl3945 ; done
Ctrl + C
modprobe iwl3945

a kernel BUG occurs:

------------[ cut here ]------------
kernel BUG at drivers/net/wireless/iwlwifi/iwl3945-base.c:3352!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/firmware/0000:03:00.0/loading
Modules linked in: iwl3945 fuse rfcomm bridge stp llc bnep sco l2cap sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath uinput snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm arc4 snd_timer yenta_socket rsrc_nonstatic snd soundcore ecb joydev mac80211 btusb bluetooth i2c_i801 lib80211 iTCO_wdt iTCO_vendor_support e1000e snd_page_alloc nsc_ircc irda crc_ccitt cfg80211 thinkpad_acpi hwmon pcspkr i915 drm i2c_algo_bit i2c_core video output [last unloaded: iwl3945]

Pid: 0, comm: swapper Not tainted (2.6.29.5my #1) 6369CTO
EIP: 0060:[<f8c5dff5>] EFLAGS: 00010097 CPU: 0
EIP is at iwl3945_irq_tasklet+0x499/0x7be [iwl3945]
EAX: e84b0000 EBX: e8418e60 ECX: ef8ae000 EDX: 00000000
ESI: e8419788 EDI: e84b6004 EBP: c08ffec8 ESP: c08ffe7c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c08fe000 task=c0889350 task.ti=c08fe000)
Stack:
 e841b48c e841cde8 e8419758 0002f902 e841b498 e841b69c 801c122c e84193d4
 00000282 00000008 00000002 00010000 00000001 00000000 80000008 00000001
 e841ce98 e841ce9c c096fc00 c08ffee4 c0439a9c 00000000 c096c690 00000001
Call Trace:
 [<c0439a9c>] ? tasklet_action+0x8b/0xf7
 [<c0439f39>] ? __do_softirq+0x99/0x139
 [<c043a02b>] ? do_softirq+0x52/0x7e
 [<c043a196>] ? irq_exit+0x49/0x77
 [<c040b00e>] ? do_IRQ+0x97/0xad
 [<c0409b2c>] ? common_interrupt+0x2c/0x34
 [<c05b6d48>] ? acpi_idle_enter_bm+0x25f/0x2a9
 [<c06748e0>] ? cpuidle_idle_call+0x65/0x9d
 [<c04085f0>] ? cpu_idle+0x72/0x92
 [<c0705010>] ? rest_init+0x58/0x5a
Code: f0 00 0f 84 3e 01 00 00 8b 4e 08 85 c9 0f 84 28 01 00 00 8b 81 a8 00 00 00 66 8b 40 06 0f b6 d4 81 e2 bf 00 00 00 83 fa 04 74 04 <0f> 0b eb fe f6 c4 40 88 45 f0 8b bb 50 27 00 00 75 06 8a 45 f0
EIP: [<f8c5dff5>] iwl3945_irq_tasklet+0x499/0x7be [iwl3945] SS:ESP 0068:c08ffe7c
This second oops is known in mainline and was reported here: http://marc.info/?l=linux-wireless&m=123147215829854&w=2
According to the bug report mail thread, these additional fixes are needed:

commit df833b1d73680f9f9dc72cbc3215edbbc6ab740d
Author: Reinette Chatre <reinette.chatre>
Date:   Tue Apr 21 10:55:48 2009 -0700

    iwlwifi: DMA fixes

commit 8cd812bcda06645160b0b279e1a125271a73411c
Author: Winkler, Tomas <tomas.winkler>
Date:   Fri Dec 19 10:37:43 2008 +0800

    iwl3945: use iwl_rb_status

The commit "iwlwifi: DMA fixes" is not a trivial patch, and it is hard to backport as is without rebasing onto the previous commits ... hmm.
We applied

commit 638d0eb9197d1e285451f6594184fcfc9c2a5d44
Author: Chatre, Reinette <reinette.chatre>
Date:   Mon Jan 19 15:30:24 2009 -0800

    iwl3945: add debugging for wrong command queue

plus some other patches, and I can no longer reproduce this bug with the newer F11 kernel-2.6.29.6-217.2.3.fc11. So I'm closing this bug. Please reopen if you still have a problem with it.