Bug 501117
| Summary: | kernel oops on iwl3945 | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Lukas Bezdicka <social> |
| Component: | kernel | Assignee: | Stanislaw Gruszka <sgruszka> |
| Status: | CLOSED WORKSFORME | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 11 | CC: | a.steffan, itamar, kernel-maint, sgruszka |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-08-14 08:56:04 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Created attachment 344284 [details]
contents of dmesg
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:932
ffffffff81052505: 4c 63 c0 movslq %eax,%r8
ffffffff81052508: 49 c1 e0 04 shl $0x4,%r8
ffffffff8105250c: 4f 8b 0c 10 mov (%r8,%r10,1),%r9
ffffffff81052510: 4f 8d 04 02 lea (%r10,%r8,1),%r8
ffffffff81052514: eb 13 jmp ffffffff81052529 <get_next_timer_interrupt+0x110>
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:934
ffffffff81052516: 49 8b 79 10 mov 0x10(%r9),%rdi
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:932
ffffffff8105251a: 4d 89 d9 mov %r11,%r9
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:934
ffffffff8105251d: 4c 39 e7 cmp %r12,%rdi
ffffffff81052520: 4c 0f 48 e7 cmovs %rdi,%r12
/usr/src/debug/kernel-2.6.29/linux-2.6.29.x86_64/kernel/timer.c:932
ffffffff81052524: bf 01 00 00 00 mov $0x1,%edi
ffffffff81052529: 4d 8b 19 mov (%r9),%r11 <====
ffffffff8105252c: 4d 39 c1 cmp %r8,%r9
    index = slot = timer_jiffies & TVN_MASK;
    do {
===>    list_for_each_entry(nte, varp->vec + slot, entry) {
            found = 1;
            if (time_before(nte->expires, expires))
                expires = nte->expires;
        }
        /*
         * Do we still search for the first timer or are
         * we looking up the cascade buckets ?
         */
        if (found) {
            /* Look at the cascade bucket(s)? */
            if (!index || slot < index)
                break;
            return expires;
        }
        slot = (slot + 1) & TVN_MASK;
    } while (slot != index);
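
The loop marked above walks every timer in a slot and keeps the earliest expiry. A minimal user-space sketch of that scan follows. `jiffies_before`, `struct fake_timer` and `earliest_expiry` are names made up for illustration, and the list is NULL-terminated here, whereas the kernel uses circular lists. The comparison uses the same signed-difference idea as the kernel's `time_before()`, so it stays correct when the counter wraps.

```c
#include <stddef.h>

/* Wrap-safe "a is earlier than b". The name is hypothetical; the
 * signed-difference trick is the one time_before() relies on. */
static int jiffies_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/* Hypothetical stand-in for one timer queued in a slot. */
struct fake_timer {
    unsigned long expires;
    struct fake_timer *next;   /* NULL ends the list in this sketch */
};

/* Return the earliest expiry found in the list, or 'expires'
 * unchanged if no entry is earlier than it. */
static unsigned long earliest_expiry(const struct fake_timer *head,
                                     unsigned long expires)
{
    for (const struct fake_timer *t = head; t != NULL; t = t->next) {
        /* Each step dereferences the node; a node living in freed
         * module memory is what would fault at this point. */
        if (jiffies_before(t->expires, expires))
            expires = t->expires;
    }
    return expires;
}
```

The sketch only reads from each node, which is why a single stale entry left behind by an unloaded module is enough to crash an otherwise unrelated walk of the list.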
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'.
More information and the reason for this action are here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

We have a race in the patch:
linux-2.6-iwl3945-report-killswitch-changes-even-if-the-interface-is-down.patch
When the device is removed, iwl3945_pci_remove() first calls:
cancel_delayed_work_sync(&priv->rfkill_poll);
and then:
ieee80211_unregister_hw(priv->hw)
  -> iwl3945_mac_stop(struct ieee80211_hw *hw)
    -> queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
                          round_jiffies_relative(2 * HZ));
So after the module is unloaded we can still have an armed timer, which
accesses data in the module's memory region. The race is fixed in mainline by commit:
commit d552bfb65241a35d48e44ddb0d27e0454f579ab4
Author: Kolekar, Abhijeet <abhijeet.kolekar>
Date: Fri Dec 19 10:37:41 2008 +0800
iwl3945: release resources before shutting down
The commit applies almost cleanly to the Fedora kernel sources; I tested it and cannot reproduce the oops.
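
The ordering problem described in this comment can be modelled in user space. The sketch below is an illustration only: `struct model_dev` and every function in it are hypothetical names, not driver code, and the "cancel last" variant is just one order that is safe in this model. It is not a statement of what commit d552bfb actually changes.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the device state. */
struct model_dev {
    bool poll_armed;      /* stands in for a pending rfkill_poll work item */
    bool module_loaded;
};

/* Models cancel_delayed_work_sync(&priv->rfkill_poll). */
static void model_cancel_poll(struct model_dev *d)
{
    d->poll_armed = false;
}

/* Models iwl3945_mac_stop() with the Fedora patch applied:
 * stopping the interface re-queues the poll work. */
static void model_mac_stop(struct model_dev *d)
{
    d->poll_armed = true;
}

/* Order described in the report: cancel, then unregister (which stops).
 * Returns true if a timer is still armed after the module is gone. */
static bool remove_cancel_first(struct model_dev *d)
{
    model_cancel_poll(d);
    model_mac_stop(d);
    d->module_loaded = false;
    return d->poll_armed;
}

/* One order that is safe in this model: stop first, cancel last. */
static bool remove_cancel_last(struct model_dev *d)
{
    model_mac_stop(d);
    model_cancel_poll(d);
    d->module_loaded = false;
    return d->poll_armed;
}
```

In this model `remove_cancel_first()` leaves the poll armed after unload, which corresponds to the stale timer seen in the oops, while `remove_cancel_last()` does not.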
Update. After some more testing we discovered that we still have issues with modprobe/rmmod. When NetworkManager is running and I do:

while true; do modprobe iwl3945; rmmod iwl3945 ; done
Ctrl + C
modprobe iwl3945

a kernel bug occurs:

------------[ cut here ]------------
kernel BUG at drivers/net/wireless/iwlwifi/iwl3945-base.c:3352!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/firmware/0000:03:00.0/loading
Modules linked in: iwl3945 fuse rfcomm bridge stp llc bnep sco l2cap sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath uinput snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm arc4 snd_timer yenta_socket rsrc_nonstatic snd soundcore ecb joydev mac80211 btusb bluetooth i2c_i801 lib80211 iTCO_wdt iTCO_vendor_support e1000e snd_page_alloc nsc_ircc irda crc_ccitt cfg80211 thinkpad_acpi hwmon pcspkr i915 drm i2c_algo_bit i2c_core video output [last unloaded: iwl3945]
Pid: 0, comm: swapper Not tainted (2.6.29.5my #1) 6369CTO
EIP: 0060:[<f8c5dff5>] EFLAGS: 00010097 CPU: 0
EIP is at iwl3945_irq_tasklet+0x499/0x7be [iwl3945]
EAX: e84b0000 EBX: e8418e60 ECX: ef8ae000 EDX: 00000000
ESI: e8419788 EDI: e84b6004 EBP: c08ffec8 ESP: c08ffe7c
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c08fe000 task=c0889350 task.ti=c08fe000)
Stack:
e841b48c e841cde8 e8419758 0002f902 e841b498 e841b69c 801c122c e84193d4
00000282 00000008 00000002 00010000 00000001 00000000 80000008 00000001
e841ce98 e841ce9c c096fc00 c08ffee4 c0439a9c 00000000 c096c690 00000001
Call Trace:
[<c0439a9c>] ? tasklet_action+0x8b/0xf7
[<c0439f39>] ? __do_softirq+0x99/0x139
[<c043a02b>] ? do_softirq+0x52/0x7e
[<c043a196>] ? irq_exit+0x49/0x77
[<c040b00e>] ? do_IRQ+0x97/0xad
[<c0409b2c>] ? common_interrupt+0x2c/0x34
[<c05b6d48>] ? acpi_idle_enter_bm+0x25f/0x2a9
[<c06748e0>] ? cpuidle_idle_call+0x65/0x9d
[<c04085f0>] ? cpu_idle+0x72/0x92
[<c0705010>] ? rest_init+0x58/0x5a
Code: f0 00 0f 84 3e 01 00 00 8b 4e 08 85 c9 0f 84 28 01 00 00 8b 81 a8 00 00 00 66 8b 40 06 0f b6 d4 81 e2 bf 00 00 00 83 fa 04 74 04 <0f> 0b eb fe f6 c4 40 88 45 f0 8b bb 50 27 00 00 75 06 8a 45 f0
EIP: [<f8c5dff5>] iwl3945_irq_tasklet+0x499/0x7be [iwl3945] SS:ESP 0068:c08ffe7c

This second oops is known in mainline and was reported here: http://marc.info/?l=linux-wireless&m=123147215829854&w=2

According to the bug report mail thread, these additional fixes are needed:
commit df833b1d73680f9f9dc72cbc3215edbbc6ab740d
Author: Reinette Chatre <reinette.chatre>
Date: Tue Apr 21 10:55:48 2009 -0700
iwlwifi: DMA fixes
commit 8cd812bcda06645160b0b279e1a125271a73411c
Author: Winkler, Tomas <tomas.winkler>
Date: Fri Dec 19 10:37:43 2008 +0800
iwl3945: use iwl_rb_status
Commit "iwlwifi: DMA fixes" is not a trivial patch, and it is hard to backport as is without rebasing it onto the previous commits ... hmm.
We applied
commit 638d0eb9197d1e285451f6594184fcfc9c2a5d44
Author: Chatre, Reinette <reinette.chatre>
Date: Mon Jan 19 15:30:24 2009 -0800
iwl3945: add debugging for wrong command queue
plus some other patches, and I can no longer reproduce this bug with the newer F11 kernel-2.6.29.6-217.2.3.fc11. So I'm closing this bug. Please reopen it if you still have this problem.

Created attachment 344272 [details]
dmesg output from dumped kernel

How reproducible:
always

Steps to Reproduce:
1. get machine with intel wifi
2. while true; do rmmod iwl3945; modprobe iwl3945; done
3. oops

Actual results:
oops

Expected results:
something ugly but not crash