| Summary: | Kernel OOPS, eventual subsequent PANIC when unbinding ehci_hcd device | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Zach C <FxChiP> | ||||||
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 15 | CC: | gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2011-09-26 19:12:03 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Attachments: |
Description
Zach C
2011-06-28 05:35:38 UTC
(In reply to comment #1)
> Description of problem:
>
> Following the instructions for the workaround at bug 694191 -- that is, adding
> a script to /etc/pm/sleep.d/20_custom-ehci_hcd to echo device IDs to "unbind"
> in order to detach from those USB hubs -- I discovered that, before the suspend
> happens, the kernel will OOPS and then PANIC soon after.
>

You need to post the actual messages you get, otherwise there's not much anyone can do about them.

Is there an easy way to do that besides transcribing them twice?

(In reply to comment #2)
> Is there an easy way to do that besides transcribing them twice?

Take a picture of the screen with a digital camera and attach that.

I did manage to transcribe this kernel panic after one occurrence:

Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G D 2.6.38-8-32.fc15.x86_64 #1
Call Trace:
<IRQ> [<ffffffff8146c6e6>] panic+0x91/0x19c
[<ffffffff81476cc6>] oops_end+0xb4/0xc5
[<ffffffff8100d454>] die+0x5a/0x66
[<ffffffff814765c8>] do_trap+0x121/0x130
[<ffffffff8100aeaa>] do_invalid_op+0x94/0x9d
[<ffffffff81257bbd>] ? alloc_iova+0x184/0x1dc
[<ffffffff8106acc7>] ? queue_work_on+0x37/0x45
[<ffffffff8106ad0e>] ? ieee80211_queue_work+0x2e/0x35 [mac80211]
[<ffffffff8100a85b>] invalid_op+0x1b/0x20
[<ffffffff81257bbd>] ? alloc_iova+0x184/0x1dc
[<ffffffff814759c4>] ? _raw_spin_unlock_irqrestore+0x17/0x19
[<ffffffff810615b0>] ? __mod_timer+0x138/0x14a
[<ffffffff8125ab98>] intel_alloc_iova+0x86/0xbc
[<ffffffff8125af94>] __intel_map_single+0x9b/0x171
[<ffffffff8125b06a>] ? intel_map_page+0x0/0x43
[<ffffffff8125b0ab>] intel_map_page+0x41/0x43
[<ffffffffa04dd341>] dma_map_single_attrs.constprop.7+0x65/0x80 [ath9k]
[<ffffffffa04de432>] ath_rx_tasklet+0x8fa/0x12f6 [ath9k]
[<ffffffff810615b0>] ? __mod_timer+0x138/0x14a
[<ffffffff814759c4>] ? _raw_spin_unlock_irqrestore+0x17/0x19
[<ffffffffa04dc24f>] ath9k_tasklet+0xa3/0x11b [ath9k]
[<ffffffff8105a849>] tasklet_action+0x7f/0xd2
[<ffffffff8105ae4c>] __do_softirq+0xd2/0x19d
[<ffffffff810226b9>] ? ack_APIC_irq+0x15/0x17
[<ffffffff8100fc99>] ? paravirt_read_tsc+0x9/0xd
[<ffffffff8100aadc>] call_softirq+0x1c/0x30
[<ffffffff8100c101>] do_softirq+0x46/0x81
[<ffffffff8105afd0>] irq_exit+0x49/0x8b
[<ffffffff8147c006>] do_IRQ+0x8e/0xa5
[<ffffffff81475f13>] ret_from_intr+0x0/0x15
<EOI> [<ffffffff8100fc99>] ? paravirt_read_tsc+0x9/0xd
[<ffffffff81274567>] ? intel_idle+0xdb/0x100
[<ffffffff81274546>] ? intel_idle+0xba/0x100
[<ffffffff81398d54>] cpuidle_idle_call+0xe7/0x166
[<ffffffff81008321>] cpu_idle+0xa5/0xdf
[<ffffffff81454cde>] rest_init+0x72/0x74
[<ffffffff81b58c2f>] start_kernel+0x3f2/0x3fe
[<ffffffff81b582c4>] x86_64_start_reservations+0xaf/0xb3
[<ffffffff81b58140>] ? early_idt_handler+0x0/0x71
[<ffffffff81b583cf>] x86_64_start_kernel+0x107/0x116
panic occurred, switching back to text console
----

I would have gotten an example of an OOPS, but those moved way too fast for me to type up.

Created attachment 510603 [details]
Crash snapshot
One picture taken of the screen during a panic/oops
Created attachment 510605 [details]
Crash snapshot 2
Another crash snapshot taken (sorry, the camera and cameraman aren't anywhere near the quality they should be for this ;) )
These all only ever occur after unbinding ehci_hcd from the PCI devices it's bound to, even though they all look like they fail in different places. This failure also occurs on a kernel I compiled myself (2.6.39-ck2). I've also tried recompiling with ehci_hcd as a module and simply doing an rmmod on it; that panics the kernel far more quickly!

For comparison purposes (if it helps at all), the previous version of Linux Mint worked, and the last time I checked, Ubuntu 11.04 worked as well. No kernel I have for F15 has worked, thus far.

I have tried with KMS disabled (by both the nomodeset and radeon.modeset=0 boot args), pcie_aspm=force, these same things with the aforementioned 2.6.39-ck2, and also applied the fix at http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blobdiff;f=drivers/pci/iova.c;h=c5c274ab5c5a034abe91fb1f1f5dcf6380c9315e;hp=9606e599a47552f9119425c077f62a0c807d3b9f;hb=b0af8dfdd67699e25083478c63eedef2e72ebd85;hpb=25985edcedea6396277003854657b5f3cb31a628 (manually, but double-checked) to that same kernel and still no luck. (Why the last one? Some of the messages mentioned alloc_iova being at fault...)

Vanilla kernel 3.0.0-rc6 fixes this issue for me and suspend again works as expected.

(In reply to comment #8)
> Vanilla kernel 3.0.0-rc6 fixes this issue for me and suspend again works as
> expected.

F15 is currently based on the final 3.0 release (2.6.40 is 3.0 renamed). I'm going to close this bug out given the fix should be included there. If this is still a problem, please reopen the bug.
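
For readers trying to reproduce the trigger: the report describes a pm-utils hook at /etc/pm/sleep.d/20_custom-ehci_hcd that echoes device IDs to "unbind". The actual script from bug 694191 is not quoted in this report, so the following is only a rough sketch of what such a hook might look like. The state file path, the function names, and the rebind-on-resume step are assumptions for illustration; the sysfs `unbind`/`bind` files and the hook arguments (`suspend`, `hibernate`, `resume`, `thaw`) are the standard kernel and pm-utils interfaces.

```sh
#!/bin/sh
# Hypothetical sketch of a hook like /etc/pm/sleep.d/20_custom-ehci_hcd.
# Not the script from bug 694191; paths below are overridable assumptions.
DRIVER_DIR="${DRIVER_DIR:-/sys/bus/pci/drivers/ehci_hcd}"
STATE_FILE="${STATE_FILE:-/var/run/ehci_hcd-unbound}"   # assumed location

# Detach ehci_hcd from every PCI device it is currently bound to,
# remembering the device IDs so they can be rebound later.
unbind_all() {
    : > "$STATE_FILE"
    for dev in "$DRIVER_DIR"/????:??:??.?; do
        [ -e "$dev" ] || continue          # glob matched nothing
        id=$(basename "$dev")
        echo "$id" >> "$STATE_FILE"
        echo "$id" > "$DRIVER_DIR/unbind"
    done
}

# Reattach the driver to the devices recorded by unbind_all.
rebind_all() {
    [ -f "$STATE_FILE" ] || return 0
    while read -r id; do
        echo "$id" > "$DRIVER_DIR/bind"
    done < "$STATE_FILE"
    rm -f "$STATE_FILE"
}

# pm-utils passes the sleep phase as the first argument.
case "${1:-}" in
    suspend|hibernate) unbind_all ;;
    resume|thaw)       rebind_all ;;
esac
```

On an affected F15 kernel, the write to `unbind` is the step this report associates with the OOPS, so a sketch like this should only be run on a machine where a crash is acceptable.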