Bug 890780 - panic: exception in interrupt, possibly xhci related
Summary: panic: exception in interrupt, possibly xhci related
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: x86_64
OS: Linux
unspecified
unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-12-29 22:54 UTC by James Hogan
Modified: 2013-06-17 18:29 UTC (History)
6 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-06-17 18:29:59 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
Enhanced photograph of panic screen (155.38 KB, image/png)
2012-12-29 22:54 UTC, James Hogan
no flags Details

Description James Hogan 2012-12-29 22:54:57 UTC
Created attachment 670264 [details]
Enhanced photograph of panic screen

Description of problem:

Kernel panic (see attached photograph) with probably misbehaving USB device

Version-Release number of selected component (if applicable):

2.6.10-2.fc17.x86_64

How reproducible:

only seen once, not attempted to reproduce though I can give it a try if it's helpful.

Steps to Reproduce:

I had a device connected by USB device which was in a fairly fast restart loop which was causing some messages in the kernel log (like it was enumerating before the device was ready). Then suddenly it switched to text console with panic. I may be able to reproduce, but haven't attempted yet.
  
Actual results:

kernel panic. Switched to console and displayed the info in attached picture (some typed up below).

Additional info:

/var/log/messages running up to the point of panic:

Dec 28 21:39:51 balrog kernel: [11917.069992] usb 8-2: reset high-speed USB device number 15 using xhci_hcd
Dec 28 21:45:59 balrog kernel: [12284.523832] usb 8-2: reset high-speed USB device number 15 using xhci_hcd
Dec 28 21:46:18 balrog kernel: [12302.879244] usb 8-2: USB disconnect, device number 15
Dec 28 21:46:24 balrog kernel: [12309.328000] usb 8-2: new high-speed USB device number 16 using xhci_hcd
Dec 28 21:46:29 balrog kernel: [12314.324156] usb 8-2: device descriptor read/8, error -110
Dec 28 21:46:34 balrog kernel: [12319.420781] usb 8-2: device descriptor read/8, error -110
Dec 28 21:46:34 balrog kernel: [12319.673835] usb 8-2: new high-speed USB device number 17 using xhci_hcd
Dec 28 21:46:39 balrog kernel: [12324.658811] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:46:45 balrog kernel: [12329.844212] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:46:45 balrog kernel: [12330.044595] usb 8-2: device not accepting address 17, error -62
Dec 28 21:46:45 balrog kernel: [12330.044743] xhci_hcd 0000:02:00.0: Bad Slot ID 2
Dec 28 21:46:45 balrog kernel: [12330.044754] xhci_hcd 0000:02:00.0: Could not allocate xHCI USB device data structures
Dec 28 21:46:45 balrog kernel: [12330.044765] hub 8-0:1.0: couldn't allocate port 2 usb_device
Dec 28 21:47:01 balrog kernel: [12346.483121] usb 8-2: new high-speed USB device number 18 using xhci_hcd
Dec 28 21:47:06 balrog kernel: [12351.479168] usb 8-2: device descriptor read/8, error -110
Dec 28 21:47:11 balrog kernel: [12356.575937] usb 8-2: device descriptor read/8, error -110
Dec 28 21:47:12 balrog kernel: [12356.828958] usb 8-2: new high-speed USB device number 19 using xhci_hcd
Dec 28 21:47:17 balrog kernel: [12361.813958] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:47:22 balrog kernel: [12366.999331] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:47:22 balrog kernel: [12367.199738] usb 8-2: device not accepting address 19, error -62
Dec 28 21:47:22 balrog kernel: [12367.199880] xhci_hcd 0000:02:00.0: Bad Slot ID 3
Dec 28 21:47:22 balrog kernel: [12367.199890] xhci_hcd 0000:02:00.0: Could not allocate xHCI USB device data structures
Dec 28 21:47:22 balrog kernel: [12367.199902] hub 8-0:1.0: couldn't allocate port 2 usb_device
Dec 28 21:47:39 balrog kernel: [12383.594380] usb 8-2: new high-speed USB device number 20 using xhci_hcd
Dec 28 21:47:44 balrog kernel: [12388.590485] usb 8-2: device descriptor read/8, error -110
Dec 28 21:47:49 balrog kernel: [12393.687214] usb 8-2: device descriptor read/8, error -110
Dec 28 21:47:49 balrog kernel: [12393.940211] usb 8-2: new high-speed USB device number 21 using xhci_hcd
Dec 28 21:47:54 balrog kernel: [12398.925198] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:47:59 balrog kernel: [12404.110568] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:47:59 balrog kernel: [12404.310978] usb 8-2: device not accepting address 21, error -62
Dec 28 21:47:59 balrog kernel: [12404.311147] xhci_hcd 0000:02:00.0: Bad Slot ID 4
Dec 28 21:47:59 balrog kernel: [12404.311158] xhci_hcd 0000:02:00.0: Could not allocate xHCI USB device data structures
Dec 28 21:47:59 balrog kernel: [12404.311170] hub 8-0:1.0: couldn't allocate port 2 usb_device
Dec 28 21:48:16 balrog kernel: [12420.886103] usb 8-2: new high-speed USB device number 22 using xhci_hcd
Dec 28 21:48:21 balrog kernel: [12425.882233] usb 8-2: device descriptor read/8, error -110
Dec 28 21:48:26 balrog kernel: [12430.978942] usb 8-2: device descriptor read/8, error -110
Dec 28 21:48:26 balrog kernel: [12431.231922] usb 8-2: new high-speed USB device number 23 using xhci_hcd
Dec 28 21:48:31 balrog kernel: [12436.216925] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:48:37 balrog kernel: [12441.402308] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:48:37 balrog kernel: [12441.602707] usb 8-2: device not accepting address 23, error -62
Dec 28 21:48:37 balrog kernel: [12441.602855] xhci_hcd 0000:02:00.0: Bad Slot ID 5
Dec 28 21:48:37 balrog kernel: [12441.602862] xhci_hcd 0000:02:00.0: Could not allocate xHCI USB device data structures
Dec 28 21:48:37 balrog kernel: [12441.602871] hub 8-0:1.0: couldn't allocate port 2 usb_device
Dec 28 21:48:53 balrog kernel: [12458.029266] usb 8-2: new high-speed USB device number 24 using xhci_hcd
Dec 28 21:48:58 balrog kernel: [12463.025396] usb 8-2: device descriptor read/8, error -110
Dec 28 21:49:00 balrog kernel: [12464.739934] usb 8-2: device descriptor read/8, error -71
Dec 28 21:49:00 balrog kernel: [12464.890625] hub 8-0:1.0: unable to enumerate USB device on port 2
Dec 28 21:49:09 balrog kernel: [12473.327180] usb 8-2: new high-speed USB device number 26 using xhci_hcd
Dec 28 21:49:14 balrog kernel: [12478.323315] usb 8-2: device descriptor read/8, error -110
Dec 28 21:49:19 balrog kernel: [12483.419924] usb 8-2: device descriptor read/8, error -110
Dec 28 21:49:19 balrog kernel: [12483.673031] usb 8-2: new high-speed USB device number 27 using xhci_hcd
Dec 28 21:49:24 balrog kernel: [12488.658016] xhci_hcd 0000:02:00.0: Timeout while waiting for address device command
Dec 28 21:49:24 balrog kernel: [12488.858404] xhci_hcd 0000:02:00.0: ERROR: unexpected command completion code 0x0.
Dec 28 21:49:24 balrog kernel: [12489.058819] usb 8-2: device not accepting address 27, error -22
Dec 28 21:49:24 balrog kernel: [12489.211362] usb 8-2: new high-speed USB device number 28 using xhci_hcd
Dec 28 21:49:29 balrog kernel: [12494.207430] usb 8-2: device descriptor read/8, error -110
Dec 28 21:49:35 balrog kernel: [12499.304186] usb 8-2: device descriptor read/8, error -110
Dec 28 21:49:35 balrog kernel: [12499.557212] usb 8-2: new high-speed USB device number 29 using xhci_hcd
Dec 28 21:49:39 balrog kernel: [12503.955903] usb 8-2: Device not responding to set address.
Dec 28 21:49:39 balrog kernel: [12504.156451] usb 8-2: Device not responding to set address.
Dec 28 21:49:40 balrog kernel: [12504.356757] usb 8-2: device not accepting address 29, error -71
Dec 28 21:49:40 balrog kernel: [12504.356796] hub 8-0:1.0: unable to enumerate USB device on port 2
Dec 28 21:49:46 balrog kernel: [12510.396551] usb 8-2: new high-speed USB device number 30 using xhci_hcd
Dec 28 21:49:51 balrog kernel: [12515.392728] usb 8-2: device descriptor read/8, error -110
Dec 28 21:49:56 balrog kernel: [12520.489324] usb 8-2: device descriptor read/8, error -110
Dec 28 21:49:56 balrog kernel: [12520.742385] usb 8-2: new high-speed USB device number 31 using xhci_hcd
Dec 28 21:49:57 balrog kernel: [12521.625727] usb 8-2: Device not responding to set address.
Dec 28 21:49:57 balrog kernel: [12521.826223] usb 8-2: Device not responding to set address.
Dec 28 21:49:57 balrog kernel: [12522.026536] usb 8-2: device not accepting address 31, error -71
Dec 28 21:49:57 balrog kernel: [12522.077409] hub 8-0:1.0: unable to enumerate USB device on port 2
Dec 28 21:53:05 balrog kernel: [12708.732307] usb 8-2: new high-speed USB device number 33 using xhci_hcd
Dec 28 21:53:10 balrog kernel: [12Dec 28 21:59:14 balrog kernel: imklog 5.8.10, log source = /proc/kmsg started.
Dec 28 21:59:14 balrog rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="833" x-info="http://www.rsyslog.com"] start
Dec 28 21:59:14 balrog kernel: [    0.000000] Initializing cgroup subsys cpuset
Dec 28 21:59:14 balrog kernel: [    0.000000] Initializing cgroup subsys cpu
Dec 28 21:59:14 balrog kernel: [    0.000000] Linux version 3.6.10-2.fc17.x86_64 (mockbuild@) (gcc version 4.7.2 20120921 (Red Hat 4.7.2-2) (GCC) ) #1 SMP Tue Dec 11 18:07:34 UTC 2012
...

bits of attached photograph I could be bothered to type up:

Pid: 0, comm: swapper/2 Not tainted 3.6.10-2.fc17.x86_64 #1 Gigabyte Technology Co., Ltd. GA890GPA-UD3H/GA-890GP
Call Trace:
<IRQ>
[<ffffffff810970be>] ? update_cfs_shares+0xbe/0x120
[<ffffffff8146f2cb>] xhci_irq+0x2ab/0x1330
[<ffffffff81099ccb>] ? rebalance_domains+0x10b/0x180
[<ffffffff81470361>] xhci_msi_irq+0x11/0x20
[<ffffffff810e88f4>] handle_irq_event_percpu+0x54/0x1f0
[<ffffffff810e8ad1>] handle_irq_event+0x41/0x70
[<ffffffff810eb2ef>] handle_edge_irq+0x6f/0x110
[<ffffffff810160ff>] handle_irq+0xbf/0x150
[<ffffffff81622ef2>] ? __atomic_notifier_call_chain+0x12/0x20
[<ffffffff81622f16>] ? atomic_notifier_call_chain+0x16/0x20
[<ffffffff81628aca>] do_IRQ+0x5a/0xe0
[<ffffffff8161f3aa>] common_interrupt+0x6a/0a6a
<EOI>
[<ffffffff81042fd6>] ? native_safe_halt+0x6/0x10
[<ffffffff8101c84f>] default_idle+0x4f/0x1a0
[<ffffffff8101ca3a>] amd_e400_idle+0x9a/0x120
[<ffffffff8101d57e>] cpu_idle+0xfe/0x120
[<ffffffff8160dcb9>] start_secondary+0x23e/0x240

RIP  [<ffffffff8146e5b8>] handle_cmd_completion+0x4b8/0xb90
 RSP <ffff88020fc83d48>

Kernel panic - not syncing: Fatal exception in interrupt
panic occurred, switching back to text console

Comment 1 James Hogan 2012-12-31 12:26:49 UTC
I've hit this again with very similar backtrace (only ? frames differ).

Installed debuginfo and poked around in gdb

(gdb) info line *(&handle_cmd_completion + 0x4b8)
Line 1466 of "drivers/usb/host/xhci-ring.c" starts at address 0xffffffff8146e5b0 <handle_cmd_completion+1200>
   and ends at 0xffffffff8146e5be <handle_cmd_completion+1214>.

code:
1466 xhci->devs[slot_id]->cmd_status = GET_COMP_CODE(le32_to_cpu(event->status));
1467 complete(&xhci->addr_dev);

disassembly:
   0xffffffff8146e5b0 <+1200>:  mov    0x110(%r12,%rax,8),%rax
   0xffffffff8146e5b8 <+1208>:  mov    %edx,0x1c60(%rax)
   0xffffffff8146e5be <+1214>:  callq  0xffffffff8108c440 <complete>

so looking at the register dump, %rax is 0, suggesting that xhci->devs[slot_id] == NULL.

I'm building a vanilla v3.7 kernel to see if it happens on that too.

Comment 2 Josh Boyer 2013-03-11 19:36:14 UTC
Are you still seeing this with 3.7.9 or 3.8.2 in updates-testing?

Comment 3 James Hogan 2013-03-24 11:42:22 UTC
Yes, it's still reproducible with 3.8.3-103.fc17.x86_64 from updates-testing.

Comment 4 Alex G. 2013-05-17 21:28:20 UTC
I'm seeing this with 3.6.11-1.fc17.x86_64

I connect a device that fails to enumerate. 10% of the time, I see this panic.

Comment 5 Josh Boyer 2013-06-03 19:06:11 UTC
Are you still seeing this with the 3.9.4 kernel in updates-testing?

Comment 6 Josh Boyer 2013-06-17 18:29:59 UTC
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 2 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.


Note You need to log in before you can comment on or make changes to this bug.