Bug 1121288 - UAS USB hangs and crashes kernel
Summary: UAS USB hangs and crashes kernel
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Hans de Goede
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-18 21:47 UTC by fedora
Modified: 2014-08-23 12:49 UTC
CC List: 13 users

Fixed In Version: kernel-3.15.7-200.fc20
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1128472 (view as bug list)
Environment:
Last Closed: 2014-08-01 06:02:02 UTC


Attachments
boot exceptions on Gigabyte (Eltron) motherboard (79.55 KB, image/jpeg)
2014-07-19 13:31 UTC, fedora
boot exceptions on Gigabyte (Eltron) motherboard (70.83 KB, image/jpeg)
2014-07-19 13:31 UTC, fedora
boot exceptions on Gigabyte (Eltron) motherboard (67.16 KB, image/jpeg)
2014-07-19 13:32 UTC, fedora
boot exceptions on Gigabyte (Eltron) motherboard (73.02 KB, image/jpeg)
2014-07-19 13:32 UTC, fedora
boot exceptions on Gigabyte (Eltron) motherboard (68.42 KB, image/jpeg)
2014-07-19 13:32 UTC, fedora
sections of /var/log/messages showing the kernel crash stack trace. (162.09 KB, text/plain)
2014-08-04 22:15 UTC, Hin-Tak Leung

Description fedora 2014-07-18 21:47:28 UTC
Description of problem:
Connecting a Pluggable USB3-SATA-UASP-1 device with an attached drive to a Gigabyte z68-ud3-b3 motherboard causes exceptions during boot and prevents the system from booting. Connecting the device after boot causes a complete system freeze. The behavior appeared only after upgrading from F19 to F20 with the 3.15 kernel.

The same type of device connected to a USB 3 port on an ATI motherboard does not cause a boot failure or system hang, but since I upgraded from the 3.14 kernel to the 3.15 kernel, the OS intermittently loses connectivity to the device and recovers after about 30 seconds.


Version-Release number of selected component (if applicable):
3.15.5-200.fc20.x86_64


How reproducible:
I've tested every combination of two separate USB devices and two separate cables across two different machines. In all cases, the device works when booted into the 3.14 kernel and fails under the 3.15 kernel.





Steps to Reproduce:
1. With no USB device connected
2. Boot machine to 3.15 kernel
3. Observe exception thrown the minute grub menu disappears

1. With USB 3 device connected
2. Boot machine to 3.15 kernel
3. Observe exception thrown the minute grub menu disappears
4. Observe system hang if using a Gigabyte z68-ud3 motherboard
5. Observe intermittent loss of connection to the USB drive after mounting it on other motherboards


Actual results:
Failure to boot; connectivity to the drive hangs

Expected results:
Functional access to drive

Additional info:
I've read several of the blog sites, specifically here:
http://hansdegoede.livejournal.com/14660.html
 and here
https://lists.fedoraproject.org/pipermail/kernel/2014-May/005231.html

which make clear that the UAS (USB Attached SCSI) protocol is used to talk to high-speed USB 3 devices only as of the 3.15 kernel. I have yet to find a workaround to disable this protocol and verify that it is the point of failure, but it does appear to be the only thing that has changed since this last worked. Here is a message log dump of a USB loss of connectivity and the corresponding recovery.


Jul 18 09:12:44 fedorahost kernel: [151866.171361] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731f480 tag 7, inflight: CMD OUT
Jul 18 09:12:44 fedorahost kernel: sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731f480 tag 7, inflight: CMD OUT
Jul 18 09:12:47 fedorahost kernel: [151869.168824] scsi host13: uas_eh_task_mgmt: ABORT TASK timed out
Jul 18 09:12:47 fedorahost kernel: [151869.168839] sd 13:0:0:0: uas_eh_device_reset_handler
Jul 18 09:12:47 fedorahost kernel: [151869.168841] scsi host13: uas_eh_task_mgmt: LOGICAL UNIT RESET: error already running a task
Jul 18 09:12:47 fedorahost kernel: [151869.168844] scsi host13: uas_eh_bus_reset_handler start
Jul 18 09:12:47 fedorahost kernel: [151869.169002] usb 10-2: stat urb: killed, stream 9
Jul 18 09:12:47 fedorahost kernel: [151869.169124] sd 13:0:0:0: [sdd] uas_data_cmplt ffff88042731f480 tag 7, inflight: CMD abort
Jul 18 09:12:47 fedorahost kernel: [151869.169126] sd 13:0:0:0: [sdd] data cmplt err -2 stream 9
Jul 18 09:12:47 fedorahost kernel: [151869.169134] sd 13:0:0:0: [sdd] uas_zap_dead ffff88042731f480 tag 7, inflight: CMD abort
Jul 18 09:12:47 fedorahost kernel: [151869.169135] sd 13:0:0:0: [sdd] abort completed
Jul 18 09:12:47 fedorahost kernel: scsi host13: uas_eh_task_mgmt: ABORT TASK timed out
Jul 18 09:12:47 fedorahost kernel: sd 13:0:0:0: uas_eh_device_reset_handler
Jul 18 09:12:47 fedorahost kernel: scsi host13: uas_eh_task_mgmt: LOGICAL UNIT RESET: error already running a task
Jul 18 09:12:47 fedorahost kernel: scsi host13: uas_eh_bus_reset_handler start
Jul 18 09:12:47 fedorahost kernel: usb 10-2: stat urb: killed, stream 9
Jul 18 09:12:47 fedorahost kernel: sd 13:0:0:0: [sdd] uas_data_cmplt ffff88042731f480 tag 7, inflight: CMD abort
Jul 18 09:12:47 fedorahost kernel: sd 13:0:0:0: [sdd] data cmplt err -2 stream 9
Jul 18 09:12:47 fedorahost kernel: sd 13:0:0:0: [sdd] uas_zap_dead ffff88042731f480 tag 7, inflight: CMD abort
Jul 18 09:12:47 fedorahost kernel: sd 13:0:0:0: [sdd] abort completed
Jul 18 09:12:47 fedorahost kernel: [151869.273177] usb 10-2: reset SuperSpeed USB device number 3 using xhci_hcd
Jul 18 09:12:47 fedorahost kernel: usb 10-2: reset SuperSpeed USB device number 3 using xhci_hcd
Jul 18 09:12:47 fedorahost kernel: [151869.287418] xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef5400
Jul 18 09:12:47 fedorahost kernel: [151869.287423] xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef5448
Jul 18 09:12:47 fedorahost kernel: [151869.287426] xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef5490
Jul 18 09:12:47 fedorahost kernel: [151869.287429] xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef54d8
Jul 18 09:12:47 fedorahost kernel: xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef5400
Jul 18 09:12:47 fedorahost kernel: xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef5448
Jul 18 09:12:47 fedorahost kernel: xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef5490
Jul 18 09:12:47 fedorahost kernel: xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff880078ef54d8
Jul 18 09:12:47 fedorahost kernel: [151869.289670] scsi host13: uas_eh_bus_reset_handler success
Jul 18 09:12:47 fedorahost kernel: scsi host13: uas_eh_bus_reset_handler success
Jul 18 09:13:44 fedorahost avahi-daemon[756]: server.c: Packet too short or invalid while reading known answer record. (Maybe a UTF-8 problem?)
Jul 18 09:15:07 fedorahost avahi-daemon[756]: server.c: Packet too short or invalid while reading known answer record. (Maybe a UTF-8 problem?)
Jul 18 09:16:39 fedorahost kernel: [152100.919145] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731f480 tag 7, inflight: CMD OUT
Jul 18 09:16:39 fedorahost kernel: sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731f480 tag 7, inflight: CMD OUT
Jul 18 09:16:42 fedorahost kernel: [152103.916568] scsi host13: uas_eh_task_mgmt: ABORT TASK timed out
Jul 18 09:16:42 fedorahost kernel: [152103.916576] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731d680 tag 8, inflight: CMD OUT
Jul 18 09:16:42 fedorahost kernel: [152103.916578] scsi host13: uas_eh_task_mgmt: ABORT TASK: error already running a task
Jul 18 09:16:42 fedorahost kernel: [152103.916580] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731f600 tag 9, inflight: CMD OUT
Jul 18 09:16:42 fedorahost kernel: [152103.916581] scsi host13: uas_eh_task_mgmt: ABORT TASK: error already running a task
Jul 18 09:16:42 fedorahost kernel: [152103.916584] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731e280 tag 10, inflight: CMD OUT
Jul 18 09:16:42 fedorahost kernel: [152103.916585] scsi host13: uas_eh_task_mgmt: ABORT TASK: error already running a task
Jul 18 09:16:42 fedorahost kernel: [152103.916588] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731e880 tag 11, inflight: CMD OUT
Jul 18 09:16:42 fedorahost kernel: [152103.916589] scsi host13: uas_eh_task_mgmt: ABORT TASK: error already running a task
Jul 18 09:16:42 fedorahost kernel: [152103.916591] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731d200 tag 0, inflight: CMD OUT
Jul 18 09:16:42 fedorahost kernel: [152103.916592] scsi host13: uas_eh_task_mgmt: ABORT TASK: error already running a task
Jul 18 09:16:42 fedorahost kernel: [152103.916642] sd 13:0:0:0: uas_eh_device_reset_handler
Jul 18 09:16:42 fedorahost kernel: [152103.916645] scsi host13: uas_eh_task_mgmt: LOGICAL UNIT RESET: error already running a task
Jul 18 09:16:42 fedorahost kernel: [152103.916648] scsi host13: uas_eh_bus_reset_handler start
Jul 18 09:16:42 fedorahost kernel: [152103.916807] usb 10-2: stat urb: killed, stream 2
Jul 18 09:16:42 fedorahost kernel: [152103.916927] usb 10-2: stat urb: killed, stream 13
Jul 18 09:16:42 fedorahost kernel: [152103.916968] usb 10-2: stat urb: killed, stream 12
Jul 18 09:16:42 fedorahost kernel: [152103.917009] usb 10-2: stat urb: killed, stream 11
Jul 18 09:16:42 fedorahost kernel: [152103.917050] usb 10-2: stat urb: killed, stream 10
Jul 18 09:16:42 fedorahost kernel: [152103.917160] usb 10-2: stat urb: killed, stream 9
Jul 18 09:16:42 fedorahost kernel: [152103.917308] sd 13:0:0:0: [sdd] uas_data_cmplt ffff88042731d200 tag 0, inflight: CMD abort
Jul 18 09:16:42 fedorahost kernel: [152103.917309] sd 13:0:0:0: [sdd] data cmplt err -2 stream 2

Comment 1 Hans de Goede 2014-07-18 21:53:18 UTC
Thanks for the bug report. This problem is likely specific to the xhci chipset used on your motherboard. Can you please provide the output of "lspci -nn"?

Comment 2 fedora 2014-07-18 22:12:21 UTC
00:00.0 Host bridge [0600]: Intel Corporation 5520/5500/X58 I/O Hub to ESI Port [8086:3405] (rev 13)
00:01.0 PCI bridge [0604]: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 [8086:3408] (rev 13)
00:02.0 PCI bridge [0604]: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 2 [8086:3409] (rev 13)
00:03.0 PCI bridge [0604]: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 [8086:340a] (rev 13)
00:07.0 PCI bridge [0604]: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 [8086:340e] (rev 13)
00:14.0 PIC [0800]: Intel Corporation 7500/5520/5500/X58 I/O Hub System Management Registers [8086:342e] (rev 13)
00:14.1 PIC [0800]: Intel Corporation 7500/5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers [8086:3422] (rev 13)
00:14.2 PIC [0800]: Intel Corporation 7500/5520/5500/X58 I/O Hub Control Status and RAS Registers [8086:3423] (rev 13)
00:14.3 PIC [0800]: Intel Corporation 7500/5520/5500/X58 I/O Hub Throttle Registers [8086:3438] (rev 13)
00:1a.0 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4 [8086:3a37]
00:1a.1 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5 [8086:3a38]
00:1a.2 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6 [8086:3a39]
00:1a.7 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2 [8086:3a3c]
00:1b.0 Audio device [0403]: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller [8086:3a3e]
00:1c.0 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1 [8086:3a40]
00:1c.2 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 3 [8086:3a44]
00:1d.0 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1 [8086:3a34]
00:1d.1 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2 [8086:3a35]
00:1d.2 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3 [8086:3a36]
00:1d.7 USB controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1 [8086:3a3a]
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev 90)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller [8086:3a16]
00:1f.2 IDE interface [0101]: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller #1 [8086:3a20]
00:1f.3 SMBus [0c05]: Intel Corporation 82801JI (ICH10 Family) SMBus Controller [8086:3a30]
00:1f.5 IDE interface [0101]: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller #2 [8086:3a26]
01:00.0 IDE interface [0101]: Marvell Technology Group Ltd. Device [1b4b:91a3] (rev 11)
02:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev 03)
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF116 [GeForce GTX 550 Ti] [10de:1244] (rev a1)
03:00.1 Audio device [0403]: NVIDIA Corporation GF116 High Definition Audio Controller [10de:0bee] (rev a1)
05:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller [11ab:4364] (rev 12)
07:02.0 FireWire (IEEE 1394) [0c00]: VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller [1106:3044] (rev c0)
ff:00.0 Host bridge [0600]: Intel Corporation Device [8086:2c71] (rev 02)
ff:00.1 Host bridge [0600]: Intel Corporation Xeon 5600 Series QuickPath Architecture System Address Decoder [8086:2d81] (rev 02)
ff:02.0 Host bridge [0600]: Intel Corporation Xeon 5600 Series QPI Link 0 [8086:2d90] (rev 02)
ff:02.1 Host bridge [0600]: Intel Corporation Xeon 5600 Series QPI Physical 0 [8086:2d91] (rev 02)
ff:02.2 Host bridge [0600]: Intel Corporation Xeon 5600 Series Mirror Port Link 0 [8086:2d92] (rev 02)
ff:02.3 Host bridge [0600]: Intel Corporation Xeon 5600 Series Mirror Port Link 1 [8086:2d93] (rev 02)
ff:03.0 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Registers [8086:2d98] (rev 02)
ff:03.1 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Target Address Decoder [8086:2d99] (rev 02)
ff:03.4 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Test Registers [8086:2d9c] (rev 02)
ff:04.0 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Control [8086:2da0] (rev 02)
ff:04.1 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Address [8086:2da1] (rev 02)
ff:04.2 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Rank [8086:2da2] (rev 02)
ff:04.3 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Thermal Control [8086:2da3] (rev 02)
ff:05.0 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Control [8086:2da8] (rev 02)
ff:05.1 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Address [8086:2da9] (rev 02)
ff:05.2 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Rank [8086:2daa] (rev 02)
ff:05.3 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Thermal Control [8086:2dab] (rev 02)
ff:06.0 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Control [8086:2db0] (rev 02)
ff:06.1 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Address [8086:2db1] (rev 02)
ff:06.2 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Rank [8086:2db2] (rev 02)
ff:06.3 Host bridge [0600]: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Thermal Control [8086:2db3] (rev 02)

Comment 3 fedora 2014-07-18 22:50:55 UTC
Is there a flag I can set to disable UAS while a patch is worked on (assuming UAS is the problem)?

I quite literally work off my USB 3 portable drive, and USB 2 speeds (USB 2 still works fine) are simply too slow to do so effectively. I am currently booting into the 3.14 kernel, which also works but causes other fairly serious inconveniences.

Comment 4 Hans de Goede 2014-07-19 08:39:05 UTC
Thanks for the lspci output. It seems that you have the same NEC xhci controller as I use, only yours is rev 3 and mine is rev 4. On second thought, I don't think that is the cause, though, as I see no xhci-related error messages in the log you've posted.

Looking closer at the logs, it seems that some scsi command is timing out, then the scsi layer tries an abort, which times out too, and then things get resurrected by a usb device reset, and everything works again for a while (approx. 4 minutes in the log above). I guess that with the broken kernel things do work, but you experience 33-second freezes every couple of minutes?
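The timing in that reading can be checked against the kernel timestamps in the log from the description. The following is a minimal sketch, not a diagnostic tool: it uses four lines copied from that log and only does the subtraction. The 33-second figure would be consistent with a 30-second SCSI command timeout plus the roughly 3-second abort wait, but the command timeout itself is not visible in the log and is an assumption here.

```python
import re

# Kernel timestamp in square brackets followed by the message text.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")

# Four lines copied from the log in the description.
LOG = """\
Jul 18 09:12:44 fedorahost kernel: [151866.171361] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731f480 tag 7, inflight: CMD OUT
Jul 18 09:12:47 fedorahost kernel: [151869.168824] scsi host13: uas_eh_task_mgmt: ABORT TASK timed out
Jul 18 09:12:47 fedorahost kernel: [151869.289670] scsi host13: uas_eh_bus_reset_handler success
Jul 18 09:16:39 fedorahost kernel: [152100.919145] sd 13:0:0:0: [sdd] uas_eh_abort_handler ffff88042731f480 tag 7, inflight: CMD OUT
"""


def events(log: str):
    """Yield (seconds_since_boot, message) for each timestamped line."""
    for line in log.splitlines():
        match = STAMP.search(line)
        if match:
            yield float(match.group(1)), match.group(2)


stamps = [t for t, _ in events(LOG)]
abort_start, abort_timeout, reset_done, next_abort = stamps

abort_wait = abort_timeout - abort_start   # waiting for ABORT TASK to time out
recovery = reset_done - abort_start        # abort attempt through bus reset
quiet_period = next_abort - reset_done     # working time before the next stall

print(f"abort wait:   {abort_wait:.3f} s")
print(f"recovery:     {recovery:.3f} s")
print(f"quiet period: {quiet_period:.1f} s ({quiet_period / 60:.1f} min)")
```

The abort wait comes out at about 3 seconds and the quiet period at a little under 4 minutes, which matches the "approx. 4 minutes" estimate.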

This seems like the disk and/or the dock simply stops responding until reset. I realize this is probably a lot of work, but can you try with a different disk? This might just be a bust disk (uas will use tcq and thus hit the disk a lot harder; it should also make things much faster).

(In reply to fedora from comment #3)
> Is there a flag I can set to disable UAS while a patch is worked on
> (assuming UAS is the problem)?

Yes, try adding:

usb-storage.quirks=174c:55aa:u 

to the kernel command line. This assumes that your USB3-SATA-UASP-1 device has the same USB IDs as mine; if not, adjust accordingly.
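For a device with different IDs, the option can be derived from the "ID vvvv:pppp" field that lsusb prints. This is a small sketch under that assumption; the function name and the sample lsusb line are hypothetical, and only the option format (vendor:product:u, where the u flag tells usb-storage to ignore UAS) comes from the suggestion above.

```python
import re

# Matches the "ID vvvv:pppp" field that lsusb prints for each device.
LSUSB_ID = re.compile(r"\bID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\b")


def uas_quirk_param(lsusb_line: str) -> str:
    """Return a kernel command-line option that makes usb-storage ignore UAS
    for the device described by one line of lsusb output."""
    match = LSUSB_ID.search(lsusb_line)
    if match is None:
        raise ValueError(f"no USB ID found in: {lsusb_line!r}")
    vendor, product = (part.lower() for part in match.groups())
    # 'u' is the usb-storage quirk flag for IGNORE_UAS.
    return f"usb-storage.quirks={vendor}:{product}:u"


# Hypothetical lsusb line for an adapter with the IDs quoted above.
sample = "Bus 010 Device 003: ID 174c:55aa ASMedia Technology Inc."
print(uas_quirk_param(sample))
```

The printed option is then appended to the kernel command line in the boot loader configuration.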

Comment 5 fedora 2014-07-19 12:42:00 UTC
Your description of the behavior of the device on that motherboard (ATI) is accurate to the letter. I own several disks and several docks, and on that motherboard, I had the opportunity to swap out docks and cables (not disks) and found the behavior to be identical.

On my Gigabyte motherboard, the behavior is different. Plugging in either dock with any disk causes the OS to freeze completely. No mouse movement on the KDE desktop, no response to keyboard hotkeys to swap over to a console, and no apparent entries in the messages log. It's a complete system hang.

If I boot with the USB 3 device in place on the Gigabyte motherboard, it continually cycles through errors and never boots. 

On both motherboards, with either dock and either disk, an exception appears immediately after the grub menu disappears, though I have no idea how to capture this error, short of photographing it. Perhaps I shall do just that and upload the photo.

Here is the lspci -nn output for the Gigabyte motherboard that hangs:
00:00.0 Host bridge [0600]: Intel Corporation 2nd Generation Core Processor Family DRAM Controller [8086:0100] (rev 09)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port [8086:0101] (rev 09)
00:16.0 Communication controller [0780]: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 [8086:1c3a] (rev 04)
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 05)
00:1b.0 Audio device [0403]: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller [8086:1c20] (rev 05)
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 [8086:1c10] (rev b5)
00:1c.5 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 6 [8086:1c1a] (rev b5)
00:1c.6 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 [8086:1c1c] (rev b5)
00:1c.7 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 8 [8086:1c1e] (rev b5)
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 05)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev a5)
00:1f.0 ISA bridge [0601]: Intel Corporation Z68 Express Chipset Family LPC Controller [8086:1c44] (rev 05)
00:1f.2 IDE interface [0101]: Intel Corporation 6 Series/C200 Series Chipset Family 4 port SATA IDE Controller [8086:1c00] (rev 05)
00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 05)
00:1f.5 IDE interface [0101]: Intel Corporation 6 Series/C200 Series Chipset Family 2 port SATA IDE Controller [8086:1c08] (rev 05)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK106 [GeForce GTX 660] [10de:11c0] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GK106 HDMI Audio Controller [10de:0e0b] (rev a1)
03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 06)
04:00.0 PCI bridge [0604]: Integrated Technology Express, Inc. Device [1283:8892] (rev 10)
06:00.0 USB controller [0c03]: Etron Technology, Inc. EJ168 USB 3.0 Host Controller [1b6f:7023] (rev 01)

Comment 7 Hans de Goede 2014-07-19 12:52:56 UTC
To be clear, you only get the exception on the Gigabyte motherboard, not on the ones where you get the 33-second pauses, right?

The issue on the gigabyte motherboard sounds like it is the same one as this one:

https://bugzilla.kernel.org/show_bug.cgi?id=80101

Which basically boils down to uas + etron_xhci + asmedia_usb_sata_bridge = broken. I've ordered my own Etron PCI-e add-on card, and it literally arrived about an hour ago. I'll try to look into this sometime during the coming week.
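To check whether a machine has the Etron controller involved here, the "lspci -nn" output can be filtered for USB controllers and compared against the ID from comment 5. This is an illustrative sketch only: the function names are hypothetical, and the set of affected IDs contains just the one controller seen in this report, so it should not be read as a complete list.

```python
import re

# lspci -nn line: "<slot> <class name> [<class id>]: <description> [<vendor>:<device>]"
LSPCI = re.compile(
    r"^(?P<slot>\S+) (?P<cls>.+?) \[(?P<cls_id>[0-9a-f]{4})\]: "
    r"(?P<desc>.+) \[(?P<vid>[0-9a-f]{4}):(?P<did>[0-9a-f]{4})\]"
)

# Controller ID reported in this bug; illustrative, not exhaustive.
STREAMS_BROKEN = {("1b6f", "7023")}  # Etron EJ168

# Three lines copied from the lspci -nn output in comment 5.
SAMPLE = """\
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 05)
03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 06)
06:00.0 USB controller [0c03]: Etron Technology, Inc. EJ168 USB 3.0 Host Controller [1b6f:7023] (rev 01)
"""


def usb_controllers(lspci_output: str):
    """Return (slot, vendor_id, device_id, description) for each USB controller."""
    found = []
    for line in lspci_output.splitlines():
        match = LSPCI.match(line.strip())
        if match and match["cls_id"] == "0c03":  # PCI class 0c03 = USB controller
            found.append((match["slot"], match["vid"], match["did"], match["desc"]))
    return found


def affected_controllers(lspci_output: str):
    """Return the USB controllers whose IDs are in STREAMS_BROKEN."""
    return [c for c in usb_controllers(lspci_output) if (c[1], c[2]) in STREAMS_BROKEN]


for slot, vid, did, desc in affected_controllers(SAMPLE):
    print(f"{slot}: {desc} [{vid}:{did}]")
```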

A picture of the exception would be welcome extra info.

Comment 8 fedora 2014-07-19 13:09:03 UTC
No, I get exceptions on both motherboards after the grub menu disappears. The exception on the gigabyte motherboard is fatal, the ATI recovers and boots. I'm not certain, but I believe the exception occurs whether or not there is a disk in the dock. So long as the dock is plugged into the USB 3 port, an exception will appear after grub menu disappears.

Comment 9 Hans de Goede 2014-07-19 13:15:38 UTC
Hmm, on the one which does boot you should be able to see the exception in dmesg. Can you run:
dmesg > log
directly after boot and then attach the generated log file here?

Comment 10 fedora 2014-07-19 13:29:37 UTC
Unfortunately, I won't have access to that machine until Monday morning in California (UTC-7). I'll be sure and post the message then. 

Your USB quirks kernel flag worked like a charm. Device operating normally now on old driver.

I took several photos of the stack traces/dumps from this machine on a failed boot with USB3 attached. I will upload them immediately.

Comment 11 fedora 2014-07-19 13:31:21 UTC
Created attachment 919276 [details]
boot exceptions on Gigabyte (Eltron) motherboard

Comment 12 fedora 2014-07-19 13:31:44 UTC
Created attachment 919277 [details]
boot exceptions on Gigabyte (Eltron) motherboard

Comment 13 fedora 2014-07-19 13:32:05 UTC
Created attachment 919278 [details]
boot exceptions on Gigabyte (Eltron) motherboard

Comment 14 fedora 2014-07-19 13:32:23 UTC
Created attachment 919279 [details]
boot exceptions on Gigabyte (Eltron) motherboard

Comment 15 fedora 2014-07-19 13:32:47 UTC
Created attachment 919280 [details]
boot exceptions on Gigabyte (Eltron) motherboard

Comment 16 Hans de Goede 2014-07-28 16:02:25 UTC
Thanks for the bug report.

My own Etron PCI-e add-on card arrived last week, and I've spent 3 full days debugging its buggy bulk streams implementation. In the end, its bulk streams support is simply too buggy.

So I've written a patch blacklisting streams on this controller. This should make things automatically fall back to usb-storage on this controller.

I've added the patch for this to the official Fedora 20 kernel packages, so it should get picked up by the next F-20 kernel build. When that happens please test without the usb_storage quirk on the kernel command-line, and let me know if things still work on the Etron controller.
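Before retesting, it may help to confirm that the workaround is really gone from the running kernel's command line. The sketch below is one way to do that, assuming the command line is available as a string (on a live system it can be read from /proc/cmdline). The function name and the sample command line are hypothetical. It accepts both spellings of the parameter name because the kernel treats dashes and underscores in parameter names as equivalent.

```python
def uas_quirk_entries(cmdline: str):
    """Return the usb-storage.quirks entries on a kernel command line
    that carry the 'u' (ignore UAS) flag."""
    entries = []
    for option in cmdline.split():
        name, sep, value = option.partition("=")
        # The kernel treats '-' and '_' in parameter names as equivalent.
        if sep and name.replace("_", "-") == "usb-storage.quirks":
            # Entries are comma-separated, each of the form VID:PID:Flags.
            for entry in value.split(","):
                parts = entry.split(":")
                if len(parts) == 3 and "u" in parts[2]:
                    entries.append(entry)
    return entries


# Hypothetical command line with the workaround from comment 4 still present.
sample_cmdline = (
    "BOOT_IMAGE=/vmlinuz-3.15.5-200.fc20.x86_64 ro rhgb quiet "
    "usb-storage.quirks=174c:55aa:u"
)
print(uas_quirk_entries(sample_cmdline))
```

An empty list means no UAS-disabling quirk is active, so the test exercises the patched kernel's own fallback.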

As for the problem on the non-Etron controller, it would be great if you could test this with another disk. If you have more info on that issue, please open a new bug. Let's use this bug only for tracking the Etron issue.

Comment 17 Fedora Update System 2014-07-29 03:56:13 UTC
kernel-3.15.7-200.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.15.7-200.fc20

Comment 18 Fedora Update System 2014-07-30 07:03:37 UTC
Package kernel-3.15.7-200.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.15.7-200.fc20'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-9010/kernel-3.15.7-200.fc20
then log in and leave karma (feedback).

Comment 19 Fedora Update System 2014-08-01 06:02:02 UTC
kernel-3.15.7-200.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 20 Hin-Tak Leung 2014-08-04 22:01:15 UTC
I have a uas-related crash with 3.15.7-200.fc20.x86_64. Due to another problem (https://bugzilla.redhat.com/show_bug.cgi?id=1094983), I skipped most of 3.14.x and went from 3.13.10-200.fc20.x86_64 to 3.15.7-200.fc20.x86_64 for regular use. Kernel 3.15.7-200.fc20.x86_64 crashes on insertion of a drive which works okay under 3.13.10-200.fc20.x86_64.

The top part of the crash is:

Aug  4 00:57:22 localhost kernel: [102253.840388] ------------[ cut here ]------------
Aug  4 00:57:22 localhost kernel: [102253.840396] WARNING: CPU: 1 PID: 28236 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
Aug  4 00:57:22 localhost kernel: [102253.840399] list_del corruption, ffff880012e25938->next is LIST_POISON1 (dead000000100100)
...
Aug  4 00:57:22 localhost kernel: [102253.840474] CPU: 1 PID: 28236 Comm: kworker/u4:1 Not tainted 3.15.7-200.fc20.x86_64 #1
...
Aug  4 00:57:22 localhost kernel: [102253.840512]  [<ffffffff81366a33>] __list_del_entry+0x63/0xd0
Aug  4 00:57:22 localhost kernel: [102253.840517]  [<ffffffffa06db62c>] uas_mark_cmd_dead+0x5c/0xd0 [uas]
Aug  4 00:57:22 localhost kernel: [102253.840521]  [<ffffffffa06ddf0e>] uas_eh_abort_handler+0x9e/0x104 [uas]
Aug  4 00:57:22 localhost kernel: [102253.840525]  [<ffffffff81483e1f>] scmd_eh_abort_handler+0xbf/0x480
...

I'll attach the full crash log next.
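When reading a trace like the excerpt above, a quick way to see which loadable module is involved is to pull out the frames that carry a module tag in square brackets. This is a rough sketch using the four frames quoted above; the helper names are hypothetical and the pattern only covers the frame format that appears in this excerpt.

```python
import re

# One stack frame: "[<address>] symbol+0xoffset/0xsize [module]" (module is optional).
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\] (?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)"
    r"/0x(?P<size>[0-9a-f]+)(?: \[(?P<module>\w+)\])?"
)

# The four frames quoted in this comment.
TRACE = """\
Aug  4 00:57:22 localhost kernel: [102253.840512]  [<ffffffff81366a33>] __list_del_entry+0x63/0xd0
Aug  4 00:57:22 localhost kernel: [102253.840517]  [<ffffffffa06db62c>] uas_mark_cmd_dead+0x5c/0xd0 [uas]
Aug  4 00:57:22 localhost kernel: [102253.840521]  [<ffffffffa06ddf0e>] uas_eh_abort_handler+0x9e/0x104 [uas]
Aug  4 00:57:22 localhost kernel: [102253.840525]  [<ffffffff81483e1f>] scmd_eh_abort_handler+0xbf/0x480
"""


def frames(trace: str):
    """Return (symbol, module) pairs in trace order; module is None when
    the frame has no module tag."""
    return [(m["symbol"], m["module"]) for m in FRAME.finditer(trace)]


def first_module_frame(trace: str):
    """Return the innermost frame that belongs to a loadable module, or None."""
    for symbol, module in frames(trace):
        if module is not None:
            return symbol, module
    return None


print(first_module_frame(TRACE))
```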

The external usb drive is: 
Bus 001 Device 004: ID 0bc2:2312 Seagate RSS LLC


$ lspci -nn | grep USB
00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB600 USB (OHCI0) [1002:4387]
00:13.1 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB600 USB (OHCI1) [1002:4388]
00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB600 USB (OHCI2) [1002:4389]
00:13.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB600 USB (OHCI3) [1002:438a]
00:13.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB600 USB (OHCI4) [1002:438b]
00:13.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB600 USB Controller (EHCI) [1002:4386]

Comment 21 Hin-Tak Leung 2014-08-04 22:15:11 UTC
Created attachment 924014 [details]
sections of /var/log/messages showing the kernel crash stack trace.

Comment 22 Hin-Tak Leung 2014-08-04 22:15:49 UTC
Should I file a new bug, or should somebody re-open this?

Comment 23 fedora 2014-08-15 19:27:02 UTC
Sorry for the delay on this. For the one that experiences the 30 second timeouts reading USB, I finally got the dmesg of the exception occurring after grub selects the os to boot. Here is the relevant portion.

[    0.925452] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.925591] ata8.00: ATAPI: MARVELL VIRTUALL, 1.09, max UDMA/66
[    0.925745] ata8.00: configured for UDMA/66
[    0.926494] ata1: SATA link down (SStatus 0 SControl 300)
[    0.928844] scsi 7:0:0:0: Processor         Marvell  91xx Config      1.01 PQ: 0 ANSI: 5
[    0.937301] ata12: SATA link down (SStatus 0 SControl 300)
[    0.948017] ata11: SATA link down (SStatus 0 SControl 300)
[    0.949452] ata8.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x6
[    0.949501] ata8.00: irq_stat 0x40000001
[    0.949544] scsi 7:0:0:0: CDB:
[    0.949546] Inquiry: 12 01 00 00 ff 00
[    0.949556] ata8.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in
         res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation)
[    0.949618] ata8: hard resetting link
[    1.171375] usb 5-1: new low-speed USB device number 2 using uhci_hcd
[    1.256199] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.256551] ata8.00: configured for UDMA/66
[    1.256678] ata8: EH complete

Comment 24 Hin-Tak Leung 2014-08-17 06:07:19 UTC
My issue was filed separately as bug 1128472 and fixed as of kernel-3.15.10-200.fc20/3.16.1-300.fc21.x86_64.

Comment 25 Hans de Goede 2014-08-23 12:49:10 UTC
(In reply to fedora from comment #23)
> Sorry for the delay on this. For the one that experiences the 30 second
> timeouts reading USB, I finally got the dmesg of the exception occurring
> after grub selects the os to boot. Here is the relevant portion.
> 
> [    0.925452] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.925591] ata8.00: ATAPI: MARVELL VIRTUALL, 1.09, max UDMA/66
> [    0.925745] ata8.00: configured for UDMA/66
> [    0.926494] ata1: SATA link down (SStatus 0 SControl 300)
> [    0.928844] scsi 7:0:0:0: Processor         Marvell  91xx Config     
> 1.01 PQ: 0 ANSI: 5
> [    0.937301] ata12: SATA link down (SStatus 0 SControl 300)
> [    0.948017] ata11: SATA link down (SStatus 0 SControl 300)
> [    0.949452] ata8.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x6
> [    0.949501] ata8.00: irq_stat 0x40000001
> [    0.949544] scsi 7:0:0:0: CDB:
> [    0.949546] Inquiry: 12 01 00 00 ff 00
> [    0.949556] ata8.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma
> 16640 in
>          res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation)
> [    0.949618] ata8: hard resetting link
> [    1.171375] usb 5-1: new low-speed USB device number 2 using uhci_hcd
> [    1.256199] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    1.256551] ata8.00: configured for UDMA/66
> [    1.256678] ata8: EH complete

That seems to be unrelated to uas / usb in general. Can you please file a new bug for tracking this?

