Description of problem:
Got Orico USB3 -> SATA Enclosure (holds two SATA drives) that behaves badly if UAS isn't disabled in kernel 4.4.x. In kernel 4.3.x no problem was experienced.

Version-Release number of selected component (if applicable):
kernel 4.4.3-300.fc23.x86_64 (also 4.4.2). No problems in 4.3.5-300.fc23.x86_64 and previous.

How reproducible:
Try to write to both drives at the same time.

Actual results:
Mar 02 11:22:08 larry kernel: md123: Warning: Device sdc8 is misaligned
Mar 02 11:22:08 larry kernel: md123: Warning: Device sdb8 is misaligned
Mar 04 16:24:23 larry kernel: sd 2:0:0:0: [sdb] tag#3 CDB: Write(10) 2a 00 e8 0a ab f9 00 04 00 00
Mar 04 16:24:23 larry kernel: sd 2:0:0:0: [sdb] tag#2 uas_eh_abort_handler 0 uas-tag 3 inflight: CMD OUT
Mar 04 16:24:23 larry kernel: sd 2:0:0:0: [sdb] tag#2 CDB: Write(10) 2a 00 e8 0a a7 f9 00 04 00 00
Mar 04 16:24:23 larry kernel: sd 2:0:0:0: [sdb] tag#1 uas_eh_abort_handler 0 uas-tag 2 inflight: CMD OUT
Mar 04 16:24:23 larry kernel: sd 2:0:0:0: [sdb] tag#1 CDB: Write(10) 2a 00 e8 0a a3 f9 00 04 00 00
Mar 04 16:24:23 larry kernel: scsi host2: uas_eh_bus_reset_handler start
Mar 04 16:24:28 larry kernel: usb 2-1.4: Disable of device-initiated U1 failed.
Mar 04 16:25:20 larry kernel: md/raid1:md0: Disk failure on sdb3, disabling device.
                              md/raid1:md0: Operation continuing on 1 devices.

Expected results:
Normal working drive as experienced in kernel 4.3.5 and previous.

Additional info:
# lsusb -v -d 152d:9561

Bus 002 Device 004: ID 152d:9561 JMicron Technology Corp. / JMicron USA Technology Corp.
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass            0
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         9
  idVendor           0x152d JMicron Technology Corp. / JMicron USA Technology Corp.
  idProduct          0x9561
  bcdDevice            1.05
  iManufacturer           1 JMicron
  iProduct                2 JMS56x Series
  iSerial                 5 00000000000000000000
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength          121
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          4 USB Mass Storage
    bmAttributes         0xc0
      Self Powered
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              6 MSC Bulk-Only Transfer
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       1
      bNumEndpoints           4
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     98
      iInterface             10 MSC BOT/UAS Transfer
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               0
        Command pipe (0x01)
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               0
        MaxStreams             32
        Status pipe (0x02)
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
        MaxStreams             32
        Data-in pipe (0x03)
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x04  EP 4 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
        MaxStreams             32
        Data-out pipe (0x04)
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength           22
  bNumDeviceCaps          2
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000002
      HIRD Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   1
      Lowest fully-functional device speed is Full Speed (12Mbps)
    bU1DevExitLat          10 micro seconds
    bU2DevExitLat          32 micro seconds
can't get debug descriptor: Resource temporarily unavailable
Device Status:     0x000d
  Self Powered
  U1 Enabled
  U2 Enabled

Solution:
========
Solution was found by copying solution from bug 1260207, where:

/etc/modprobe.d/usb-storage.conf

was created with the following in it:

options usb-storage quirks=152d:9561:u
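
For readers applying the same workaround to a different enclosure, here is a minimal Python sketch that builds the modprobe.d line from the idVendor and idProduct values reported by lsusb. The helper name `usb_storage_quirk_line` is hypothetical; the `VID:PID:flags` layout is taken from the line above, and the `u` flag tells usb-storage to keep the device away from the uas driver.

```python
def usb_storage_quirk_line(vendor_id: int, product_id: int, flags: str = "u") -> str:
    """Build an 'options usb-storage quirks=...' line (hypothetical helper)."""
    for value in (vendor_id, product_id):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("USB vendor and product IDs are 16-bit values")
    if not flags.isalpha():
        raise ValueError("quirk flags are expected to be letters, e.g. 'u'")
    # IDs are written as four lowercase hex digits, matching the lsusb output.
    return f"options usb-storage quirks={vendor_id:04x}:{product_id:04x}:{flags}"


if __name__ == "__main__":
    # idVendor 0x152d, idProduct 0x9561 from the descriptor in this report.
    print(usb_storage_quirk_line(0x152D, 0x9561))
    # prints: options usb-storage quirks=152d:9561:u
```

The resulting line would go into a file such as /etc/modprobe.d/usb-storage.conf, as described in the report.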
If this worked fine in 4.3, but doesn't work in 4.4, I'm not entirely sure it's the same as the bug you've pointed to. Hans, do you know of any changes in 4.4 that would cause this issue?
Hi,

(In reply to Josh Boyer from comment #1)
> If this worked fine in 4.3, but doesn't work in 4.4, I'm not entirely sure
> it's the same as the bug you've pointed to.
>
> Hans, do you know of any changes in 4.4 that would cause this issue?

I do not know about any uas changes explaining this, I guess there may have been some xhci driver or usb-hub driver changes in 4.4 which trigger this.

funnybutton, it is probably best if you send a mail about this to the linux-usb list:
http://vger.kernel.org/vger-lists.html#linux-usb

Note you do not need to be subscribed to send mails to this list, if you Cc yourself on the original mail you should get all replies. Please also add me to the Cc.

Regards,

Hans
(In reply to Hans de Goede from comment #2)
> Hi,
>
> (In reply to Josh Boyer from comment #1)
> > If this worked fine in 4.3, but doesn't work in 4.4, I'm not entirely sure
> > it's the same as the bug you've pointed to.
> >
> > Hans, do you know of any changes in 4.4 that would cause this issue?
>
> I do not know about any uas changes explaining this, I guess there may have
> been some xhci driver or usb-hub driver changes in 4.4 which trigger this.
>
> funnybutton, it is probably best if you send a mail about this to the
> linux-usb list:
> http://vger.kernel.org/vger-lists.html#linux-usb
>
> Note you do not need to be subscribed to send mails to this list, if you Cc
> yourself on the original mail you should get all replies. Please also add me
> to the Cc.
>
> Regards,
>
> Hans

Thanks for the replies. I have been looking at the mailing lists mentioned before posting there. I have noticed this:

https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/9fa62b1a31c96715aef34f25000e882ed4ac4876%5E!/#F0

specifically:

US_FL_BROKEN_FUA flag now being set for my device.

Could this be the cause?

I am looking into how to remove this patch (I am a newb) and see what happens.
(In reply to funnybutton from comment #3)
> (In reply to Hans de Goede from comment #2)
> > Hi,
> >
> > (In reply to Josh Boyer from comment #1)
> > > If this worked fine in 4.3, but doesn't work in 4.4, I'm not entirely sure
> > > it's the same as the bug you've pointed to.
> > >
> > > Hans, do you know of any changes in 4.4 that would cause this issue?
> >
> > I do not know about any uas changes explaining this, I guess there may have
> > been some xhci driver or usb-hub driver changes in 4.4 which trigger this.
> >
> > funnybutton, it is probably best if you send a mail about this to the
> > linux-usb list:
> > http://vger.kernel.org/vger-lists.html#linux-usb
> >
> > Note you do not need to be subscribed to send mails to this list, if you Cc
> > yourself on the original mail you should get all replies. Please also add me
> > to the Cc.
> >
> > Regards,
> >
> > Hans
>
> Thanks for the replies. I have been looking at the mailing lists mentioned
> before posting there. I have noticed this:
>
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/
> +/9fa62b1a31c96715aef34f25000e882ed4ac4876%5E!/#F0
>
> specifically:
>
> US_FL_BROKEN_FUA flag now being set for my device.
>
> Could this be the cause?
>
> I am looking into how to remove this patch (I am a newb) and see what
> happens.

This patch only applies to 152d:0567, whereas you have a 152d:9561 enclosure, so this patch does not affect you. If anything, the problem might be that you need US_FL_BROKEN_FUA too. Unfortunately this flag cannot be set via quirks, so you need to rebuild your kernel to test this.

Also, can you please attach the full dmesg from the problem occurring? It feels as if you're leaving out quite a few bits from dmesg. E.g. does the scsi layer say something like:

> Jun 26 20:47:14 wiggum kernel: [156019.870956] sd 22:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA

when the driver initializes the drive?
Note that, given the large time between the probe of the device and the error occurring, I do not think this is FUA related.
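
To check a saved dmesg for the cache line asked about above, a small Python sketch along these lines may help. The function name `cache_reports` and the regular expression are illustrative assumptions; the sample line is the one quoted in the comment.

```python
import re

# Matches the sd driver's cache summary, e.g.
# "[sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA"
CACHE_LINE = re.compile(
    r"\[(?P<disk>sd[a-z]+)\] Write cache: (?P<write>\w+), "
    r"read cache: (?P<read>\w+), (?P<fua>.+)$"
)


def cache_reports(dmesg_text: str) -> dict:
    """Map each disk name to (write cache, read cache, DPO/FUA text)."""
    reports = {}
    for line in dmesg_text.splitlines():
        match = CACHE_LINE.search(line)
        if match:
            reports[match["disk"]] = (
                match["write"],
                match["read"],
                match["fua"].strip(),
            )
    return reports


if __name__ == "__main__":
    sample = (
        "Jun 26 20:47:14 wiggum kernel: [156019.870956] sd 22:0:0:0: [sdb] "
        "Write cache: enabled, read cache: enabled, supports DPO and FUA"
    )
    print(cache_reports(sample))
```

A disk that is missing from the result never printed the summary line, which is what happens when the driver reports "No Caching mode page found" instead.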
Created attachment 1134274 [details] dmesg of kernel 4.3.5 with uas defaulting on
Created attachment 1134275 [details] dmesg of kernel 4.4.3 with uas defaulting on, includes tests showing the failure.
Created attachment 1134276 [details] dmesg of kernel 4.4.35 with uas option off
Created attachment 1134277 [details] test of FUA off
Created attachment 1134278 [details] dmesg of kernel 4.4.3 with uas option off
dmesg's uploaded and a FUA disabled test done. I have never done this before so hopefully I didn't do anything silly. The FUA off did not seem to help. Also does libata.fua=0 act as a test to stop FUA usage in this situation? I have noticed that with uas off in 4.4.3 there is no read cache. I think I noticed also that the first alert message comes from the copying from device. Could the read cache have been broken by something? Read cache was in use by 4.3.5 with no obvious problems. Thanks for looking at this. :)
Ok, I think I've found the problem: because your enclosure is multi-LUN, you're getting more commands submitted than we can handle, which is caused by this commit:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/usb/storage/uas.c?id=64d513ac31bd02a3c9b69ef04444f36c196f9a9d

I'll mail the author asking him for help.
(In reply to funnybutton from comment #10)
>
> Also does libata.fua=0 act as a test to stop FUA usage in this situation?
>

No, USB devices do not use libata at all. The bridge itself is a SCSI-ATA Translation Layer.

> I have noticed that with uas off in 4.4.3 there is no read cache. I think I
> noticed also that the first alert message comes from the copying from
> device. Could the read cache have been broken by something?
>
> Read cache was in use by 4.3.5 with no obvious problems.
>

How do you figure? You mean you see something like this in dmesg?

[ 2.094561] sd N:0:0:0: [sdX] ..., read cache: disabled, ...

What's the output of `sdparm /dev/sdX | grep RCD`?
(In reply to Tom Yan from comment #12)
>
> How do you figure? You mean you see something like this in dmesg?
>
> [ 2.094561] sd N:0:0:0: [sdX] ..., read cache: disabled, ...

No I just noticed that "read cache" is not mentioned at all in 4.4.3 with uas off.

> What's the output of `sdparm /dev/sdX | grep RCD`?

In both working 4.3.5 and not-working 4.4.3 with uas on, both drives state:

RCD 0 [cha: n, def: 0, sav: 0]
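
RCD is the "Read Cache Disable" bit of the SCSI caching mode page, so a value of 0 means the read cache is enabled. A minimal Python sketch for reading that from the sdparm line quoted above; the function name `read_cache_enabled` and the regular expression are assumptions made for illustration.

```python
import re

# Matches one sdparm field line such as "RCD 0 [cha: n, def: 0, sav: 0]".
FIELD = re.compile(
    r"^\s*(?P<name>[A-Z_0-9]+)\s+(?P<value>-?\d+)\s+"
    r"\[cha:\s*(?P<changeable>[yn]),\s*def:\s*(?P<default>-?\d+),"
    r"\s*sav:\s*(?P<saved>-?\d+)\]"
)


def read_cache_enabled(sdparm_line: str) -> bool:
    """Return True when the RCD (Read Cache Disable) bit is 0."""
    match = FIELD.match(sdparm_line)
    if match is None or match["name"] != "RCD":
        raise ValueError("not an sdparm RCD line")
    return int(match["value"]) == 0


if __name__ == "__main__":
    print(read_cache_enabled("RCD 0 [cha: n, def: 0, sav: 0]"))
```

For the output reported here the function returns True, i.e. the read cache is on in both the working and the failing kernel.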
(In reply to funnybutton from comment #13)
>
> No I just noticed that "read cache" is not mentioned at all in 4.4.3 with
> uas off.
>

Sorry, I wasn't aware that you uploaded your dmesg. It seems that your enclosure responds to the SCSI MODE SENSE command differently when in uas and in bot (usb-storage) mode.

https://bugzilla.redhat.com/attachment.cgi?id=1134275:

[ 2.163571] scsi 2:0:0:0: Direct-Access WDC WD20 EARX-008FB0 0105 PQ: 0 ANSI: 6
[ 2.170504] scsi 2:0:0:1: Direct-Access ST2000DM 001-1CH164 0105 PQ: 0 ANSI: 6
[ 2.174438] sd 2:0:0:1: [sdc] Mode Sense: 67 00 10 08
[ 2.174580] sd 2:0:0:0: [sdb] Mode Sense: 67 00 10 08
[ 2.175046] sd 2:0:0:1: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 2.176495] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA

https://bugzilla.redhat.com/attachment.cgi?id=1134278:

[ 2.999684] scsi 2:0:0:0: Direct-Access WDC WD20 EARX-008FB0 0105 PQ: 0 ANSI: 6
[ 3.000254] scsi 2:0:0:1: Direct-Access ST2000DM 001-1CH164 0105 PQ: 0 ANSI: 6
[ 3.003837] sd 2:0:0:1: [sdc] Mode Sense: 47 00 10 08
[ 3.005039] sd 2:0:0:1: [sdc] No Caching mode page found
[ 3.005048] sd 2:0:0:1: [sdc] Assuming drive cache: write through
[ 3.006127] sd 2:0:0:0: [sdb] Mode Sense: 47 00 10 08
[ 3.006604] sd 2:0:0:0: [sdb] No Caching mode page found
[ 3.006613] sd 2:0:0:0: [sdb] Assuming drive cache: write through

So nothing has gone bad. It's just that the enclosure is not decent enough :P

>
> In both working 4.3.5 and not-working 4.4.3 with uas on, both drives state:
>
> RCD 0 [cha: n, def: 0, sav: 0]

I bet you won't see that line with uas off on any of the kernel versions. You can try `sdparm -6 /dev/sdX` as well if interested.
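
To compare the two "Mode Sense:" lines, here is a rough Python sketch that decodes the four bytes. It assumes they are a MODE SENSE(6) parameter header (mode data length, medium type, device-specific parameter, block descriptor length); the thread does not confirm which MODE SENSE variant was used, so treat the field names as an assumption. The function name `decode_mode_sense6_header` is hypothetical.

```python
def decode_mode_sense6_header(hex_bytes: str) -> dict:
    """Decode four header bytes, ASSUMING the MODE SENSE(6) layout."""
    data = bytes.fromhex(hex_bytes)
    if len(data) != 4:
        raise ValueError("expected exactly four header bytes")
    return {
        "mode_data_length": data[0],
        "medium_type": data[1],
        # For direct-access devices: bit 7 is WP, bit 4 is DPOFUA.
        "write_protected": bool(data[2] & 0x80),
        "dpofua": bool(data[2] & 0x10),
        "block_descriptor_length": data[3],
    }


if __name__ == "__main__":
    uas = decode_mode_sense6_header("67 00 10 08")   # uas mode
    bot = decode_mode_sense6_header("47 00 10 08")   # usb-storage (BOT) mode
    print(uas)
    print(bot)
    print(uas["mode_data_length"] - bot["mode_data_length"])
```

Under that assumption the two headers differ only in the first byte, so the bridge returns less mode data in BOT mode, which would fit the "No Caching mode page found" messages.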
Hi,

I've written a patch which will hopefully fix this. I've started a scratch-build with this patch:

http://koji.fedoraproject.org/koji/taskinfo?taskID=13366765

Once the build is finished, please download kernel-core and kernel-modules for your arch and install them using: "sudo rpm -ivh kernel*.rpm" from the commandline. Remove the quirks you added to use usb-storage, reboot into the new kernel and test if the problem is fixed.

Once you've successfully tested this patch I'll submit it upstream.

Thanks and Regards,

Hans
(In reply to Hans de Goede from comment #15)
> Hi,
>
> I've written a patch which will hopefully fix this. I've started a
> scratch-build with this patch:
>

All looks good.

sd looks good:

[ 1.223662] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/466 GiB)
[ 1.223710] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 1.223961] sd 0:0:0:0: [sda] Write Protect is off
[ 1.223975] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 1.224150] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.336180] sd 0:0:0:0: [sda] Attached SCSI disk
[ 1.976443] sd 2:0:0:0: Attached scsi generic sg1 type 0
[ 1.976666] sd 2:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[ 1.976720] sd 2:0:0:1: Attached scsi generic sg2 type 0
[ 1.977065] sd 2:0:0:1: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[ 1.977978] sd 2:0:0:0: [sdb] Write Protect is off
[ 1.977984] sd 2:0:0:0: [sdb] Mode Sense: 67 00 10 08
[ 1.978406] sd 2:0:0:1: [sdc] Write Protect is off
[ 1.978412] sd 2:0:0:1: [sdc] Mode Sense: 67 00 10 08
[ 1.978554] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 1.978965] sd 2:0:0:1: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 2.053780] sd 2:0:0:0: [sdb] Attached SCSI disk
[ 2.054426] sd 2:0:0:1: [sdc] Attached SCSI disk

RAIDs not complaining:

[ 20.654118] md/raid1:md127: active with 2 out of 2 mirrors
[ 20.654329] created bitmap (7 pages) for device md127
[ 20.654683] md127: bitmap initialized from disk: read 1 pages, set 0 of 14091 bits
[ 20.656829] md127: detected capacity change from 0 to 1891150856192
[ 20.660340] md/raid1:md124: active with 2 out of 2 mirrors
[ 20.662564] md/raid1:md126: active with 2 out of 2 mirrors
[ 20.662735] md/raid1:md125: active with 2 out of 2 mirrors
[ 20.662796] md125: detected capacity change from 0 to 31457280000
[ 21.056403] created bitmap (1 pages) for device md124
[ 21.083985] created bitmap (1 pages) for device md126
[ 21.084184] md124: bitmap initialized from disk: read 1 pages, set 0 of 32 bits
[ 21.084297] md126: bitmap initialized from disk: read 1 pages, set 0 of 938 bits
[ 21.137781] md124: detected capacity change from 0 to 2098135040
[ 21.202787] md126: detected capacity change from 0 to 62914560000

Copying between drives in the caddy has no problems.
Copying from RAID in the caddy = 128 MB/s

Thank you for your efforts, all looks good :)
Thanks for testing. Fedora kernel team, can one of you add the patch fixing this (submitted upstream, will attach it shortly) to the Fedora kernels for now ? Regards, Hans
Created attachment 1137996 [details] [PATCH] uas: Limit qdepth at the scsi-host level Note this is only necessary for 4.4 and newer, this fixes an uas regression introduced in 4.4.
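
As a rough illustration of why a per-LUN limit is not enough on a multi-LUN enclosure, the arithmetic can be sketched in Python. This is a toy model and not the kernel code: the depth of 30 is a hypothetical value, and treating MaxStreams 32 (from the lsusb output in this report) as the number of commands the bridge can track at once is an assumption.

```python
def max_commands_in_flight(luns, per_lun_depth, host_depth=None):
    """Worst-case number of commands queued to the enclosure at once.

    per_lun_depth caps each LUN separately; host_depth, when given, caps
    the total across all LUNs behind the same SCSI host.
    """
    total = luns * per_lun_depth
    if host_depth is not None:
        total = min(total, host_depth)
    return total


MAX_STREAMS = 32     # MaxStreams from the lsusb output in this report
ASSUMED_DEPTH = 30   # hypothetical limit, chosen to stay below MAX_STREAMS

if __name__ == "__main__":
    per_lun_only = max_commands_in_flight(luns=2, per_lun_depth=ASSUMED_DEPTH)
    with_host_cap = max_commands_in_flight(2, ASSUMED_DEPTH, host_depth=ASSUMED_DEPTH)
    print(per_lun_only, per_lun_only > MAX_STREAMS)    # 60 True
    print(with_host_cap, with_host_cap > MAX_STREAMS)  # 30 False
```

With two drives each allowed its own queue, the total can exceed what the enclosure handles; a limit applied at the SCSI host level bounds the sum, which matches the failure only showing up when writing to both drives at the same time.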
Applied to all branches. This should show up in the 4.4.7 release whenever that happens.
kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e
kernel-4.4.6-201.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-81fd1b03aa
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.