Bug 1315013
Summary: | JMicron USB to SATA Bridge (152d:9561) JMS56x Series requires usb-storage quirks to disable uas | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | funnybutton <funnybutton> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Priority: | unspecified |
Version: | 23 | CC: | gansalmon, hdegoede, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab |
Hardware: | Unspecified | OS: | Linux |
Fixed In Version: | kernel-4.5.0-302.fc24 kernel-4.4.6-301.fc23 | Doc Type: | Bug Fix |
Last Closed: | 2016-04-02 15:54:06 UTC | Type: | Bug |
Description
funnybutton
2016-03-05 17:16:16 UTC
If this worked fine in 4.3, but doesn't work in 4.4, I'm not entirely sure it's the same as the bug you've pointed to.

Hans, do you know of any changes in 4.4 that would cause this issue?

Hi,

(In reply to Josh Boyer from comment #1)
> If this worked fine in 4.3, but doesn't work in 4.4, I'm not entirely sure
> it's the same as the bug you've pointed to.
>
> Hans, do you know of any changes in 4.4 that would cause this issue?

I do not know about any uas changes explaining this, I guess there may have been some xhci driver or usb-hub driver changes in 4.4 which trigger this.

funnybutton, it is probably best if you send a mail about this to the linux-usb list:
http://vger.kernel.org/vger-lists.html#linux-usb

Note you do not need to be subscribed to send mails to this list, if you Cc yourself on the original mail you should get all replies. Please also add me to the Cc.

Regards,

Hans

(In reply to Hans de Goede from comment #2)

Thanks for the replies. I have been looking at the mailing lists mentioned before posting there. I have noticed this:

https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/9fa62b1a31c96715aef34f25000e882ed4ac4876%5E!/#F0

specifically:

US_FL_BROKEN_FUA flag now being set for my device.

Could this be the cause?

I am looking into how to remove this patch (I am a newb) and see what happens.

(In reply to funnybutton from comment #3)
> US_FL_BROKEN_FUA flag now being set for my device.
>
> Could this be the cause?

This patch only applies to 152d:0567, whereas you have a 152d:9561 enclosure, so this patch does not affect you. If anything the problem might be that you need US_FL_BROKEN_FUA too, unfortunately this flag cannot be set via quirks so you need to rebuild your kernel to test this.

Also can you please attach the full dmesg from the problem occurring, it feels as if you're leaving out quite a few bits from dmesg. E.g. does the scsi layer say something like:

> Jun 26 20:47:14 wiggum kernel: [156019.870956] sd 22:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA

when the driver initializes the drive?

Note given the large time between the probe of the device and the error occurring I do not think this is FUA related.

Created attachment 1134274 [details]
dmesg of kernel 4.3.5 with uas defaulting on
Created attachment 1134275 [details]
dmesg of kernel 4.4.3 with uas defaulting on, includes tests showing the failure.
Created attachment 1134276 [details]
dmesg of kernel 4.4.35 with uas option off
Created attachment 1134277 [details]
test of FUA off
Created attachment 1134278 [details]
dmesg of kernel 4.4.3 with uas option off
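
The "uas option off" runs above rely on the usb-storage quirk named in the summary. As a minimal sketch, assuming the kernel's documented `usb-storage.quirks=VID:PID:Flags` parameter, where the `u` flag means "ignore uas", the entry for this enclosure can be built as follows. The helper name `uas_ignore_quirk` is hypothetical and only serves the illustration.

```python
import re


def uas_ignore_quirk(vendor_id: str, product_id: str) -> str:
    """Build a usb-storage quirks entry that keeps the uas driver off a device.

    Hypothetical helper: it only formats the string, it does not touch the system.
    """
    for value in (vendor_id, product_id):
        # The quirks parameter expects 4-digit hexadecimal IDs.
        if not re.fullmatch(r"[0-9a-fA-F]{4}", value):
            raise ValueError(f"expected a 4-digit hex ID, got {value!r}")
    # 'u' is the flag character for "ignore uas" (fall back to usb-storage).
    return f"{vendor_id.lower()}:{product_id.lower()}:u"


entry = uas_ignore_quirk("152d", "9561")
print(f"usb-storage.quirks={entry}")          # form used on the kernel command line
print(f"options usb-storage quirks={entry}")  # form used in a modprobe.d file
```

Either form has to be in place before the enclosure is plugged in or the module is loaded, and it should be removed again when testing a kernel that carries a proper fix.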
dmesg's uploaded and a FUA disabled test done. I have never done this before so hopefully I didn't do anything silly. The FUA off did not seem to help.

Also does libata.fua=0 act as a test to stop FUA usage in this situation?

I have noticed that with uas off in 4.4.3 there is no read cache. I think I noticed also that the first alert message comes from the copying from device. Could the read cache have been broken by something?

Read cache was in use by 4.3.5 with no obvious problems.

Thanks for looking at this. :)

Ok, I think I've found the problem: because your enclosure is multi-lun you're getting more commands submitted than we can handle, which is caused by this commit:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/usb/storage/uas.c?id=64d513ac31bd02a3c9b69ef04444f36c196f9a9d

I'll mail the author asking him for help.

(In reply to funnybutton from comment #10)
> Also does libata.fua=0 act as a test to stop FUA usage in this situation?

No, USB devices do not use libata at all. The bridge itself is a SCSI-ATA Translation Layer.

> I have noticed that with uas off in 4.4.3 there is no read cache. I think I
> noticed also that the first alert message comes from the copying from
> device. Could the read cache have been broken by something?
>
> Read cache was in use by 4.3.5 with no obvious problems.

How do you figure? You mean you see something like this in dmesg?

[ 2.094561] sd N:0:0:0: [sdX] ..., read cache: disabled, ...

What's the output of `sdparm /dev/sdX | grep RCD`?

(In reply to Tom Yan from comment #12)
> How do you figure? You mean you see something like this in dmesg?
>
> [ 2.094561] sd N:0:0:0: [sdX] ..., read cache: disabled, ...

No I just noticed that "read cache" is not mentioned at all in 4.4.3 with uas off.

> What's the output of `sdparm /dev/sdX | grep RCD`?

In both working 4.3.5 and not-working 4.4.3 with uas on, both drives state:

RCD 0 [cha: n, def: 0, sav: 0]

(In reply to funnybutton from comment #13)
> No I just noticed that "read cache" is not mentioned at all in 4.4.3 with
> uas off.

Sorry I wasn't aware that you uploaded your dmesg. It seems that your enclosure responds to the SCSI MODE SENSE command differently in uas and in bot (usb-storage) mode.

https://bugzilla.redhat.com/attachment.cgi?id=1134275:

[ 2.163571] scsi 2:0:0:0: Direct-Access WDC WD20 EARX-008FB0 0105 PQ: 0 ANSI: 6
[ 2.170504] scsi 2:0:0:1: Direct-Access ST2000DM 001-1CH164 0105 PQ: 0 ANSI: 6
[ 2.174438] sd 2:0:0:1: [sdc] Mode Sense: 67 00 10 08
[ 2.174580] sd 2:0:0:0: [sdb] Mode Sense: 67 00 10 08
[ 2.175046] sd 2:0:0:1: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 2.176495] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA

https://bugzilla.redhat.com/attachment.cgi?id=1134278:

[ 2.999684] scsi 2:0:0:0: Direct-Access WDC WD20 EARX-008FB0 0105 PQ: 0 ANSI: 6
[ 3.000254] scsi 2:0:0:1: Direct-Access ST2000DM 001-1CH164 0105 PQ: 0 ANSI: 6
[ 3.003837] sd 2:0:0:1: [sdc] Mode Sense: 47 00 10 08
[ 3.005039] sd 2:0:0:1: [sdc] No Caching mode page found
[ 3.005048] sd 2:0:0:1: [sdc] Assuming drive cache: write through
[ 3.006127] sd 2:0:0:0: [sdb] Mode Sense: 47 00 10 08
[ 3.006604] sd 2:0:0:0: [sdb] No Caching mode page found
[ 3.006613] sd 2:0:0:0: [sdb] Assuming drive cache: write through

So nothing has gone bad. It's just the enclosure is not decent enough :P

> In both working 4.3.5 and not-working 4.4.3 with uas on, both drives state:
>
> RCD 0 [cha: n, def: 0, sav: 0]

I bet you won't see that line with uas off on any of the kernel versions. You can try `sdparm -6 /dev/sdX` as well if interested.

Hi,

I've written a patch which will hopefully fix this. I've started a scratch-build with this patch:
http://koji.fedoraproject.org/koji/taskinfo?taskID=13366765

Once the build is finished, please download kernel-core and kernel-modules for your arch and install them using: "sudo rpm -ivh kernel*.rpm" from the command line. Remove the quirks you added to use usb-storage, reboot into the new kernel and test if the problem is fixed.

Once you've successfully tested this patch I'll submit it upstream.

Thanks and Regards,

Hans

(In reply to Hans de Goede from comment #15)
> Hi,
>
> I've written a patch which will hopefully fix this. I've started a
> scratch-build with this patch:

All looks good.

sd looks good:

[ 1.223662] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/466 GiB)
[ 1.223710] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 1.223961] sd 0:0:0:0: [sda] Write Protect is off
[ 1.223975] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 1.224150] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.336180] sd 0:0:0:0: [sda] Attached SCSI disk
[ 1.976443] sd 2:0:0:0: Attached scsi generic sg1 type 0
[ 1.976666] sd 2:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[ 1.976720] sd 2:0:0:1: Attached scsi generic sg2 type 0
[ 1.977065] sd 2:0:0:1: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[ 1.977978] sd 2:0:0:0: [sdb] Write Protect is off
[ 1.977984] sd 2:0:0:0: [sdb] Mode Sense: 67 00 10 08
[ 1.978406] sd 2:0:0:1: [sdc] Write Protect is off
[ 1.978412] sd 2:0:0:1: [sdc] Mode Sense: 67 00 10 08
[ 1.978554] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 1.978965] sd 2:0:0:1: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 2.053780] sd 2:0:0:0: [sdb] Attached SCSI disk
[ 2.054426] sd 2:0:0:1: [sdc] Attached SCSI disk

RAIDs not complaining:

[ 20.654118] md/raid1:md127: active with 2 out of 2 mirrors
[ 20.654329] created bitmap (7 pages) for device md127
[ 20.654683] md127: bitmap initialized from disk: read 1 pages, set 0 of 14091 bits
[ 20.656829] md127: detected capacity change from 0 to 1891150856192
[ 20.660340] md/raid1:md124: active with 2 out of 2 mirrors
[ 20.662564] md/raid1:md126: active with 2 out of 2 mirrors
[ 20.662735] md/raid1:md125: active with 2 out of 2 mirrors
[ 20.662796] md125: detected capacity change from 0 to 31457280000
[ 21.056403] created bitmap (1 pages) for device md124
[ 21.083985] created bitmap (1 pages) for device md126
[ 21.084184] md124: bitmap initialized from disk: read 1 pages, set 0 of 32 bits
[ 21.084297] md126: bitmap initialized from disk: read 1 pages, set 0 of 938 bits
[ 21.137781] md124: detected capacity change from 0 to 2098135040
[ 21.202787] md126: detected capacity change from 0 to 62914560000

Copying between drives in the caddy has no problems.
Copying from RAID in the caddy = 128 MB/s

Thank you for your efforts, all looks good :)

Thanks for testing.

Fedora kernel team, can one of you add the patch fixing this (submitted upstream, will attach it shortly) to the Fedora kernels for now?

Regards,

Hans

Created attachment 1137996 [details]
[PATCH] uas: Limit qdepth at the scsi-host level
Note this is only necessary for 4.4 and newer; it fixes a uas regression introduced in 4.4.
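
To illustrate why a multi-LUN enclosure is affected, here is a toy model of the difference between limiting the queue depth per LUN and limiting it once for the whole SCSI host. It is a sketch only and not the kernel code from the attached patch: the function name, the two-LUN setup and the capacity of 30 commands are all assumptions made for the example.

```python
from typing import Optional


def max_outstanding(num_luns: int, per_lun_limit: int,
                    host_limit: Optional[int]) -> int:
    """Upper bound on commands in flight across all LUNs of one enclosure."""
    # Each LUN may queue up to its own limit, so the totals add up.
    total = num_luns * per_lun_limit
    # A host-level limit caps the sum, regardless of how many LUNs exist.
    return total if host_limit is None else min(total, host_limit)


# Hypothetical number of commands the bridge can accept at once.
DEVICE_CAPACITY = 30

per_lun_only = max_outstanding(num_luns=2, per_lun_limit=DEVICE_CAPACITY,
                               host_limit=None)
with_host_cap = max_outstanding(num_luns=2, per_lun_limit=DEVICE_CAPACITY,
                                host_limit=DEVICE_CAPACITY)
print(per_lun_only, with_host_cap)  # 60 30
```

With only a per-LUN limit, two LUNs together can exceed what the bridge accepts, which matches the "more commands submitted than we can handle" diagnosis earlier in this report. A single-LUN enclosure never hits that case in this model.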
Applied to all branches. This should show up in the 4.4.7 release whenever that happens.

kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e

kernel-4.4.6-201.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb

kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb

kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e

kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-81fd1b03aa

kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
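
For anyone checking whether an installed kernel already carries the fix, the following rough sketch compares a release string against the builds named in the update notices of this report. It assumes the release string looks like `4.4.6-301.fc23.x86_64` (the usual shape of `uname -r` on Fedora) and uses a plain tuple comparison, which is simpler than real RPM version ordering. The function name `includes_uas_patch` is hypothetical.

```python
import re

# First builds carrying the patch, as listed in this bug report.
FIXED_BUILDS = {
    "fc22": (4, 4, 6, 201),
    "fc23": (4, 4, 6, 301),
    "fc24": (4, 5, 0, 302),
}


def includes_uas_patch(release: str) -> bool:
    """Return True if the release is at or after the fixed build for its Fedora version.

    Note: kernels older than 4.4 return False here but never had the regression.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(fc\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    *numbers, dist = match.groups()
    if dist not in FIXED_BUILDS:
        raise ValueError(f"no fixed build recorded for {dist}")
    # Compare (major, minor, patch, build) against the recorded fixed build.
    return tuple(int(n) for n in numbers) >= FIXED_BUILDS[dist]


print(includes_uas_patch("4.4.6-301.fc23.x86_64"))  # True
print(includes_uas_patch("4.4.3-300.fc23.x86_64"))  # False
```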