Description of problem:
After copying large files (ca. 8GB) to encrypted USB disks attached to a USB 3 hub, unmounting the disk frequently fails. The kernel issues messages

xHCI xhci_drop_endpoint called with disabled ep...

followed by SCSI driver errors / file system corruption:

[sdc] Unhandled error code
sd 10:0:0:0: [sdc] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
sd 10:0:0:0: [sdc] CDB: Write(10): 2a 00 1d 04 18 00 00 00 08 00
end_request: I/O error, dev sdc, sector 486807552
Buffer I/O error on device dm-2, logical block 60850176
lost page write due to I/O error on dm-2

Version-Release number of selected component (if applicable):
All Fedora 16 kernels after about 3.0

Steps to Reproduce:
1. Create ext4 on top of LUKS on 2 USB3 disks
2. Attach the disks to a USB3 hub
3. Copy a 6-8GB file simultaneously to both disks
4. Try to unmount / detach the disks via DBUS

The error occurs with a probability of about 50%.

Actual results:
File system corruption, possible data loss.

Expected results:
Filesystems must be successfully flushed to USB3 disks, and then unmounted cleanly.

Additional info:
I consider this a critical bug with high fixing priority:

o Data loss due to filesystem corruption can occur (I had one filesystem's root directory destroyed), which is especially nasty as USB drives are frequently used as media for storing important backups.

o The severity of the bug is completely masked if the GUI is used (i.e. "Safely Remove Drive" is clicked): the operation hangs for ca. 20s, then some rather uninformative error message like "cannot unmount drive" is issued. That the drive suffered a hard I/O error and might be corrupted is not brought to the user's attention!

o Increasing USB filesystem memory with e.g.
modprobe usbcore usbfs_memory_mb=1000
does not provide a workaround.

o I saw discussions on 2 Linux boards that the bug is known, and that a fix might be available in upstream releases of kernel / libusb. Please backport this solution to Fedora 16 ASAP!

HW are LaCie / Hitachi HDDs attached to a Samsung 900X3A notebook.

Thanks,
Stefan
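For readers who want to cross-check the kernel messages quoted above, here is a minimal Python sketch that decodes the failed Write(10) CDB and compares it with the reported sector and dm-2 block. The function name `decode_write10` is made up for illustration. The 512-byte sector size and the 4096-byte block size on dm-2 are assumptions; the report does not state them, although the 8-block transfer length is consistent with a 4 KiB page write.

```python
def decode_write10(cdb_hex):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, transfer_length_in_blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:  # 0x2A is the WRITE(10) opcode
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


# CDB copied from the kernel messages in the report.
lba, blocks = decode_write10("2a 00 1d 04 18 00 00 00 08 00")

SECTOR_BYTES = 512     # assumption: 512-byte sectors on sdc
DM_BLOCK_BYTES = 4096  # assumption: 4 KiB logical blocks on dm-2
dm_block = 60850176    # "logical block" from the Buffer I/O error line

# Distance between the failed sdc sector and the start of the dm-2 block.
offset_sectors = lba - dm_block * (DM_BLOCK_BYTES // SECTOR_BYTES)

print(lba, blocks, offset_sectors)
```

The decoded LBA equals the sector in the `end_request` line (486807552), so both messages describe the same failed 8-block write. Under the stated assumptions the dm-2 block sits a constant 6144 sectors (3 MiB) below that address, which would be consistent with a fixed partition-plus-LUKS-header offset, though the report does not describe the disk layout.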
What is the most recent kernel you have seen this on?
Sorry for the missing info, it's 3.4.9-1. Meanwhile, I experienced the bug once even without writing anything to the USB drives: just having them attached and mounted for a while was enough.
I think I have the same problem. Below is a trace from /var/log/messages. Machine running Fedora 17 with latest patches as of 21st September, 2012.

[root@LP000138 dev]# uname -a
Linux xxx.com 3.5.3-1.fc17.x86_64 #1 SMP Wed Aug 29 18:46:34 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Plugging in a USB 3 external drive into DELL Latitude E5430

Sep 21 16:45:37 LP000138 kernel: [ 384.217898] usb 2-1.8.2: USB disconnect, device number 6
Sep 21 16:45:51 LP000138 kernel: [ 397.838596] usb 4-2: new SuperSpeed USB device number 13 using xhci_hcd
Sep 21 16:45:51 LP000138 kernel: [ 397.850754] usb 4-2: New USB device found, idVendor=1058, idProduct=0730
Sep 21 16:45:51 LP000138 kernel: [ 397.850761] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Sep 21 16:45:51 LP000138 kernel: [ 397.850765] usb 4-2: Product: My Passport 0730
Sep 21 16:45:51 LP000138 kernel: [ 397.850768] usb 4-2: Manufacturer: Western Digital
Sep 21 16:45:51 LP000138 kernel: [ 397.850771] usb 4-2: SerialNumber: 575838314134315435343738
Sep 21 16:45:51 LP000138 kernel: [ 397.851783] scsi10 : usb-storage 4-2:1.0
Sep 21 16:45:51 LP000138 mtp-probe: checking bus 4, device 13: "/sys/devices/pci0000:00/0000:00:14.0/usb4/4-2"
Sep 21 16:45:51 LP000138 mtp-probe: bus: 4, device: 13 was not an MTP device
Sep 21 16:46:02 LP000138 kernel: [ 408.851792] scsi 10:0:0:0: Direct-Access WD My Passport 0730 1016 PQ: 0 ANSI: 6
Sep 21 16:46:02 LP000138 kernel: [ 408.853261] sd 10:0:0:0: Attached scsi generic sg2 type 0
Sep 21 16:46:02 LP000138 kernel: [ 408.853464] sd 10:0:0:0: [sdb] 1953458176 512-byte logical blocks: (1.00 TB/931 GiB)
Sep 21 16:46:02 LP000138 kernel: [ 408.853634] sd 10:0:0:0: [sdb] Write Protect is off
Sep 21 16:46:02 LP000138 kernel: [ 408.853790] sd 10:0:0:0: [sdb] No Caching mode page present
Sep 21 16:46:02 LP000138 kernel: [ 408.853795] sd 10:0:0:0: [sdb] Assuming drive cache: write through
Sep 21 16:46:02 LP000138 kernel: [ 408.854507] sd 10:0:0:0: [sdb] No Caching mode page present
Sep 21 16:46:02 LP000138 kernel: [ 408.854513] sd 10:0:0:0: [sdb] Assuming drive cache: write through
Sep 21 16:46:02 LP000138 kernel: [ 408.864533] sdb:
Sep 21 16:46:02 LP000138 kernel: [ 408.865243] sd 10:0:0:0: [sdb] No Caching mode page present
Sep 21 16:46:02 LP000138 kernel: [ 408.865249] sd 10:0:0:0: [sdb] Assuming drive cache: write through
Sep 21 16:46:02 LP000138 kernel: [ 408.865254] sd 10:0:0:0: [sdb] Attached SCSI disk
Sep 21 16:46:02 LP000138 kernel: [ 408.869471] usb 4-2: Disable of device-initiated U1 failed.
Sep 21 16:46:02 LP000138 kernel: [ 408.869544] usb 4-2: Disable of device-initiated U2 failed.
Sep 21 16:46:02 LP000138 kernel: [ 408.971140] usb 4-2: Device not responding to set address.
Sep 21 16:46:02 LP000138 kernel: [ 409.171888] usb 4-2: Device not responding to set address.
Sep 21 16:46:03 LP000138 kernel: [ 409.372762] usb 4-2: device not accepting address 13, error -71
Sep 21 16:46:03 LP000138 kernel: [ 409.474820] usb 4-2: Device not responding to set address.
Sep 21 16:46:03 LP000138 kernel: [ 409.675726] usb 4-2: Device not responding to set address.
Sep 21 16:46:03 LP000138 kernel: [ 409.876604] usb 4-2: device not accepting address 13, error -71
Sep 21 16:46:03 LP000138 kernel: [ 409.978673] usb 4-2: Device not responding to set address.
Sep 21 16:46:03 LP000138 kernel: [ 410.179580] usb 4-2: Device not responding to set address.
Sep 21 16:46:04 LP000138 kernel: [ 410.380427] usb 4-2: device not accepting address 13, error -71
Sep 21 16:46:04 LP000138 kernel: [ 410.482512] usb 4-2: Device not responding to set address.
Sep 21 16:46:04 LP000138 kernel: [ 410.683379] usb 4-2: Device not responding to set address.
Sep 21 16:46:04 LP000138 kernel: [ 410.884248] usb 4-2: device not accepting address 13, error -71
Sep 21 16:46:04 LP000138 kernel: [ 410.884304] usb 4-2: USB disconnect, device number 13
Sep 21 16:46:04 LP000138 kernel: [ 410.885735] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8801e18279c0
Sep 21 16:46:04 LP000138 kernel: [ 410.885744] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8801e1827980
Sep 21 16:46:04 LP000138 udevd[3097]: inotify_add_watch(6, /dev/sdb, 10) failed: No such file or directory
Sep 21 16:46:04 LP000138 kernel: [ 410.987331] usb 4-2: Device not responding to set address.
Sep 21 16:46:04 LP000138 kernel: [ 411.188216] usb 4-2: Device not responding to set address.
Sep 21 16:46:05 LP000138 kernel: [ 411.389114] usb 4-2: device not accepting address 14, error -71
Sep 21 16:46:05 LP000138 kernel: [ 411.491163] usb 4-2: Device not responding to set address.
Sep 21 16:46:05 LP000138 kernel: [ 411.692060] usb 4-2: Device not responding to set address.
Sep 21 16:46:05 LP000138 kernel: [ 411.892911] usb 4-2: device not accepting address 15, error -71
Sep 21 16:46:05 LP000138 kernel: [ 411.994978] usb 4-2: Device not responding to set address.
Sep 21 16:46:05 LP000138 kernel: [ 412.195913] usb 4-2: Device not responding to set address.
Sep 21 16:46:06 LP000138 kernel: [ 412.396781] usb 4-2: device not accepting address 16, error -71
Sep 21 16:46:06 LP000138 kernel: [ 412.498800] usb 4-2: Device not responding to set address.
Sep 21 16:46:06 LP000138 kernel: [ 412.699756] usb 4-2: Device not responding to set address.
Sep 21 16:46:06 LP000138 kernel: [ 412.900598] usb 4-2: device not accepting address 17, error -71
Sep 21 16:46:06 LP000138 kernel: [ 412.900626] hub 4-0:1.0: unable to enumerate USB device on port 2
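Anyone comparing their own /var/log/messages against the trace above could tally the relevant messages with a short script. The following is a minimal Python sketch, not an established tool: the signature names and the `summarize` helper are hypothetical, and the patterns are copied from messages quoted in this report. On Linux, error -71 corresponds to EPROTO (protocol error).

```python
import re
from collections import Counter

# Hypothetical signature table; the patterns come from messages in this report.
SIGNATURES = {
    "set_address_timeout": re.compile(r"Device not responding to set address"),
    "address_rejected": re.compile(r"device not accepting address \d+, error -\d+"),
    "xhci_disabled_ep": re.compile(r"xhci_drop_endpoint called with disabled ep"),
    "enumeration_failed": re.compile(r"unable to enumerate USB device"),
    "io_error": re.compile(r"I/O error"),
}


def summarize(lines):
    """Count how many log lines match each known failure signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines copied from the trace above, used as sample input.
SAMPLE = [
    "Sep 21 16:46:02 LP000138 kernel: [ 408.971140] usb 4-2: Device not responding to set address.",
    "Sep 21 16:46:03 LP000138 kernel: [ 409.372762] usb 4-2: device not accepting address 13, error -71",
    "Sep 21 16:46:04 LP000138 kernel: [ 410.885735] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8801e18279c0",
    "Sep 21 16:46:06 LP000138 kernel: [ 412.900598] usb 4-2: device not accepting address 17, error -71",
    "Sep 21 16:46:06 LP000138 kernel: [ 412.900626] hub 4-0:1.0: unable to enumerate USB device on port 2",
]

result = summarize(SAMPLE)
for name, count in sorted(result.items()):
    print(f"{name}: {count}")
```

To scan a real log, pass an open file object instead of `SAMPLE`, for example `summarize(open("/var/log/messages", errors="replace"))`. Matching these strings only shows that similar messages were logged; it does not prove the same root cause.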
# Mass update to all open bugs. Kernel 3.6.2-1.fc16 has just been pushed to updates. This update is a significant rebase from the previous version. Please retest with this kernel, and let us know if your problem has been fixed. In the event that you have upgraded to a newer release and the bug you reported is still present, please change the version field to the newest release you have encountered the issue with. Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered. If you are not the original bug reporter and you still experience this bug, please file a new report, as it is possible that you may be seeing a different problem. (Please don't clone this bug, a fresh bug referencing this bug in the comment is sufficient).
With kernel version 3.6.2-1.fc16.x86_64, I could not reproduce the bug in ca. 20 copying actions. I am not sure that it is completely gone (as I was not able to trigger it deterministically with the older kernels), but at least the probability of transfer errors has decreased significantly.
This message is a reminder that Fedora 16 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '16'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 16's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 16 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.