From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020 Galeon/1.3.18

Description of problem:
I have a couple of old hard discs that were a RAID set, and which I'm trying to copy back onto new discs by mounting them in a couple of USB caddies. Unfortunately, after a while of a 'cp -ar' running, one of the discs gets reset three times, then goes dead:

Nov 21 21:29:36 xxx kernel: usb 1-6: reset high speed USB device using address 2
Nov 21 21:30:50 xxx kernel: usb 1-6: reset high speed USB device using address 2
Nov 21 21:31:50 xxx kernel: usb 1-6: reset high speed USB device using address 2
Nov 21 21:31:50 xxx kernel: usb 1-6: device not accepting address 2, error -71
Nov 21 21:31:50 xxx kernel: scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0

The timings appear to be the same between each USB reset message. Might this problem be related to <https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=128602#c12>.

The caddies are connected to the onboard Intel USB 2.0 ports on an i845PE motherboard. /proc/bus/usb/devices has this to say:

T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 3 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=067b ProdID=3507 Rev= 0.01
S: Manufacturer=Prolific Technology Inc.
S: Product=ATAPI-6 Bridge Controller
S: SerialNumber=C5
C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 2mA
I: If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Version-Release number of selected component (if applicable):
2.6.9-1.678_FC3

How reproducible:
Sometimes

Steps to Reproduce:
1. Attach USB caddies
2. Bring RAID and LV online
3. Start copying from RAID0 set

Actual Results:
Observe as first drive (/dev/sda) goes offline after three resets

Expected Results:
Both drives stay online and copy completes successfully.

Additional info:
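The spacing between the three reset messages can be checked directly from the quoted timestamps. The following is a minimal Python sketch, not part of any kernel tooling: the function name `reset_intervals` is made up for illustration, and the year 2004 is an assumption, since syslog lines carry no year. For the lines quoted in this report the gaps come out as 74 and 60 seconds, so the resets are roughly a minute apart rather than exactly evenly spaced.

```python
from datetime import datetime

# The three reset lines exactly as quoted in the report.
LOG_LINES = [
    "Nov 21 21:29:36 xxx kernel: usb 1-6: reset high speed USB device using address 2",
    "Nov 21 21:30:50 xxx kernel: usb 1-6: reset high speed USB device using address 2",
    "Nov 21 21:31:50 xxx kernel: usb 1-6: reset high speed USB device using address 2",
]


def reset_intervals(lines, year=2004):
    """Return the gaps, in whole seconds, between successive USB reset messages.

    `year` is an assumption: classic syslog timestamps do not include one.
    Month names are parsed with %b, which assumes an English/C locale.
    """
    stamps = []
    for line in lines:
        if "reset high speed USB device" not in line:
            continue  # ignore unrelated kernel messages
        # The first 15 characters hold the timestamp, e.g. "Nov 21 21:29:36".
        stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


print(reset_intervals(LOG_LINES))  # → [74, 60]
```

The same function could be fed the relevant lines from /var/log/messages to compare the gaps across several failed runs.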
Interestingly, I copied a larger amount of data from a RAID1 set held on the same two drives in the same two caddies connected to the same two USB ports without *any* resets. Curious.
Further notes:

Firstly, the exact chronology is:
1) cp hangs
2) USB device is reset three times
3) pseudo-SCSI device disappears from system

Secondly, I've plugged one drive into a separate USB controller (a NEC, that's also included on the motherboard) and the problem occurs then also.

Thirdly, powering on the drives in the reverse order causes scsi1 (i.e. /dev/sdb) rather than scsi0 to go offline.

I'm going to try swapping the drives in the caddies to eliminate the possibility of a hardware fault in one of the caddies. I'll also try the -681 kernel.
OK, after both an upgrade to the -681 kernel AND swapping the discs between the two caddies (but powering them on in the /original/ order), it's still the same disc (i.e. now in a different caddy) that goes offline. Ergo, I'm pretty sure the caddies are working fine, and the discs were working fine too, before I pulled them from my machine and put them in the caddies. Ask if you need further information.
I've just tried dd'ing from each disc, and I get the same behaviour with the original scsi0 device, so we can definitely rule out md and LVM as possible culprits. scsi1 (i.e. the second disc of the RAID set) dd's fine. Also, I've noticed that the sector it blows up on is the same across all these tests, so I reckon it's probably a failed sector on the disc. I'll try dd'ing /dev/zero to it to get the firmware to remap it and re-open if I'm able to dd from both discs cleanly. Sorry for the noise.
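For anyone wanting to confirm that a read fails at the same place on every run, the dd test above can be approximated with a sequential read that reports the first offset that fails. This is a minimal Python sketch under stated assumptions: the names `first_unreadable_offset` and `scan_device` are hypothetical, the 512-byte block size is assumed to match the disc's sector size, and kernel readahead or caching may make the reported offset differ from the exact bad sector. Reading a device such as /dev/sda needs root.

```python
import os


def first_unreadable_offset(read_block, block_size=512):
    """Read sequentially and return the byte offset of the first failed read.

    `read_block(offset, size)` must return bytes, return b"" at end of device,
    and raise OSError on an I/O error. Returns None if every read succeeds.
    """
    offset = 0
    while True:
        try:
            data = read_block(offset, block_size)
        except OSError:
            return offset  # first block the device could not deliver
        if not data:
            return None  # reached the end without an error
        offset += len(data)


def scan_device(path, block_size=512):
    """Scan a device or file by path, e.g. "/dev/sda" (the name used in this report)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return first_unreadable_offset(
            lambda off, size: os.pread(fd, size, off), block_size
        )
    finally:
        os.close(fd)
```

Dividing the returned offset by the block size gives a block number that can be compared between runs, in the same way as the dd results described above.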