Description of problem:
I have two external disks, which I connect via eSATA, to take backups, alternating between the two disks. Up until kernel 3.2.10-3.fc16, this would work just fine. With kernel 3.3.0-4.fc16, the disk is recognized the first time after booting (I have only tried it with the disk already connected and powered up). When I change the disk, I cannot get it to be recognized until I reboot. I have reverted to kernel 3.2.10-3.fc16 for the time being.

Version-Release number of selected component (if applicable):
3.3.0-4.fc16.x86_64

How reproducible:
Almost always. If I unplug and re-plug the disk very quickly, it might get recognized again.

Steps to Reproduce:
1. Boot the computer, with one disk already connected and powered up.
2. Mount the disk.
3. Unmount the disk and power it down.
4. Disconnect the disk, wait a bit (half a minute should be more than enough), connect the second disk, and power it up.
5. Mount the disk.

Actual results:
The disk is not mounted. In the console, I see messages like the following:

Mar 28 12:24:57 ######### kernel: [92113.439787] ata1: exception Emask 0x10 SAct 0x0 SErr 0x990000 action 0xe frozen
Mar 28 12:24:57 ######### kernel: [92113.439918] ata1: irq_stat 0x00400000, PHY RDY changed
Mar 28 12:24:57 ######### kernel: [92113.439997] ata1: SError: { PHYRdyChg 10B8B Dispar LinkSeq }
Mar 28 12:24:57 ######### kernel: [92113.440118] ata1: hard resetting link
Mar 28 12:24:57 ######### kernel: [92114.163030] ata1: SATA link down (SStatus 0 SControl 300)

(I have redacted the host name.)

Expected results:
The disk should be mounted.

Additional info:
The SATA controller to which the disks are connected is a JMicron Technology Corp. JMB360 AHCI Controller (rev 02).

I have tried rescanning with
echo "- - -" > /sys/class/scsi_host/hostN/scan
for all valid values of N, but it did not help.
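For anyone who wants to repeat the rescan from "Additional info" without typing it once per host, it can be written as a loop. This is only a sketch: the function name rescan_scsi_hosts and its optional directory argument are made up for illustration, and as reported above the rescan did not bring the disk back on the affected kernel.

```shell
#!/bin/sh
# Hypothetical helper: ask every SCSI host to rescan all channels, targets and LUNs.
# The optional argument (default /sys/class/scsi_host) exists only so the loop
# can be tried against a scratch directory; it is not part of any real tool.
rescan_scsi_hosts() {
    root="${1:-/sys/class/scsi_host}"
    for scan in "$root"/host*/scan; do
        # Skip the unexpanded glob when no host directory exists.
        [ -e "$scan" ] || continue
        # "- - -" is a wildcard for channel, target and LUN.
        echo "- - -" > "$scan"
    done
}
```

Run as root, a plain call to rescan_scsi_hosts covers every hostN that exists.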
Could you try this kernel: http://koji.fedoraproject.org/koji/taskinfo?taskID=3937168 and post dmesg output from 3.2.10 and 3.3 as well.
Created attachment 573377 [details] Output of dmesg for kernels 3.2.10-3 and 3.3.0-5.6
The problem persists with kernel 3.3.0-5.6. I have uploaded an attachment with the output of dmesg for kernels 3.2.10-3 and 3.3.0-5.6.
External eSATA devices are heavily broken with kernel 3.3 (see also the related bug report https://bugzilla.redhat.com/show_bug.cgi?id=806676). After freshly booting a 3.3 kernel, you can attach and use an external eSATA device only once. If you detach the device, you will not be able to attach a new one. Sometimes detaching also gets stuck, with a kernel irq task running at 100% (see bug 806676). After one suspend cycle, the eSATA port is no longer recognized. The problem was already mentioned on LKML and a patch was suggested by Lin Ming (see https://lkml.org/lkml/2012/3/12/798). The last update was a week ago; as far as I can see, the patch has not been accepted yet.
So, I guess we wait until the bug is fixed upstream. Meanwhile, would it make sense to reinstate kernel 3.2.10-3 in the repositories, to make downgrading easier for those affected by this bug?
Created attachment 573616 [details]
Lin Ming Patch to resolve E-Sata Hotplugging

I can now confirm that the attached patch indeed resolves the problem. I added it to the current kernel from koji (3.3.0-5.fc16) and rebuilt it for x86_64. I can provide the rpms for testing. However, since the problem exists upstream, I would suggest pushing it there.
(In reply to comment #5)
> So, I guess we wait until the bug is fixed upstream. Meanwhile, would it make
> sense to reinstate kernel 3.2.10-3 in the repositories, to make downgrading
> easier for those affected by this bug?

We can't do that. We'll look at the attached patch and see if it's suitable for backporting.
I can confirm that kernel-3.3.0-5, with the attached patch, fixes the problem I reported. (I used the sources from kernel-3.3.0-5.fc17.src.rpm)
The patch also works with kernel 3.3.0-8.
I've applied the submitted patch to all Fedora branches. It will be in the next submitted update.
kernel-3.3.1-3.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/kernel-3.3.1-3.fc17
kernel-3.3.1-3.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/kernel-3.3.1-3.fc16
Package kernel-3.3.1-3.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.1-3.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-5346/kernel-3.3.1-3.fc17
then log in and leave karma (feedback).
kernel-3.3.1-3.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.3.1-5.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/kernel-3.3.1-5.fc16
kernel-3.3.1-5.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/kernel-3.3.1-5.fc17
kernel-2.6.43.1-5.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/kernel-2.6.43.1-5.fc15
This problem is still present in kernel-3.3.1-5.fc16 and

ja@minix ~ 1$ uname -a
Linux minix 3.3.1-5.fc17.x86_64 #1 SMP Tue Apr 10 20:42:28 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

I have been providing success/failure feedback against 806676 (still open).
This was probably not the correct location.

Is there more useful testing I can perform?
kernel-3.3.1-5.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
kernel-2.6.43.2-2.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/kernel-2.6.43.2-2.fc15
kernel-3.3.1-5.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.
Double rechecked on my machine - the problem is still present

ja@minix ~ 22$ uname -a
Linux minix 3.3.1-5.fc16.x86_64 #1 SMP Tue Apr 10 19:56:52 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Absolutely no indication in dmesg that the disk has been plugged in the second time around
----------------------------------------------------------------------
ja@minix ~ 24$ cat dmesg_disk_add_remove_add_3.3.1-5.fc16.x86_64

Clean reboot

ja@minix ~ 2$ uname -a
Linux minix 3.3.1-5.fc16.x86_64 #1 SMP Tue Apr 10 19:56:52 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Plug in disk the first time

ja@minix ~ 7$ mount |grep sd
/dev/sda4 on / type ext4 (rw,noatime,seclabel,user_xattr,acl,barrier=1,data=ordered,discard)
/dev/sda3 on /boot type ext4 (rw,noatime,seclabel,user_xattr,acl,barrier=1,data=ordered,discard)
/dev/sdb1 on /media/wd250 type ext4 (rw,relatime,seclabel,user_xattr,acl,barrier=1,data=ordered)
--------------------------------------------------------------
[ 136.296735] ata4: exception Emask 0x10 SAct 0x0 SErr 0x40c0000 action 0xe frozen
[ 136.296744] ata4: irq_stat 0x00000040, connection status changed
[ 136.296753] ata4: SError: { CommWake 10B8B DevExch }
[ 136.296767] ata4: hard resetting link
[ 137.174196] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 137.174823] ata4.00: ATA-7: WDC WD2500KS-00MJB0, 02.01C03, max UDMA/133
[ 137.174833] ata4.00: 488397168 sectors, multi 0: LBA48
[ 137.175519] ata4.00: configured for UDMA/133
[ 137.175534] ata4: EH complete
[ 137.175732] scsi 3:0:0:0: Direct-Access ATA WDC WD2500KS-00M 02.0 PQ: 0 ANSI: 5
[ 137.176181] sd 3:0:0:0: [sdb] 488397168 512-byte logical blocks: (250 GB/232 GiB)
[ 137.176435] sd 3:0:0:0: [sdb] Write Protect is off
[ 137.176443] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 137.176634] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 137.176650] sd 3:0:0:0: Attached scsi generic sg1 type 0
[ 137.221163] sdb: sdb1
[ 137.223369] sd 3:0:0:0: [sdb] Attached SCSI disk
[ 137.570148] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[ 137.570155] SELinux: initialized (dev sdb1, type ext4), uses xattr
-----------------------------------------------------------------
[root@minix ~]# umount /dev/sdb1

No new messages in dmesg

[ 136.296735] ata4: exception Emask 0x10 SAct 0x0 SErr 0x40c0000 action 0xe frozen
[ 136.296744] ata4: irq_stat 0x00000040, connection status changed
[ 136.296753] ata4: SError: { CommWake 10B8B DevExch }
[ 136.296767] ata4: hard resetting link
[ 137.174196] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 137.174823] ata4.00: ATA-7: WDC WD2500KS-00MJB0, 02.01C03, max UDMA/133
[ 137.174833] ata4.00: 488397168 sectors, multi 0: LBA48
[ 137.175519] ata4.00: configured for UDMA/133
[ 137.175534] ata4: EH complete
[ 137.175732] scsi 3:0:0:0: Direct-Access ATA WDC WD2500KS-00M 02.0 PQ: 0 ANSI: 5
[ 137.176181] sd 3:0:0:0: [sdb] 488397168 512-byte logical blocks: (250 GB/232 GiB)
[ 137.176435] sd 3:0:0:0: [sdb] Write Protect is off
[ 137.176443] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 137.176634] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 137.176650] sd 3:0:0:0: Attached scsi generic sg1 type 0
[ 137.221163] sdb: sdb1
[ 137.223369] sd 3:0:0:0: [sdb] Attached SCSI disk
[ 137.570148] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[ 137.570155] SELinux: initialized (dev sdb1, type ext4), uses xattr
------------------------------------------------------------------
Unplug disk

[ 372.979293] ata4: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen
[ 372.979302] ata4: irq_stat 0x00400000, PHY RDY changed
[ 372.979312] ata4: SError: { RecovComm Persist PHYRdyChg 10B8B }
[ 372.979323] ata4: hard resetting link
[ 373.702176] ata4: SATA link down (SStatus 0 SControl 300)
[ 378.702211] ata4: hard resetting link
[ 379.007177] ata4: SATA link down (SStatus 0 SControl 300)
[ 379.007195] ata4: limiting SATA link speed to 1.5 Gbps
[ 384.007200] ata4: hard resetting link
[ 384.312177] ata4: SATA link down (SStatus 0 SControl 310)
[ 384.312194] ata4.00: disabled
[ 384.312220] ata4: EH complete
[ 384.312237] ata4.00: detaching (SCSI 3:0:0:0)
[ 384.314742] sd 3:0:0:0: [sdb] Synchronizing SCSI cache
[ 384.314810] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 384.314820] sd 3:0:0:0: [sdb] Stopping disk
[ 384.314837] sd 3:0:0:0: [sdb] START_STOP FAILED
[ 384.314842] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
-------------------------------------------------------------
Plug in disk again

No change in dmesg !!
No indication that the disk has been plugged in

[ 372.979293] ata4: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen
[ 372.979302] ata4: irq_stat 0x00400000, PHY RDY changed
[ 372.979312] ata4: SError: { RecovComm Persist PHYRdyChg 10B8B }
[ 372.979323] ata4: hard resetting link
[ 373.702176] ata4: SATA link down (SStatus 0 SControl 300)
[ 378.702211] ata4: hard resetting link
[ 379.007177] ata4: SATA link down (SStatus 0 SControl 300)
[ 379.007195] ata4: limiting SATA link speed to 1.5 Gbps
[ 384.007200] ata4: hard resetting link
[ 384.312177] ata4: SATA link down (SStatus 0 SControl 310)
[ 384.312194] ata4.00: disabled
[ 384.312220] ata4: EH complete
[ 384.312237] ata4.00: detaching (SCSI 3:0:0:0)
[ 384.314742] sd 3:0:0:0: [sdb] Synchronizing SCSI cache
[ 384.314810] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 384.314820] sd 3:0:0:0: [sdb] Stopping disk
[ 384.314837] sd 3:0:0:0: [sdb] START_STOP FAILED
[ 384.314842] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
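When comparing logs like the ones above, it can help to reduce them to the last link state the kernel reported for each port. The following is a rough sketch, not a diagnostic tool: the function name last_link_state is made up, and it only looks for the "SATA link up" / "SATA link down" lines in the format shown in this report.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the last
# reported SATA link state ("up" or "down") for each ataN port.
last_link_state() {
    awk '
        /ata[0-9]+: SATA link (up|down)/ {
            for (i = 1; i <= NF; i++) {
                # Find the "ataN:" field followed by "SATA link <state>".
                if ($i ~ /^ata[0-9]+:$/ && $(i + 1) == "SATA" && $(i + 2) == "link") {
                    port = $i
                    sub(/:$/, "", port)
                    state[port] = $(i + 3)
                }
            }
        }
        END {
            # Output order across ports is not defined by awk.
            for (p in state) print p, state[p]
        }
    '
}
```

A typical call would be "dmesg | last_link_state". A port that stays "down" after the disk has been plugged in again matches the failure described in this comment.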
It looks as if this may be my/the problem

http://www.spinics.net/lists/linux-ide/msg43173.html
(Date: Fri, 13 Apr 2012 09:24:07 +0800)

"...
> The fundamental problem with this patch is that all SATA ports are
> hotpluggable... even the ones the firmware/silicon failed to mark as
> hotpluggable via AHCI's PORT_CMD_MPSP | PORT_CMD_HPCP

So the acceptable solution is to add runtime pm support for hotpluggable port.
I'll send new patches.

Thanks,
Lin Ming"
There is more activity on this bug here (it is obviously not closed!)
(I am unsure of the relationship between Red Hat Bugzilla and the people referenced below)

http://www.spinics.net/lists/linux-ide/msg43236.html

To: Mark Lord <kernel@xxxxxxxxxxxx>
Subject: Re: Hotplug borked after suspend/resume in Linux-3.3 ?
From: Jeff Garzik <jgarzik@xxxxxxxxx>
Date: Wed, 18 Apr 2012 02:18:40 -0400
Cc: Lin Ming <ming.m.lin@xxxxxxxxx>, Tejun Heo <htejun@xxxxxxxxx>, linux-ide@xxxxxxxxxxxxxxx
In-reply-to: <4F8E1D0A.8000203>
List-id: <linux-ide.vger.kernel.org>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120329 Thunderbird/11.0.1

On 04/17/2012 09:46 PM, Mark Lord wrote:
On 12-04-17 09:37 PM, Mark Lord wrote:
On 12-04-17 09:29 PM, Lin Ming wrote:

I'm working on the hotplug issue fix.

Before the fix is ready, here is the one-line patch.
Could you give it a try?
..
--- a/drivers/ata/libata-transport.c
+++ b/drivers/ata/libata-transport.c
@@ -294,6 +294,7 @@ int ata_tport_add(struct device *parent,
 	device_enable_async_suspend(dev);
 	pm_runtime_set_active(dev);
 	pm_runtime_enable(dev);
+	pm_runtime_forbid(dev);
..

I'm rebuilding the kernel right now.. should take about 5min or less to test.

Yeah, that (by itself) is enough to make things work again.
This looks like the one-liner that we really need upstream and in -stable.

Jeff?

I'll now use it instead of the (much larger) "v2 disable runtime pm" thing.

Yeah I like that a -whole- lot better... Will push upstream tomorrow.
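For context on what the one-liner changes: pm_runtime_forbid() is also what the kernel calls when "on" is written to a device's power/control file in sysfs ("auto" allows runtime PM again). The sketch below applies that from userspace. It rests on assumptions: the function name forbid_ata_runtime_pm is made up, the ATA port devices are assumed to appear as directories named ataN under /sys/devices, and nobody in this thread has tested whether this works around the hotplug failure on the affected 3.3 kernels.

```shell
#!/bin/sh
# Hypothetical helper: write "on" to the power/control file of every device
# directory named ataN below the given root (default /sys/devices).
# Assumption: ATA port devices show up as .../ataN/ in sysfs.
forbid_ata_runtime_pm() {
    root="${1:-/sys/devices}"
    find "$root" -name control -path '*/power/control' 2>/dev/null |
    grep -E '/ata[0-9]+/power/control$' |
    while IFS= read -r ctl; do
        # "on" keeps the device from being runtime-suspended.
        echo on > "$ctl"
    done
}
```

It needs root, and the setting does not survive a reboot. The patched kernels in this bug make the same call inside ata_tport_add(), so none of this is needed once the fix is installed.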
(In reply to comment #24)
> There is more activity on this bug here (it is obviously not closed!)
> (I am unsure of the relationship between Red Hat Bugzilla and the people
> referenced below)
>
> http://www.spinics.net/lists/linux-ide/msg43236.html
>
> To: Mark Lord <kernel@xxxxxxxxxxxx>
> Subject: Re: Hotplug borked after suspend/resume in Linux-3.3 ?
> From: Jeff Garzik <jgarzik@xxxxxxxxx>
> Date: Wed, 18 Apr 2012 02:18:40 -0400
> Cc: Lin Ming <ming.m.lin@xxxxxxxxx>, Tejun Heo <htejun@xxxxxxxxx>,
> linux-ide@xxxxxxxxxxxxxxx
> In-reply-to: <4F8E1D0A.8000203>
> List-id: <linux-ide.vger.kernel.org>
> User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120329
> Thunderbird/11.0.1
> On 04/17/2012 09:46 PM, Mark Lord wrote:
> On 12-04-17 09:37 PM, Mark Lord wrote:
> On 12-04-17 09:29 PM, Lin Ming wrote:
>
> I'm working on the hotplug issue fix.
>
> Before the fix is ready, here is the one-line patch.
> Could you give it a try?
> ..
> --- a/drivers/ata/libata-transport.c
> +++ b/drivers/ata/libata-transport.c
> @@ -294,6 +294,7 @@ int ata_tport_add(struct device *parent,
> device_enable_async_suspend(dev);
> pm_runtime_set_active(dev);
> pm_runtime_enable(dev);
> + pm_runtime_forbid(dev);

Here is a scratch build with the original patch removed, and the one above added. When it completes, I would appreciate it if you could test it and let me know how it works.

http://koji.fedoraproject.org/koji/taskinfo?taskID=4001505
ja@minix ~ 7$ uname -a
Linux minix 3.3.2-3.1.fc16.x86_64 #1 SMP Wed Apr 18 12:51:06 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

It is now possible to plugin/mount/umount/unplug external eSATAp devices
(I have tried two, SSD and mechanical disks, several times)
as and when required without a reboot being necessary!

Many thanks
John
(In reply to comment #26)
> ja@minix ~ 7$ uname -a
> Linux minix 3.3.2-3.1.fc16.x86_64 #1 SMP Wed Apr 18 12:51:06 UTC 2012 x86_64
> x86_64 x86_64 GNU/Linux
>
> It is now possible to plugin/mount/umount/unplug external eSATAp devices
> (I have tried two, SSD and mechanical disks, several times)
> as and when required without a reboot being necessary!

Great. Thank you for testing. I'll get this patch rolled into the next update later today.
Kernel 3.3.2-3.1.fc16.x86_64 fixes the bug in my case as well; kernel 3.3.1-5.fc16.x86_64 had also worked for me.
kernel-3.3.2-8.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/kernel-3.3.2-8.fc17
kernel-3.3.2-6.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/kernel-3.3.2-6.fc16
kernel-2.6.43.2-6.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/kernel-2.6.43.2-6.fc15
Package kernel-3.3.2-8.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.2-8.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-6344/kernel-3.3.2-8.fc17
then log in and leave karma (feedback).
kernel-3.3.2-8.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.3.2-6.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
kernel-2.6.43.2-6.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.
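To check whether a machine is already running one of the kernels listed above, the running version can be compared against the fixed one. This is a sketch under assumptions: version_at_least is a made-up helper, and it relies on GNU sort -V, which only approximates RPM version ordering (it is good enough to tell 3.3.1-5 from 3.3.2-6, but it is not rpm's own comparison).

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $2 sorts at or after version $1.
# Relies on GNU sort -V, which only approximates RPM version ordering.
version_at_least() {
    min="$1"
    cur="$2"
    # The smaller of the two versions sorts first; if that is $min, $cur is new enough.
    [ "$(printf '%s\n' "$min" "$cur" | sort -V | head -n 1)" = "$min" ]
}
```

On Fedora 16, for example, one could run: version_at_least 3.3.2-6.fc16 "$(uname -r)" && echo "running a kernel with the fix". For Fedora 17 and Fedora 15 the minimum versions would be 3.3.2-8.fc17 and 2.6.43.2-6.fc15 respectively, as announced in the comments above.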