Bug 433287

Summary: Failing to wake up sleepy usb disk
Product: Red Hat Enterprise Linux 4
Reporter: Aleksandar Milivojevic <alex>
Component: kernel
Assignee: Pete Zaitcev <zaitcev>
Status: CLOSED WONTFIX
QA Contact: Martin Jenner <mjenner>
Severity: medium
Priority: low
Version: 4.6
Keywords: Reopened
Target Milestone: rc
Target Release: ---
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Clone Of:
: 480389
Last Closed: 2009-01-16 20:02:49 UTC

Description Aleksandar Milivojevic 2008-02-18 12:46:49 UTC
Description of problem:
I have an external USB hard drive, a Seagate FreeAgent Pro 500GB, with a single
ext3 formatted partition.  The drive is factory configured to take a nap (spin
down) after a set period of inactivity.  Normally, any activity should spin the
hard drive up.  This works fine on Windows, MacOS X, and with a couple of routers
that support sharing USB storage that I tried out.  No special drivers, no
special software.

However, on Red Hat Enterprise Linux, bad things happen if the hard drive spins
itself down due to inactivity.  If the file system was not mounted, it will fail
to mount.  If the file system was mounted, on the next file system access the
kernel reports the disk as unresponsive and remounts the file system read-only;
any access (read or write) to it fails with errors, and the disk stays in sleep
mode (does not spin up).

If I run fdisk on the USB hard drive while it is in sleep mode, the disk spins up
and fdisk works (no errors in the kernel log or anywhere else).  Just as it
should.  I guess whatever happens when fdisk is run should happen on normal file
system access too (but it doesn't).

Here are some logs:

Feb 18 04:38:39 toporko kernel: Device sdb not ready.
Feb 18 04:38:39 toporko kernel: end_request: I/O error, dev sdb, sector 13215
Feb 18 04:38:39 toporko kernel: Device sdb not ready.
Feb 18 04:38:39 toporko kernel: end_request: I/O error, dev sdb, sector 13231
Feb 18 04:38:39 toporko kernel: Buffer I/O error on device sdb1, logical block 1646
Feb 18 04:38:39 toporko kernel: lost page write due to I/O error on sdb1
Feb 18 04:38:39 toporko kernel: Aborting journal on device sdb1.
Feb 18 04:38:59 toporko kernel: ext3_abort called.
Feb 18 04:38:59 toporko kernel: EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
Feb 18 04:38:59 toporko kernel: Remounting filesystem read-only
Feb 18 04:39:09 toporko kernel: Device sdb not ready.
Feb 18 04:39:09 toporko kernel: end_request: I/O error, dev sdb, sector 279445583
Feb 18 04:39:09 toporko kernel: Buffer I/O error on device sdb1, logical block 34930690
Feb 18 04:39:09 toporko kernel: lost page write due to I/O error on sdb1

The ugly workaround I'm currently using is to run "while true; do touch /foo/bar;
sleep 60; done" whenever I mount the external drive, to prevent it from taking a
nap.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Needed: USB hard drive that likes to nap (in my case, Seagate FreeAgent Pro
500GB)
2. Create ext3 file system on it
3. Wait some time until disk spins down
4. If file system was not yet mounted, mount will fail
5. If file system was mounted, kernel will report disk as unresponsive and
remount file system read-only, the drive will not spin up, and any access to it
will fail
  
Actual results:
Drive does not spin up on access.

Expected results:
Drive should spin up on access (just as it does when connected to anything other
than Linux box).

Additional info:

Comment 1 Aleksandar Milivojevic 2008-02-18 13:37:26 UTC
After reporting the bug, I found this page:

http://osdir.com/ml/linux.kernel.firewire.user/2006-09/msg00008.html

It looks like there was a fix for a very similar problem with an IEEE-1394
device in the upstream kernel:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e31f59ce593b073ee14241781edfb0637697eeb6

Maybe my USB device needs something similar...


Comment 2 Aleksandar Milivojevic 2008-02-20 02:28:13 UTC
Well...  two things...  bad news and good news.

The bad news is that touching an empty file does not prevent the disk from
spinning down.

The good news is that a fix exists in the upstream kernel.  The allow_restart
parameter needs to be set to 1 for the device, as explained on this page:

http://www.nslu2-linux.org/wiki/FAQ/DealWithAutoSpinDownOnSeagateFreeAgent

IMO, this should probably be the default for USB/FireWire connected devices, and
it probably wouldn't hurt if it were the default for all devices.

Could this feature be backported to RHEL4?  Seagate FreeAgent drives are not the
only drives affected.  I saw reports from people owning Western Digital drives
having the same problems with their hard drives spinning down (and it is not
possible to disable it on those either).

It would be nice if the RHEL4 kernel would show some love for our stubborn USB
hard drives :-)

Comment 3 Aleksandar Milivojevic 2008-02-22 07:20:04 UTC
Instead of waiting to see whether the allow_restart option can be backported to
the 2.6.9 kernel, I've upgraded my box.  The fix described on the page I quoted
seems to work nicely in the 2.6.18 kernel.  I've created a udev rule that says:

SUBSYSTEM=="scsi_disk",DRIVER=="sd",SYSFS{vendor}=="Seagate",SYSFS{model}=="FreeAgent Pro",RUN+="/usr/local/bin/freeagent %k"

which runs a simple "freeagent" script when the device is attached:

#!/bin/sh
# $1 is the kernel name of the scsi_disk device, passed by udev as %k.
echo 1 > "/sys/class/scsi_disk/${1}/allow_restart"

Works perfectly.
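
For anyone adapting the script above, here is a slightly more defensive variant. It is only a sketch, not the script I actually run: it assumes the same sysfs path as above, the function name set_allow_restart is made up, and SYSFS_ROOT is a hypothetical override that exists purely so the logic can be tried outside of /sys. It refuses to write when the allow_restart attribute is missing, which is what you would expect on kernels that lack the option.

```sh
#!/bin/sh
# Sketch of a more defensive "freeagent" helper (illustrative only).
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.

set_allow_restart() {
    dev="$1"
    attr="${SYSFS_ROOT:-/sys}/class/scsi_disk/${dev}/allow_restart"

    # The attribute is absent on kernels that do not support allow_restart,
    # so report the problem instead of failing silently.
    if [ ! -w "$attr" ]; then
        echo "freeagent: $attr not found or not writable" >&2
        return 1
    fi

    # Allow the SCSI layer to restart (spin up) the device when needed.
    echo 1 > "$attr"
}

# udev passes the device's kernel name (%k) as the first argument.
if [ "$#" -ge 1 ]; then
    set_allow_restart "$1"
fi
```

Installed as /usr/local/bin/freeagent, it would be invoked by the same udev rule, with %k as its only argument.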

Still, it would be nice if there were a solution for folks out there with RHEL4
on their boxes.

Apparently, allow_restart=1 is going to become the default for all USB storage
devices in the 2.6.24 kernel.  It would be nice if it were the default on
RHEL4/5.

Comment 4 Prarit Bhargava 2008-05-08 12:28:00 UTC
Please be advised that Bugzilla is not a support tool. It is an
Engineering and Community tool. So although all changes to Enterprise
Linux go through Bugzilla and Red Hat considers issues directly entered
into Bugzilla valuable feedback, there is no SLA around it.

If this is a production issue, please report it to your Red Hat Support contact.

Thank you.


Comment 5 Aleksandar Milivojevic 2008-05-08 15:39:17 UTC
No, it's not a production issue.  It's feedback, and I'll let you decide if it's valuable or not.

There's a trivial fix for this in the upstream kernel that sets allow_restart to 1 by default for USB attached
storage.  I believe the fix was introduced in 2.6.24-rc4.  It was just a couple of lines of code, and it should
be trivial to backport to RHEL5 and possibly to RHEL4 too.  I can't find the exact patch right now, but if you
guys are interested, I'll try to dig it out.

In the meantime (or if you guys decide not to backport kernel fix), I hope that the above user-space 
workaround for RHEL5 will be helpful to your customers that run into the same problem.

Comment 6 Aleksandar Milivojevic 2008-05-08 18:04:53 UTC
Ok, found the upstream patch.  It is exactly one line of code:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.24.y.git;a=commitdiff;h=f09e495df27d80ae77005ddb2e93df18ec24d04a;hp=9e3285dba5cac12d656da66fd7d420ff1bc0ecc0


Comment 7 Pete Zaitcev 2009-01-16 20:02:49 UTC
I'm going to close this, since we're folding the discretionary RHEL 4
development after 4.8.

Comment 9 Aleksandar Milivojevic 2009-01-17 00:22:30 UTC
What about RHEL 5 (which is also affected) and beyond?

Comment 10 Pete Zaitcev 2009-01-17 00:56:30 UTC
For RHEL 5, you're on cc for bug 480389. I may get it in under the
20% rule, stay tuned.

For the beyond, the fix is in Linus' tree, so no problem.