Red Hat Bugzilla – Bug 101555
Kernel doesn't report write errors to applications
Last modified: 2005-10-31 17:00:50 EST
Description of problem:
Writes to a block device, or to a filesystem mounted on one, can fail if the
block device suddenly becomes unreachable (e.g. link loss to a RAID). The kernel
should not acknowledge synced userspace writes in this situation, but
surprisingly that's exactly what it does.
Version-Release number of selected component (if applicable):
2.4.18-24 Red Hat errata kernel
Steps to Reproduce:
1. From userspace code, do synced writes to a block device file (e.g. /dev/sdd)
that is actually a RAID LUN / disconnectable SCSI disk. Use strace to track
the write progress.
2. Disconnect the cable / disk.
3. Keep tracking the write progress.
Actual results:
The last write that started before the disconnection hangs for some time.
After a while that write returns OK, and all writes after that return OK as
well. After rebooting the machine, the data is of course missing.
Expected results:
1. The first write that cannot complete due to the disconnection should hang
forever (the process should stay in D state until a hard kill).
This is how the SuSE 2.4.20 kernel behaves.
2. Better: return EIO or another error code to userspace and let it
handle/report the error as it sees fit.
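
Option 2 above can be illustrated from the application side. The following is a minimal sketch, not the attached rtest program: the function name synced_writes, the 512-byte block size and the O_CREAT flag (included only so the sketch can be tried on a regular file) are assumptions made for illustration. It shows what a program doing synced writes can check, which only helps if the kernel actually reports the failure.

```python
import errno
import os


def synced_writes(path, count, block=b"\0" * 512):
    """Write `count` blocks to `path` with O_SYNC; return how many succeeded.

    Hypothetical helper for illustration. Any failure the kernel reports
    (for example errno.EIO) propagates to the caller as OSError.
    """
    # O_SYNC asks the kernel to return from write() only once the data has
    # reached the underlying device, as far as the kernel can tell.
    # O_CREAT is an assumption so the sketch also works on a regular file.
    fd = os.open(path, os.O_WRONLY | os.O_SYNC | os.O_CREAT, 0o600)
    done = 0
    try:
        for _ in range(count):
            written = os.write(fd, block)
            if written != len(block):
                # This sketch treats a short write as an I/O error.
                raise OSError(errno.EIO, "short write", path)
            done += 1
        # fsync() gives the kernel another chance to report a deferred error.
        os.fsync(fd)
    finally:
        os.close(fd)
    return done
```

A caller would wrap synced_writes("/dev/sdd", 1000) in try/except OSError and inspect the exception's errno attribute. On the kernel described in this report no exception would be raised after the disconnect, because every write() returns success.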
This was discussed on the linux-kernel mailing list some time ago:
[PATCH 2.4] Report write errors to applications -
suggested patch for Marcelo's 2.4.21pre3 kernel
9999_fsync-msync-async-errors-1 - contains the patch for the aa kernel
Created attachment 93368
rtest is a simple test that reproduces the problem.
To use synced writes:
rtest -filename=/dev/sdd -count=1000 -sync=1
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the
problem persists.
The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases,
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/