Bug 101555 - Kernel doesn't report write errors to applications
Summary: Kernel doesn't report write errors to applications
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 8.0
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-08-03 17:43 UTC by yuval yeret
Modified: 2005-10-31 22:00 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:41:23 UTC
Embargoed:


Attachments
rtest is an easy test to reproduce the problem. (440.16 KB, application/octet-stream)
2003-08-03 17:47 UTC, yuval yeret

Description yuval yeret 2003-08-03 17:43:49 UTC
Description of problem:

Writes to a block device, or to a filesystem mounted on one, can fail if the 
block device suddenly becomes unreachable (e.g. link loss to a RAID). The kernel 
should not acknowledge synced writes from userspace in this situation, but 
surprisingly that is exactly what it does. 

Version-Release number of selected component (if applicable):
2.4.18-24 Red Hat errata kernel


How reproducible:
Every time. 


Steps to Reproduce:
1. From userspace code, do synced writes to a block device file (e.g. /dev/sdd) 
that is actually a RAID LUN or a disconnectable SCSI disk. Use strace to track 
the write progress.
2. Disconnect the cable or the disk.
3. Keep tracking the write progress.
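
For illustration only, the synced-write loop in step 1 might look like the following Python sketch. It is not the attached rtest tool; the function name synced_writes, the block size, and the fill pattern are assumptions made for this example. Note that it overwrites the start of whatever path it is given, so it should only be pointed at a scratch device or file.

```python
import os

def synced_writes(path, count, block_size=4096):
    """Write `count` blocks to `path` with O_SYNC, stopping at the first error.

    Returns (blocks_written, error), where error is None or the OSError raised.
    The name and parameters are hypothetical, chosen for this sketch.
    """
    data = b"\xa5" * block_size
    # O_SYNC asks the kernel to return from write() only once the data
    # has reached the device, or the write has failed.
    fd = os.open(path, os.O_WRONLY | os.O_SYNC)
    written = 0
    try:
        for i in range(count):
            try:
                n = os.write(fd, data)
            except OSError as err:
                # This is the branch the report expects to be taken
                # once the device has been disconnected.
                print(f"write {i}: failed, errno={err.errno} ({err.strerror})")
                return written, err
            print(f"write {i}: ok, {n} bytes")
            written += 1
    finally:
        os.close(fd)
    return written, None
```

Running it under strace, for example against /dev/sdd with count=1000, and then pulling the cable shows whether later write() calls keep returning success.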
    
Actual results:
The last write that started before the disconnection hangs for some time. 
After a while that write returns OK, and all subsequent writes return OK as 
well.
After rebooting the machine, the data is of course missing.

Expected results:
1. The first write that cannot complete because of the disconnection should hang 
forever (the process should stay in D state until a hard kill). 
This is how the SuSE 2.4.20 kernel behaves.
2. Better: return EIO or another error code to userspace and let the application 
handle or report the error as it sees fit. 
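
As a rough sketch of option 2 from the application's side, a program could wrap each write and flush and then decide what to do with the reported error code. The helper name checked_write is an assumption for this example, and the sketch only works if the kernel actually surfaces the failure (for instance as EIO), which is the behavior this report asks for.

```python
import errno
import os

def checked_write(fd, data):
    """Write `data` to `fd` and flush it to the device.

    Returns None on success, or the symbolic errno name (e.g. "EIO")
    if the kernel reports a failure. Hypothetical helper for illustration.
    """
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            n = os.write(fd, view)
            view = view[n:]
        # fsync forces the data out; a write-back error can surface here.
        os.fsync(fd)
    except OSError as err:
        return errno.errorcode.get(err.errno, str(err.errno))
    return None
```

A caller that gets back a non-None value can log it, retry, or abort, instead of assuming the data is safely on disk.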

Additional info:
This was discussed on the linux-kernel mailing list some time ago:
[PATCH 2.4] Report write errors to applications - 
http://lists.insecure.org/lists/linux-kernel/2003/Jan/7178.html (contains a 
suggested patch against Marcelo's 2.4.21pre3 kernel)

http://www.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.21pre5aa1/9999_fsync-msync-async-errors-1 
- contains a patch for the aa kernel

Comment 1 yuval yeret 2003-08-03 17:47:45 UTC
Created attachment 93368
rtest is an easy test to reproduce the problem. 

Use synced writes.

Example usage:

rtest -filename=/dev/sdd -count=1000 -sync=1

Comment 2 Bugzilla owner 2004-09-30 15:41:23 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


