Red Hat Bugzilla – Bug 147870
O_DIRECT to sparse areas of files give incomplete writes
Last modified: 2007-11-30 17:07:06 EST
Does this have a corresponding Issue Tracker?
Issue Tracker 66021 filed for this.
Tried to recreate the issue *without* success on the 2.4.21-27.0.1.ELsmp kernel
with the uploaded test case on the aic7xxx driver. Did this problem only show
up with the cciss driver? Only with the pwrite64() call? Could you modify the
uploaded ptest.c to closely resemble what you have done?
File uploaded: ptest.c
This event sent from IssueTracker by wcheng
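For readers without access to the attachment, the kind of check discussed here can be sketched as follows. This is not the uploaded ptest.c; it is a minimal illustration that assumes a Linux system, a 4096-byte alignment and transfer size, and a hypothetical helper name (check_sparse_direct_write). It opens a fresh file with O_DIRECT, writes one aligned block at an offset past end-of-file (so the write lands in a sparse area), and reports whether pwrite() returned the full length.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define IO_SIZE 4096           /* assumed alignment and transfer size */

/*
 * Hypothetical helper, not taken from ptest.c.
 * Creates `path`, writes one aligned block with O_DIRECT at `hole_offset`
 * (which should be a multiple of IO_SIZE), then removes the file.
 * Returns 0 for a complete write, 1 for a short write,
 * and -1 on error with errno set.
 */
int check_sparse_direct_write(const char *path, off_t hole_offset)
{
    void *buf = NULL;
    ssize_t n;
    int fd, saved_errno;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd < 0)
        return -1;             /* e.g. EINVAL where O_DIRECT is unsupported */

    /* O_DIRECT needs an aligned user buffer. */
    if (posix_memalign(&buf, IO_SIZE, IO_SIZE) != 0) {
        close(fd);
        unlink(path);
        errno = ENOMEM;
        return -1;
    }
    memset(buf, 0xAB, IO_SIZE);

    /* The file is empty, so any offset > 0 targets a sparse region. */
    n = pwrite(fd, buf, IO_SIZE, hole_offset);
    saved_errno = errno;

    free(buf);
    close(fd);
    unlink(path);

    if (n < 0) {
        errno = saved_errno;
        return -1;
    }
    return n == IO_SIZE ? 0 : 1;
}
```

A return value of 1 would correspond to the incomplete write described in this bug. To hunt for the race, one would presumably call this in a loop from several processes on an SMP box. Building with -D_FILE_OFFSET_BITS=64 makes pwrite() take a 64-bit offset, which is what pwrite64() provides.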
Thanks for the new ptest2.c file in the issue tracker. Retrying ...
Well, it refuses to happen on my test machine (Pentium III hyperthreaded). I'll
let it loop overnight to see how it goes. If nothing happens, will move the test
to another bigger smp box.
I couldn't reproduce on single cpu with ide or firewire (which also turns up as
scsi). I couldn't reproduce on dual cpu with ide either. So far only on
dual cpu with scsi. HTH.
Yesterday's box is on aic7xxx SCSI (hyperthreaded) - nothing happened. Now the
test is running on Dell PowerEdge 1600SC (2 cpus hyperthreaded to 4) on top of
mptscsih scsi. No news yet.
IT ticket escalated into engineering team. Three issues:
1. do_generic_direct_write doesn't pass the correct error and write status
back to the caller. Will re-package what we have found so far into a patch
format. This work is more or less complete.
2. Upon errors returned by do_generic_direct_write, we need to fall back to
buffered write. The current buffered write code path doesn't pass the correct
error (if there is one) back to the caller. This is bugzilla 116900.
3. Bert found another issue (ENOSPC) that was documented in the IT ticket:
"about the buffered write: I found that prepare_write is returning -28
(-ENOSPC), while there's plenty of room left on the fs. It seems that the most
probable cause of this is ext3_get_block, which calls ext3_new_block, which
returns -ENOSPC as a default error. Probably somewhere in there ENOSPC is
masking something that should have been EAGAIN? Perhaps a possible solution
would be to check the available space upon encountering -ENOSPC and retry the
operation if there is enough space available? Looking at the 2.6 code I see
that ext3_prepare_write has some additional code there to do such retries;
perhaps this can be backported to the rhel3 kernel?"
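To make issues 1 and 2 concrete, here is a small user-space model of the intended behaviour. It is an illustration only, not the committed patch and not kernel code: write_fn, write_with_fallback and the stub_* writers are hypothetical names, and the writers follow the kernel convention of returning a byte count or a negative errno value. The direct path is tried first; whatever it did not write is handed to the buffered path, and the caller gets back either the real byte count or the real error.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical writer: returns bytes written, or a negative errno value. */
typedef ssize_t (*write_fn)(const char *buf, size_t len, off_t off);

/*
 * Try the direct path first. On an error or a short write, pass the
 * remainder to the buffered path. Partial progress is reported as a
 * byte count; an error is returned only when nothing was written.
 */
ssize_t write_with_fallback(write_fn direct, write_fn buffered,
                            const char *buf, size_t len, off_t off)
{
    ssize_t done = direct(buf, len, off);
    ssize_t rest;

    if (done < 0)
        done = 0;              /* direct path failed: nothing written yet */
    if ((size_t)done >= len)
        return done;           /* direct path wrote everything */

    rest = buffered(buf + done, len - (size_t)done, off + done);
    if (rest < 0)
        return done > 0 ? done : rest;   /* keep progress, else the error */
    return done + rest;
}

/* Stub writers used only to exercise the logic above. */
ssize_t stub_direct_half(const char *buf, size_t len, off_t off)
{
    (void)buf; (void)off;
    return (ssize_t)(len / 2);           /* simulates a short direct write */
}

ssize_t stub_direct_fail(const char *buf, size_t len, off_t off)
{
    (void)buf; (void)len; (void)off;
    return -EINVAL;
}

ssize_t stub_buffered_ok(const char *buf, size_t len, off_t off)
{
    (void)buf; (void)off;
    return (ssize_t)len;
}

ssize_t stub_buffered_enospc(const char *buf, size_t len, off_t off)
{
    (void)buf; (void)len; (void)off;
    return -ENOSPC;
}
```

With these stubs, a short direct write followed by a working buffered write yields the full length, while a failed direct write followed by a buffered -ENOSPC surfaces -ENOSPC to the caller instead of a silent short count. The ENOSPC retry suggested in item 3 is not modelled here.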
Bert, does this bugzilla need to remain private? If not, please uncheck
the "Oracle Confidential Group" box below. Thanks.
Making this bug public as a whole, but marking initial comment private
(in lieu of Bert's response to comment #30).
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.12.EL).
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.