Bug 147870 - O_DIRECT to sparse areas of files give incomplete writes
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: Larry Woodman
QA Contact: Brian Brock
Depends On:
Blocks: 168424
Reported: 2005-02-11 17:10 EST by Bert Barbe
Modified: 2007-11-30 17:07 EST (History)

See Also:
Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Last Closed: 2006-03-15 10:50:32 EST


Comment 1 Suzanne Hillman 2005-02-14 11:43:04 EST
Does this have a corresponding Issue Tracker?
Comment 2 Van Okamura 2005-02-15 15:30:34 EST
Issue Tracker 66021 filed for this.
Comment 6 Issue Tracker 2005-03-28 11:29:59 EST
From User-Agent: XML-RPC

Tried to recreate the issue, *without* success, on the 2.4.21-27.0.1.ELsmp kernel
using the uploaded test case with the aic7xxx driver. Does this problem show up
only with the cciss driver? Only with the pwrite64() call? Could you modify the
uploaded ptest.c so that it closely resembles what you have done?

File uploaded: ptest.c

This event sent from IssueTracker by wcheng
 issue 66021
it_file 37308
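
The uploaded ptest.c is not reproduced in this report. As an illustration only, a check for incomplete O_DIRECT writes into a hole might look like the sketch below. The function names, the 4096-byte alignment, and the fill byte are assumptions and are not taken from ptest.c. The thread suggests the failure was seen only on SMP machines with SCSI storage, so a single call like this is not expected to trigger the bug by itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK 4096             /* assumed alignment; the real requirement depends on device and fs */

/*
 * Write `len` bytes of 0xAB at `offset` in `path` through O_DIRECT.
 * If the file is shorter than `offset`, the write lands in a sparse area.
 * Returns the pwrite() result: bytes written, or -1 with errno set.
 */
ssize_t direct_write_at(const char *path, off_t offset, size_t len)
{
    /* O_DIRECT needs aligned offset, length, and buffer. */
    assert(offset % BLOCK == 0 && len % BLOCK == 0);

    void *buf = NULL;
    if (posix_memalign(&buf, BLOCK, len) != 0)
        return -1;
    memset(buf, 0xAB, len);

    int fd = open(path, O_CREAT | O_WRONLY | O_DIRECT, 0644);
    if (fd < 0) {
        int saved = errno;     /* e.g. EINVAL where O_DIRECT is unsupported */
        free(buf);
        errno = saved;
        return -1;
    }

    ssize_t n = pwrite(fd, buf, len, offset);
    int saved = errno;
    close(fd);
    free(buf);
    errno = saved;
    return n;
}

/* Returns 1 if the write was complete, 0 if it was short, -1 on error. */
int check_sparse_direct_write(const char *path, off_t offset, size_t len)
{
    ssize_t n = direct_write_at(path, offset, len);
    if (n < 0)
        return -1;
    return (size_t)n == len ? 1 : 0;
}
```

A caller could, for example, invoke check_sparse_direct_write("testfile", 1 << 20, 4 * BLOCK) in a loop from several processes and report any call that returns 0.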
Comment 7 Wendy Cheng 2005-04-01 01:02:30 EST
Thanks for the new ptest2.c file in the issue tracker. Retrying ...
Comment 8 Wendy Cheng 2005-04-01 01:54:41 EST
Well, it refuses to happen on my test machine (Pentium III hyperthreaded). I'll
let it loop overnight to see how it goes. If nothing happens, I will move the test
to a bigger SMP box.
Comment 9 Bert Barbe 2005-04-01 10:15:44 EST
I couldn't reproduce on a single CPU with IDE or FireWire (which also shows up as
SCSI). I couldn't reproduce on dual CPU with IDE either. So far only on
dual CPU with SCSI. HTH.
Comment 10 Wendy Cheng 2005-04-01 11:28:06 EST
Yesterday's box is on aic7xxx SCSI (hyperthreaded) - nothing happened. Now the
test is running on a Dell PowerEdge 1600SC (2 CPUs hyperthreaded to 4) on top of
mptscsih SCSI. No news yet.
Comment 16 Wendy Cheng 2005-06-15 13:13:34 EDT
The IT ticket has been escalated to the engineering team. There are three issues:

1. do_generic_direct_write doesn't pass the correct error and write status
back to the caller. We will repackage what we have found so far as a patch.
This work is more or less complete.
2. When do_generic_direct_write returns an error, we need to fall back to a
buffered write. The current buffered write code path doesn't pass the correct
error (if there is one) back to the caller. This is bugzilla 116900.
3. Bert found another issue (ENOSPC), documented in the IT ticket:
"about the buffered write: I found that prepare_write is returning -28 (-ENOSPC),
while there's plenty of room left on the fs. It seems that the most probable
cause of this is ext3_get_block, which calls ext3_new_block, which returns -ENOSPC
as a default error. Probably somewhere in there ENOSPC is masking something that
should have been EAGAIN? Perhaps a possible solution would be to check the
available space upon encountering -ENOSPC and retry the operation if there is
enough space available? Looking at the 2.6 code I see that ext3_prepare_write
has some additional code there to do such retries; perhaps this can be
backported to the rhel3 kernel?"
Comment 30 Ernie Petrides 2005-10-10 19:15:07 EDT
Bert, does this bugzilla need to remain private?  If not, please uncheck
the "Oracle Confidential Group" box below.  Thanks.

Comment 36 Ernie Petrides 2005-10-21 20:54:26 EDT
Making this bug public as a whole, but marking the initial comment private
(in lieu of Bert's response to comment #30).
Comment 43 Ernie Petrides 2005-11-30 02:28:40 EST
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.12.EL).
Comment 48 Red Hat Bugzilla 2006-03-15 10:50:32 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html
