Bug 147870 - O_DIRECT to sparse areas of files give incomplete writes
Summary: O_DIRECT to sparse areas of files give incomplete writes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Larry Woodman
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 168424
Reported: 2005-02-11 22:10 UTC by Bert Barbe
Modified: 2007-11-30 22:07 UTC

Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-15 15:50:32 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2006:0144
Private: 0
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 7
Last Updated: 2006-03-15 05:00:00 UTC

Comment 1 Suzanne Hillman 2005-02-14 16:43:04 UTC
Does this have a corresponding Issue Tracker?

Comment 2 Van Okamura 2005-02-15 20:30:34 UTC
Issue Tracker 66021 filed for this.

Comment 6 Issue Tracker 2005-03-28 16:29:59 UTC
Tried to recreate the issue *without* success on the 2.4.21-27.0.1.ELsmp kernel
with the uploaded test case on the aic7xxx driver. Does this problem only show
up with the cciss driver? Only with the pwrite64() call? Could you modify the
uploaded ptest.c to closely resemble what you have done?

File uploaded: ptest.c

This event sent from IssueTracker by wcheng
 issue 66021
it_file 37308
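
For readers without access to ptest.c, the kind of check under discussion can be sketched as follows. This is not the uploaded test case, whose contents are not shown in this report; check_sparse_write, ALIGNMENT and the 4096-byte alignment are assumptions made for illustration. It writes one aligned buffer into a hole with O_DIRECT, then verifies that the write was complete and reads back intact. A single pass like this may well not trigger the problem, so a real reproducer would presumably loop.

```c
#define _GNU_SOURCE           /* exposes O_DIRECT on Linux */
#define _FILE_OFFSET_BITS 64  /* pwrite() then maps to pwrite64() on 32-bit */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define ALIGNMENT 4096  /* assumption: satisfies the O_DIRECT alignment rules */

/*
 * Hypothetical helper: write len bytes at offset (inside a hole) and read
 * them back. Returns 0 if the write was complete and the data matches,
 * 1 if the write was short or the data differs, -1 on any other error.
 */
int check_sparse_write(const char *path, off_t offset, size_t len)
{
    void *wbuf = NULL, *rbuf = NULL;
    int fd, rc = -1;
    ssize_t n;

    assert(len % ALIGNMENT == 0 && offset % ALIGNMENT == 0);

    fd = open(path, O_CREAT | O_TRUNC | O_RDWR | O_DIRECT, 0600);
    if (fd < 0 && errno == EINVAL)  /* file system without O_DIRECT support */
        fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (posix_memalign(&wbuf, ALIGNMENT, len) != 0 ||
        posix_memalign(&rbuf, ALIGNMENT, len) != 0)
        goto out;
    memset(wbuf, 0xA5, len);
    memset(rbuf, 0, len);

    /* The file is empty, so any offset > 0 leaves a hole before the data. */
    n = pwrite(fd, wbuf, len, offset);
    if (n < 0)
        goto out;
    if ((size_t)n != len) {  /* incomplete write: the symptom in this bug */
        rc = 1;
        goto out;
    }

    n = pread(fd, rbuf, len, offset);
    if (n < 0)
        goto out;
    rc = ((size_t)n == len && memcmp(wbuf, rbuf, len) == 0) ? 0 : 1;
out:
    free(wbuf);
    free(rbuf);
    close(fd);
    return rc;
}
```

A caller might run check_sparse_write("testfile", 1 << 20, 8192) in a loop and stop on the first nonzero result.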

Comment 7 Wendy Cheng 2005-04-01 06:02:30 UTC
Thanks for the new ptest2.c file in the issue tracker. Retrying ...

Comment 8 Wendy Cheng 2005-04-01 06:54:41 UTC
Well, it refuses to happen on my test machine (Pentium III hyperthreaded). I'll
let it loop overnight to see how it goes. If nothing happens, I will move the test
to another, bigger SMP box.

Comment 9 Bert Barbe 2005-04-01 15:15:44 UTC
I couldn't reproduce on a single CPU with IDE or FireWire (which also turns up as
SCSI). I couldn't reproduce on dual CPU with IDE either. So far only on
dual CPU with SCSI. HTH.

Comment 10 Wendy Cheng 2005-04-01 16:28:06 UTC
Yesterday's box is on aic7xxx SCSI (hyperthreaded) - nothing happened. Now the
test is running on a Dell PowerEdge 1600SC (2 CPUs hyperthreaded to 4) on top of
mptscsih SCSI. No news yet.

Comment 16 Wendy Cheng 2005-06-15 17:13:34 UTC
The IT ticket has been escalated to the engineering team. There are three issues:

1. do_generic_direct_write doesn't pass the correct error and write status
back to the caller. We will repackage what we have found so far as a patch.
This work is more or less complete.
2. When do_generic_direct_write returns an error, we need to fall back to a
buffered write. The current buffered write code path doesn't pass the correct error
(if there is one) back to the caller. This is bugzilla 116900.
3. Bert found another issue (ENOSPC) that was documented in the IT ticket:
"About the buffered write: I found that prepare_write is returning -28 (-ENOSPC)
while there's plenty of room left on the fs. It seems that the most probable
cause of this is ext3_get_block, which calls ext3_new_block, which returns -ENOSPC
as a default error. Probably somewhere in there ENOSPC is masking something that
should have been EAGAIN? Perhaps a possible solution would be to check the
available space upon encountering -ENOSPC and retry the operation if there is
enough space available? Looking at the 2.6 code I see that ext3_prepare_write
has some additional code there to do such retries; perhaps this can be
backported to the rhel3 kernel?"
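
The behaviour the three items ask for can be modelled in user space. This is only a sketch of the control flow, not the RHEL3 patch or any kernel code: struct write_ops, write_with_fallback, MAX_RETRIES and the fake_* back ends are hypothetical names invented for illustration. The direct write is tried first, the remainder falls back to the buffered path, -ENOSPC is retried a bounded number of times while free space is reported, and the caller receives either the byte count or the real error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_RETRIES 3  /* assumption: small fixed bound on ENOSPC retries */

/* Hypothetical back ends: return bytes written (possibly short) or -errno. */
struct write_ops {
    ssize_t (*direct)(void *ctx, size_t off, size_t len);
    ssize_t (*buffered)(void *ctx, size_t off, size_t len);
    int (*space_left)(void *ctx);  /* nonzero if free space is available */
};

ssize_t write_with_fallback(const struct write_ops *ops, void *ctx, size_t len)
{
    size_t written = 0;
    ssize_t err = 0;
    int retries = 0;

    /* Try the direct path; keep whatever it managed to write. */
    ssize_t n = ops->direct(ctx, 0, len);
    if (n > 0)
        written = (size_t)n;

    /* Finish the remainder (or everything, on error) through the buffered path. */
    while (written < len) {
        n = ops->buffered(ctx, written, len - written);
        if (n > 0) {
            written += (size_t)n;
            continue;
        }
        if (n == -ENOSPC && retries++ < MAX_RETRIES && ops->space_left(ctx))
            continue;  /* ENOSPC looked transient: try the allocation again */
        err = (n < 0) ? n : -EIO;  /* treat a zero-byte write as an error */
        break;
    }
    /* Report partial progress if there was any, otherwise the real error. */
    return written > 0 ? (ssize_t)written : err;
}

/* Fake file system used only to exercise the logic above. */
struct fake_fs {
    size_t direct_limit;  /* direct path writes at most this many bytes */
    int enospc_left;      /* buffered path fails this many times first */
};

static ssize_t fake_direct(void *ctx, size_t off, size_t len)
{
    struct fake_fs *fs = ctx;
    (void)off;
    return (ssize_t)(len < fs->direct_limit ? len : fs->direct_limit);
}

static ssize_t fake_buffered(void *ctx, size_t off, size_t len)
{
    struct fake_fs *fs = ctx;
    (void)off;
    if (fs->enospc_left > 0) {
        fs->enospc_left--;
        return -ENOSPC;
    }
    return (ssize_t)len;
}

static int fake_space_left(void *ctx)
{
    (void)ctx;
    return 1;  /* the fake always claims there is room */
}

static const struct write_ops fake_ops = {
    fake_direct, fake_buffered, fake_space_left
};
```

With a struct fake_fs whose direct_limit is 512 and enospc_left is 2, write_with_fallback(&fake_ops, &fs, 4096) completes the whole write after two retries. If the buffered path keeps failing and nothing was written, the caller sees -ENOSPC rather than a silent short write.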


Comment 30 Ernie Petrides 2005-10-10 23:15:07 UTC
Bert, does this bugzilla need to remain private?  If not, please uncheck
the "Oracle Confidential Group" box below.  Thanks.



Comment 36 Ernie Petrides 2005-10-22 00:54:26 UTC
Making this bug public as a whole, but marking the initial comment private
(in lieu of Bert's response to comment #30).

Comment 43 Ernie Petrides 2005-11-30 07:28:40 UTC
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.12.EL).


Comment 48 Red Hat Bugzilla 2006-03-15 15:50:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html


