Does this have a corresponding Issue Tracker?
Issue Tracker 66021 filed for this.
From User-Agent: XML-RPC Tried to reproduce the issue *without* success on the 2.4.21-27.0.1.ELsmp kernel with the uploaded test case on the aic7xxx driver. Did this problem only show up with the cciss driver? Only with the pwrite64() call? Could you modify the uploaded ptest.c to closely resemble what you have done? File uploaded: ptest.c This event sent from IssueTracker by wcheng issue 66021 it_file 37308
Thanks for the new ptest2.c file in the issue tracker. Retrying ...
Well, it refuses to happen on my test machine (Pentium III hyperthreaded). I'll let it loop overnight to see how it goes. If nothing happens, will move the test to another bigger smp box.
I couldn't reproduce on single cpu with ide or firewire (which also shows up as scsi). I couldn't reproduce on dual cpu with ide either. So far only on dual cpu with scsi. HTH.
Yesterday's box is on aic7xxx SCSI (hyperthreaded) - nothing happened. Now the test is running on Dell PowerEdge 1600SC (2 cpus hyperthreaded to 4) on top of mptscsih scsi. No news yet.
IT ticket escalated to the engineering team. Three issues:
1. do_generic_direct_write doesn't pass the correct error and write status back to the caller. Will re-package what we have found so far into a patch. This work is more or less complete.
2. Upon errors returned by do_generic_direct_write, we need to fall back to buffered write. The current buffered write code path doesn't pass the correct error (if there is one) back to the caller. This is bugzilla 116900.
3. Bert found another issue (ENOSPC) that was documented in the IT ticket: "about the buffered write: I found that prepare_write is returning -28 (-ENOSPC), while there's plenty of room left on the fs. It seems that the most probable cause of this is ext3_get_block, which calls ext3_new_block, which returns -ENOSPC as a default error. Probably somewhere in there ENOSPC is masking something that should have been EAGAIN? Perhaps a possible solution would be to check the available space upon encountering -ENOSPC and retry the operation if there is enough space available? Looking at the 2.6 code I see that ext3_prepare_write has some additional code there to do such retries; perhaps this can be backported to the rhel3 kernel?"
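For illustration, the retry idea in point 3 (check the available space on -ENOSPC and retry if there is room) could be sketched roughly as below. This is a userspace illustration only, not the ext3 code and not the eventual patch: struct fake_fs, fake_alloc_block() and alloc_block_with_retry() are hypothetical names, and the bounded retry count is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a filesystem: a free block count plus a number
 * of spurious -ENOSPC results the allocator will still report. */
struct fake_fs {
    long free_blocks;        /* blocks that are really available */
    int  transient_failures; /* pending spurious -ENOSPC results */
};

/* Hypothetical allocator: returns 0 on success or a negative errno. */
static int fake_alloc_block(struct fake_fs *fs)
{
    if (fs->free_blocks <= 0)
        return -ENOSPC;              /* genuinely out of space */
    if (fs->transient_failures > 0) {
        fs->transient_failures--;
        return -ENOSPC;              /* spurious: space is available */
    }
    fs->free_blocks--;
    return 0;
}

/* Retry the allocation only while -ENOSPC is contradicted by the free
 * space count, and give up after max_retries so the loop is bounded.
 * Any other error is passed back to the caller unchanged. */
static int alloc_block_with_retry(struct fake_fs *fs, int max_retries)
{
    int retries = 0;
    int err;

    assert(max_retries >= 0);
    for (;;) {
        err = fake_alloc_block(fs);
        if (err != -ENOSPC)
            return err;              /* success or an unrelated error */
        if (fs->free_blocks <= 0 || retries >= max_retries)
            return err;              /* really full, or retries exhausted */
        retries++;
    }
}
```

With a fake_fs holding 4 free blocks and 2 pending spurious failures, alloc_block_with_retry(&fs, 3) succeeds on the third attempt; with 0 free blocks it returns -ENOSPC immediately without retrying.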
Bert, does this bugzilla need to remain private? If not, please uncheck the "Oracle Confidential Group" box below. Thanks.
Making this bug public as a whole, but marking initial comment private (in lieu of Bert's response to comment #30).
A fix for this problem was committed to the RHEL3 U7 patch pool this evening (in kernel version 2.4.21-37.12.EL).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0144.html