Bug 137194
Summary: | NFS short writes cause file corruption with NFS O_DIRECT
---|---
Product: | Red Hat Enterprise Linux 3
Component: | kernel
Version: | 3.0
Hardware: | All
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | medium
Priority: | medium
Reporter: | Chuck Lever <cel>
Assignee: | Steve Dickson <steved>
QA Contact: | Brian Brock <bbrock>
CC: | greg.marsden, jturner, petrides, riel, sct, staubach
URL: | http://client.linux-nfs.org/
Doc Type: | Bug Fix
Last Closed: | 2006-01-19 18:30:21 UTC
Description
Chuck Lever
2004-10-26 16:53:22 UTC
Created attachment 105801
potential fix for this problem (diff against 2.4.21-20.EL)
Hey Chuck, has this type of corruption been reported by any customers?

no customer reports, the bug was found by code inspection.

Created attachment 115302
updated patch
Chuck,
The original patch did not compile in a current RHEL3 kernel, so
I wanted to run this by you to ensure it's correct. In my testing
the patch does not seem to cause any regressions but, unfortunately,
I was not able to reproduce the corruption either.

i'm not sure why there is an "args.request" in the patch i attached. the 2.4.21-20.EL source i have here uses "args.count" just as your new patch does. looks good.

This patch does not look right to me. It is valid for NFS servers to write less data than was requested. There is no error implied when an NFS server does so because it may have done so for its own reasons.

Of course, the NFS server may have written less data than requested because it did encounter some sort of out of space or exceeded quota limit. The client can discover this by generating another request to write the remainder of the data. If a real error existed which prevented the server from writing the full data the first time, then an error will be returned on this additional request.

An NFS server is responsible for either storing the data that it has indicated that it has or returning an error to indicate why it could not. The client is responsible for storing all of the data requested by an application or returning an error indicating why it could not.

A short return to the write(2) system call is generally interpreted by applications as an error having occurred. In this case, if the NFS client returns short, when no error has actually occurred, then the application may misbehave needlessly. The NFS client should implement proper support to handle short write returns. It should not matter whether the WRITE requests are being generated from the data cache or from an O_DIRECT request.

hi peter- i agree that a server is allowed to return a short write, and that it is usually not an error. given the constraints on resources and ABI compatibility, however, the patch i have provided is only damage control for RHEL 3, and nothing more. if Red Hat has the resources to implement complete and ABI-compatible support for handling short reads and writes in both the cached and direct I/O paths in RHEL 3, then by all means, have at it.
as 2.6 kernels are evolving, the NFS client in those kernels will eventually have complete support for handling short reads and writes in both the cached and direct I/O paths.

i do not agree, however, that a short return from write(2) will cause "needless application misbehavior". if an app can't handle a short write, then it is poorly written and should be fixed. short writes will happen no matter what, and applications must be able to recover properly.

Because there has not been a single reported problem of this nature, and the proposed patch does introduce a functionality change (i.e. short writes), I am very concerned that a fix of this type (or any type, for that matter) has a high potential for introducing a regression. Therefore, I'm closing this bug as WONTFIX.
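
For readers following the short-write discussion above, here is a minimal user-space sketch of the recovery pattern both sides describe: after a short write, issue another request for the remainder, and let any real error (out of space, quota exceeded) surface on that retry. It is not taken from either attached patch and is not kernel code. The helper name `write_all`, its signature, and the choice to report a zero-byte write as EIO are assumptions made purely for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all: hypothetical helper, not part of the bug report or its patches.
 * Calls write(2) repeatedly until all `count` bytes are written or a real
 * error is reported. Returns 0 on success, -1 with errno set on failure.
 * If `written` is non-NULL it receives the number of bytes actually written,
 * so the caller can tell how far a failed transfer got.
 */
int write_all(int fd, const void *buf, size_t count, size_t *written)
{
    const char *p = buf;
    size_t done = 0;
    int rc = 0;

    while (done < count) {
        ssize_t n = write(fd, p + done, count - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data moved: retry */
            rc = -1;        /* real error, e.g. ENOSPC or EDQUOT */
            break;
        }
        if (n == 0) {
            /* Assumption: treat "no progress" as an I/O error rather
             * than spinning forever on a descriptor that accepts nothing. */
            errno = EIO;
            rc = -1;
            break;
        }
        /* Short write: not an error. Advance and ask for the remainder. */
        done += (size_t)n;
    }

    if (written != NULL)
        *written = done;
    return rc;
}
```

An application that funnels its writes through a loop like this does not care whether the client returns short or retries internally, which is the point made above about well-written applications. The loop does not by itself make O_DIRECT writes safe, since those also carry alignment requirements that this sketch ignores.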