From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803

Description of problem:
suppose an application wants to write 96KB directly. the client turns this into 3 32KB on-the-wire writes, A, B, and C. NFS servers do not have to return an error if they write only some of the requested bytes: this is called a "short write."

Case 1: Normal NFS direct writes
|----A----||----B----||----C----|

Case 2: today: NFS server returns a short write
|----A----||--B--||----C----|hole

Case 3: possible: NFS server returns a short write
|----A----||--B--|hole|----C----|

Case 4: preferred: NFS server returns a short write
|----A----||--B--||----hole-----|

the NFS cached path has some recovery (or at least reporting) capability for short writes, the direct path does not. see below for analysis.

Version-Release number of selected component (if applicable):
kernel-2.4.21-20.EL

How reproducible:
Didn't try

Steps to Reproduce:
you'd have to rig a server to return an occasional short write, and then run an application on your client that performed direct writes and verified the contents of the file afterwards (OraSim, for example).

Actual Results:
today's client behaves like case 2. this can result in data being written to the wrong offset in the file.

Expected Results:
case 3 is a possible fix, but i argue that case 4 gives the most flexibility to the application for detection of and recovery from a short write, without rewriting the NFS direct write path to retry a short write.

Additional info:
a similar patch is destined for 2.6 (http://client.linux-nfs.org/). i will attach a patch to fix this in the RHEL 3.0 update 3 NFS client.
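The case 4 behavior described in the report can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the attached patch and not kernel code: `direct_write_case4`, `chunk_write_fn`, and the `short_server` test double are hypothetical names invented for the example, and the 32KB chunk size is taken from the report. The idea is that the client stops issuing on-the-wire writes at the first short reply, so the byte count it returns always describes a contiguous prefix of the request.

```c
#include <assert.h>
#include <stddef.h>

#define WSIZE (32 * 1024)  /* on-the-wire write size used in the report */

/* Hypothetical stand-in for one NFS WRITE: returns bytes the server accepted. */
typedef size_t (*chunk_write_fn)(size_t offset, size_t count, void *ctx);

/*
 * Case 4 policy (illustrative only): split the request into WSIZE chunks
 * and stop at the first short write. Everything after the returned count
 * is left unwritten, so the caller knows exactly where to resume.
 */
size_t direct_write_case4(size_t offset, size_t total,
                          chunk_write_fn send, void *ctx)
{
    size_t done = 0;

    assert(send != NULL);
    while (done < total) {
        size_t want = total - done;
        if (want > WSIZE)
            want = WSIZE;

        size_t got = send(offset + done, want, ctx);
        done += got;
        if (got < want)
            break;  /* short write: do not send the later chunks */
    }
    return done;
}

/* Hypothetical test double: replies short on one chosen request. */
struct short_server {
    int calls;       /* number of WRITEs seen so far */
    int short_call;  /* zero-based index of the request to cut short */
    size_t limit;    /* bytes accepted on that request */
};

size_t short_server_write(size_t offset, size_t count, void *ctx)
{
    struct short_server *s = ctx;

    (void)offset;
    if (s->calls++ == s->short_call && count > s->limit)
        return s->limit;
    return count;
}
```

With a 96KB request and a server that accepts only 20KB of chunk B, the model issues two WRITEs and reports 52KB written; chunk C is never sent, which matches the case 4 diagram.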
Created attachment 105801 [details] potential fix for this problem (diff against 2.4.21-20.EL)
Hey Chuck, Has this type of corruption been reported by any customers?
no customer reports, the bug was found by code inspection.
Created attachment 115302 [details] updated patch Chuck, The original patch did not compile against a current RHEL3 kernel, so I wanted to run this by you to ensure it's correct. In my testing the patch does not seem to cause any regressions but, unfortunately, I was not able to reproduce the corruption either.
i'm not sure why there is an "args.request" in the patch i attached. the 2.4.21-20.EL source i have here uses "args.count" just as your new patch does. looks good.
This patch does not look right to me. It is valid for NFS servers to write less data than was requested. There is no error implied when an NFS server does so because it may have done so for its own reasons. Of course, the NFS server may have written less data than requested because it did encounter some sort of out of space or exceeded quota limit. The client can discover this by generating another request to write the remainder of the data. If a real error existed which prevented the server from writing the full data the first time, then an error will be returned on this additional request. An NFS server is responsible for either storing the data that it has indicated that it has or returning an error to indicate why it could not. The client is responsible for storing all of the data requested by an application or returning an error indicating why it could not. A short return to the write(2) system call is generally interpreted by applications as an error having occurred. In this case, if the NFS client returns short, when no error has actually occurred, then the application may misbehave needlessly. The NFS client should implement proper support to handle short write returns. It should not matter whether the WRITE requests are being generated from the data cache or from an O_DIRECT request.
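The retry approach argued for in this comment can be sketched as a loop that reissues a WRITE for the remainder until all bytes are stored or the server returns an error. This is a user-space illustration only, not the RHEL 3 or 2.6 client code: `write_fully`, `retry_write_fn`, and the `quota_server` test double are hypothetical names, and treating a zero-byte reply as `EIO` is an assumption added here to keep the loop from spinning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical one-shot WRITE: returns bytes accepted, or a negative errno. */
typedef long (*retry_write_fn)(size_t offset, size_t count, void *ctx);

/*
 * Keep sending the unwritten remainder. A short reply is not an error by
 * itself; if the server had a real reason (no space, quota), the follow-up
 * request is expected to return it.
 */
long write_fully(size_t offset, size_t count, retry_write_fn send, void *ctx)
{
    size_t done = 0;

    assert(send != NULL);
    while (done < count) {
        long got = send(offset + done, count - done, ctx);
        if (got < 0)
            return got;   /* the real error surfaces on the retry */
        if (got == 0)
            return -EIO;  /* assumption: no progress is treated as an error */
        done += (size_t)got;
    }
    return (long)done;
}

/* Hypothetical test double: limited space and a per-request size cap. */
struct quota_server {
    size_t room;          /* bytes of space left */
    size_t max_per_call;  /* largest write accepted in one request */
};

long quota_server_write(size_t offset, size_t count, void *ctx)
{
    struct quota_server *s = ctx;

    (void)offset;
    if (s->room == 0)
        return -ENOSPC;
    if (count > s->max_per_call)
        count = s->max_per_call;
    if (count > s->room)
        count = s->room;
    s->room -= count;
    return (long)count;
}
```

In this model a server that merely caps each request lets the full write complete after one retry, while a server that has run out of space answers the retry with `-ENOSPC`, which the client can then report to the application.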
hi peter- i agree that a server is allowed to return a short write, and that it is usually not an error. given the constraints on resources and ABI compatibility, however, the patch i have provided is only damage control for RHEL 3, and nothing more. if Red Hat has the resources to implement complete and ABI-compatible support for handling short reads and writes in both the cached and direct I/O paths in RHEL 3, then by all means, have at it. as 2.6 kernels are evolving, the NFS client in those kernels will eventually have complete support for handling short reads and writes in both the cached and direct I/O paths. i do not agree, however, that a short return from write(2) will cause "needless application misbehavior". if an app can't handle a short write, then it is poorly written and should be fixed. short writes will happen no matter what, and applications must be able to recover properly.
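For reference, the kind of application-side recovery this comment expects is the usual loop around write(2). The sketch below uses only the POSIX `write` call; `write_all` is a hypothetical helper name, and treating a zero-byte return as `EIO` is an assumption made here so the loop cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write len bytes from buf to fd, resuming after short writes.
 * Returns 0 on success, or -1 with errno set on failure.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data moved: retry */
            return -1;      /* real error: errno says why */
        }
        if (n == 0) {
            errno = EIO;    /* assumption: no progress is treated as an error */
            return -1;
        }
        p += n;             /* short write: advance past what was stored */
        len -= (size_t)n;
    }
    return 0;
}
```

An application written this way sees the error from the retried write(2) if the short count had a real cause, and otherwise completes normally.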
Because there has not been a single reported problem of this nature, and the proposed patch does introduce a functionality change (i.e. short writes), I am very concerned that a fix of this type (or of any type, for that matter) has a high potential for introducing a regression. Therefore, I'm closing this bug as WONTFIX.