Bug 151284
Field | Value
---|---
Summary | mmap of file over NFS corrupts data
Product | Red Hat Enterprise Linux 4
Component | kernel
Version | 4.0
Hardware | i386
OS | Linux
Status | CLOSED ERRATA
Severity | high
Priority | high
Reporter | Aleksandar Milivojevic <alex>
Assignee | Steve Dickson <steved>
QA Contact | Brian Brock <bbrock>
CC | barryn, davej, kanderso, k.georgiou, peterm, riel
Doc Type | Bug Fix
Bug Blocks | 137160
Last Closed | 2005-06-08 15:13:59 UTC

Description
Aleksandar Milivojevic
2005-03-16 17:53:58 UTC
Steve, this is probably a NOTABUG, but it would be nice if you could verify that ;)

One additional note. Depending on the size of the file I'm creating, some blocks do make it to the NFS server. For example, if I create a 100kB file, no blocks seem to be committed to the NFS server. If I create a 100MB file, some blocks do make it to actual disk storage ("du -sk filename" shows that the file uses 11MB of disk space, so I'm still missing about 89MB). This is probably dependent on the client machine's RAM size and how much of it is free for caching.

Thinking about it some more - the data should be written to the server some 30 seconds after the munmap. While the file is mmapped nothing needs to be written according to POSIX, but after the unmap the pages are marked dirty and the dirty file data flushing code should kick in. Aleksandar, does the data get written out to the server after a few minutes, or does it not get written at all?

Well, I waited for almost half an hour, and it was not written to the server. "du -sk filename" on both client and server returns the same numbers (as if the file is sparse). However, when reading the file on the client, I can see the data. I will reboot the client and see if that will flush the data. Will report back in a couple of minutes.

After the client was rebooted, all changes to the file were lost forever.

BTW, off-topic, how come the bug report is not word-wrapped (like the comments)? Kind of almost impossible to read...

Created attachment 112056: the program that demonstrates problem
The program (still under development) where I first saw the problem. "mkfile
-v 100k foo" should create file and allocate disk blocks. "mkfile -nv 100k
bar" should only create the file and not allocate any disk blocks.
To see the problem more clearly, modify the memset line to read:
memset(buf, '.', len);
and run the program as "mkfile -v 100k foo" on NFS client. If you are able to
reproduce the problem, "less foo" on the NFS client (Linux in my environment)
should show the file full of dots. "less foo" on the NFS server (Solaris 9 box in
my environment) should show the file full of NULs. The NFS partition in my
case was automounted home directory:
automount(pid2356) on /home type autofs
(rw,fd=5,pgrp=2356,minproto=2,maxproto=4)
nfsserver:/path_to/amilivojevic on /home/amilivojevic type nfs
(rw,addr=1.2.3.4)
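
The attached program is not reproduced in this report, so the following is only a minimal sketch of the steps described above, not the actual mkfile source. It assumes the program creates the file, maps it with MAP_SHARED, and fills the mapping with the memset call quoted earlier. The names `fill_file_mmap` and `allocated_kb` are hypothetical, and the explicit `msync` call is an addition of this sketch; the report does not say whether the original program called it or whether it avoids the problem. `allocated_kb` reports roughly what "du -sk" shows, assuming `st_blocks` counts 512-byte units as it does on Linux.

```c
/* Sketch only: an illustration of the reproduction steps, NOT the attached
 * mkfile program. Function names are hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path`, size it to `len` bytes (len must be > 0), and fill it with
 * the byte `fill` through a shared mapping. Returns 0 on success, -1 on error. */
int fill_file_mmap(const char *path, size_t len, int fill)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return -1; }

    /* Extend the file so the whole mapping is backed by it. */
    if (ftruncate(fd, (off_t)len) < 0) {
        perror("ftruncate");
        close(fd);
        return -1;
    }

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); close(fd); return -1; }

    memset(buf, fill, len);   /* dirty every page of the mapping */

    int rc = 0;
    /* Assumption of this sketch: ask for a synchronous flush before unmapping. */
    if (msync(buf, len, MS_SYNC) < 0) { perror("msync"); rc = -1; }
    if (munmap(buf, len) < 0) { perror("munmap"); rc = -1; }
    if (close(fd) < 0) { perror("close"); rc = -1; }
    return rc;
}

/* Disk space allocated to `path` in kB, or -1 on error.
 * Assumes st_blocks is in 512-byte units (true on Linux). */
long allocated_kb(const char *path)
{
    struct stat st;
    if (stat(path, &st) < 0) { perror("stat"); return -1; }
    return (long)((long long)st.st_blocks * 512 / 1024);
}
```

Calling `fill_file_mmap("foo", 100 * 1024, '.')` on the NFS client and then comparing `allocated_kb("foo")` with the "du -sk foo" output on the server is one way to check whether the dirty pages actually reached the server.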
I've just tested this on a Red Hat 7.3 machine running kernel-2.4.20-24.7. On the 2.4.20 kernel, everything seems to work correctly. So the bug must have been introduced somewhere in 2.5 or 2.6. Hope this info will help track down where the bug is.

I applied the patch from http://marc.theaimsgroup.com/?l=bk-commits-head&m=111094632300379&w=2 to the kernel-2.6.9-5.0.3.EL SRPM, rebuilt, rebooted, and reran the mkfile program (the version that uses mmap calls). It seems that everything works correctly now. All changed pages were committed to the NFS server. I'll leave the patched kernel running on my desktop. In case there are any problems, I'll let you know. Hopefully there'll be an updated official kernel soon. IMO, this bug makes NFS dangerous to use in production (with the 2.6.9-5.0.3.EL kernel).

Just wondering if this fix will be included in the forthcoming RHEL 4.1?

yes, it will be in U1.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-420.html