Description of problem:
When any program (and there are a lot of them!) uses "open" from fcntl.h with the O_WRONLY and O_TRUNC flags on a file where the group permissions allow writing but the owner permissions do not, the file is truncated on the server (as the O_TRUNC flag says to do). However, the open call returns an "EPERM" error, causing the program to think that the call has failed. This generally leaves the file zeroed out.

This problem does not affect nfs shares originating from FC5 machines, only shares from our FreeBSD server (5.2.1-RELEASE-p9). However, this problem did not occur with FC2 or RHEL3 clients.

Version-Release number of selected component (if applicable):
uname -r gives 2.6.17-1.2139_FC5smp for our FC5 machines

How reproducible:
Every time.

Steps to Reproduce:
1. Create file "deadmeat.txt" -- compose your graduate thesis there.
2. Set file owner to foo. Set group to bar. Set permissions to 660.
(The next step can be done from any application that uses open with the O_TRUNC flag. I chose vim here.)
3. From user notfoo in group bar, open deadmeat.txt in vim.
4. Run ":wq"

Actual results:
vim gives error: "E212: Can't open file for writing"
The contents of deadmeat.txt (your graduate thesis) are zeroed out.

Expected results:
No error from vim; the save should have proceeded normally.

Additional info:
Fedora Core 2 and all of our RHEL 3 servers do not have this problem with our FreeBSD server. This points to a Fedora Core 5 problem.

You can write a custom C program to test this more directly. Just verify that this call returns -1 on a file owned by someone else, but in the user's group:
open("filename", O_WRONLY|O_TRUNC, 0660);
Make sure to include fcntl.h
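For anyone who wants to try the direct test described above, here is a minimal sketch in C. It makes the same open() call and also compares the file size before and after, so a failed open that still truncated the file is easy to spot. The helper names file_size and check_trunc_open are made up for this example and are not part of any library.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the file size in bytes, or -1 if stat() fails. */
long long file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/*
 * Hypothetical helper: makes the open() call from the report and prints
 * what happened. Returns 0 if open() succeeded, otherwise the errno value.
 */
int check_trunc_open(const char *path)
{
    long long before = file_size(path);

    int fd = open(path, O_WRONLY | O_TRUNC, 0660);
    int saved_errno = (fd == -1) ? errno : 0;
    if (fd != -1)
        close(fd);

    long long after = file_size(path);

    if (saved_errno == 0)
        printf("open succeeded: size %lld -> %lld\n", before, after);
    else
        printf("open failed (%s): size %lld -> %lld\n",
               strerror(saved_errno), before, after);

    /* The symptom in this report: open() fails, yet the file shrank anyway. */
    if (saved_errno != 0 && before > 0 && after == 0)
        printf("file was truncated even though open() reported an error\n");

    return saved_errno;
}
```

To reproduce, call check_trunc_open("deadmeat.txt") from a small main() while running as user notfoo, with the file on the NFS mount. Use a scratch file, since a successful open will truncate it as well.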
Sorry. Steps to reproduce 1.) should have said to create deadmeat.txt in a directory from a BSD server shared over nfs.
Steve, are you already looking into this? Someone here offered to set up a bsd server for me to investigate with if not.
No... have not looked into this... But let's take up the offer on the server and get a tethereal trace of the problem. Something similar to:
tethereal -w /tmp/data.pcap host <server> ; bzip2 /tmp/data.pcap
Created attachment 137772
pcap of bug

Here's a pcap of the bug.
The file in question was ~/thras/test
The contents (before deletion) were something along the line of: hello world good
The server is "userhost"
The client (from which tethereal was being run) is speare5-1-17
Joel, I'll try this here too, but is there any chance you can do a test with a newer FBSD server (6.1)? It seems like it's the server's responsibility to get this right, in the end...
I'd have to set up a 6.1 server, I don't have one around. If the other person mentioned in comment #2 already has one up, that might be the easiest way. I don't see that it can be blamed on the server though, since Ubuntu (6.06 LTS), RHEL 3, and Fedora Core 2 all get this right. And the server seems to be responding to the O_TRUNC call correctly and zeroing out the file.
That's fine, we have a 6.1 server set up here now, I'll give it a shot soon (but something urgent came up, so this needs to wait just a little bit). Out of curiosity, is the file zeroed out when viewed on the server as well?
> Out of curiosity, is the file zeroed out when viewed on the server as well?

Yes.
A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem please check you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you.
One other note on this one; I tested against freebsd 6.2, and I did not see the problem. Steve, have you had a chance to look over the network traces yet?
We've tested 2.6.18-1.2200.fc5 and verified that the problem still exists.
First of all... sorry for taking so long to get back to this... In the network trace, looking at packets 60 and 61, it appears the client is truncating the file and the server is returning EPERM. So it appears the server is zeroing out the file but also returning EPERM. Looking at the before and after attributes (which are part of the NFS SETATTR proc), the size is 21 before the SETATTR and 0 after the SETATTR... So it looks like it's a server issue, because the server should either do the truncation and return success, or not do the truncation and return EPERM... not both...
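The "not both" rule can be written down as a small check. This is only a sketch in C: the struct and function below are hypothetical, filled in by hand from values read off a trace, and are not taken from any NFS header or library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one size-changing SETATTR exchange from a trace. */
struct setattr_result {
    bool     ok;             /* true if the server reported success */
    uint64_t size_before;    /* file size in the "before" attributes */
    uint64_t size_after;     /* file size in the "after" attributes */
    uint64_t size_requested; /* size the client asked for (0 for O_TRUNC) */
};

/*
 * Returns true if the reply is self-consistent:
 *   - on success, the file has the requested size;
 *   - on failure, the file size is unchanged.
 */
bool setattr_is_consistent(const struct setattr_result *r)
{
    if (r->ok)
        return r->size_after == r->size_requested;
    return r->size_after == r->size_before;
}
```

With the values from this trace (failure reported, size 21 before, 0 after, 0 requested), the check returns false. Either success with size 0 or failure with size still 21 would return true.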
We solved the problem by replacing our BSD NFS server with an RHEL server.
Obviously, I think that was the best move... :-) Thank you for your business!