Bug 239625

Summary: NFS O_EXCL|O_CREAT on R/O filesystem creates erroneous negative dentry
Product: Red Hat Enterprise Linux 4
Reporter: Andy Isaacson <aisaacson>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED ERRATA
QA Contact: Martin Jenner <mjenner>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.4
CC: kgraham, ppokorny, staubach, steved
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-11-15 16:26:49 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description       Flags
proposed patch    none

Description Andy Isaacson 2007-05-10 01:10:52 UTC
Description of problem:
Attempting to create a file that already exists on an R/O NFS mount can result
in the file becoming inaccessible.  ls reports:
?---------  ? ? ? ?           ? issue

Version-Release number of selected component (if applicable):
2.6.9-42.0.10

How reproducible:
deterministic.

Steps to Reproduce:
1. ssh plum cp /etc/issue /home/tmp/issue
2. mount -o ro plum:/home/tmp /mnt/tmp
3. (cd /etc; tar cf /tmp/issue.tar issue)
4. (cd /mnt/tmp; tar xf /tmp/issue.tar)
5. cat /mnt/tmp/issue
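
The failing call can probably be exercised without tar. The following is a
minimal sketch, assuming that tar's extraction in step 4 boils down to an
open() with O_CREAT|O_EXCL, as the bug summary suggests. The helper name
try_exclusive_create is hypothetical and is not part of the original report.
It returns 0 if the file was created, otherwise the errno reported by open().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the report): attempt an exclusive create
 * of `path`, the way the bug summary describes (O_EXCL|O_CREAT).
 * Returns 0 if the file was created, otherwise the errno set by open().
 */
int try_exclusive_create(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd < 0)
        return errno;   /* e.g. EEXIST locally, EROFS on a R/O mount */

    close(fd);
    return 0;
}
```

On a local filesystem, calling this on an existing file returns EEXIST.
Calling it from a small main() on /mnt/tmp/issue in place of step 4, and then
running step 5, should show whether the file has become inaccessible, provided
the assumption about tar holds.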
  
Actual results:
# (cd /mnt/tmp; tar xvf /tmp/issue.tar)
issue
tar: issue: Cannot open: Read-only file system
tar: Error exit delayed from previous errors
# ls -l /mnt/tmp
total 0
?---------  ? ? ? ?           ? issue
# cat /mnt/tmp/issue
cat: /mnt/tmp/issue: No such file or directory

Expected results:
the contents of the file present on the NFS server

Additional info:

This appears to be distinct from bug 228801 and bug 224424, and there is no
automounter involvement like in bug 201211.  It's not intermittent like bug
150759.  It does seem to be fixed in upstream 2.6.20, but I wasn't able to track
down which NFS change fixed it.

Comment 1 Jeff Layton 2007-05-15 13:00:51 UTC
Ok, I've been able to reproduce this. I'll see if I can track down the cause...


Comment 2 Jeff Layton 2007-05-15 14:03:13 UTC
Also, somewhat related: you get different errors when you try to open a file
this way, depending on whether the inode already exists in the kernel
(-EROFS if it doesn't, and -EEXIST if it does). The difference seems to be that
in the first case (dentry doesn't exist yet), we fall into this branch in
open_namei:

        /* Negative dentry, just create the file */
        if (!dentry->d_inode) {

...in the other case, d_inode already exists for the dentry and we don't take
that branch. I need to test whether this is also an issue on other
filesystems...
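
The two error paths can be illustrated with a simplified user-space model.
This is a sketch only, not the actual open_namei code: the names model_dentry
and model_open_excl are made up for illustration, and the model assumes just
what is described above, namely that a negative dentry goes straight to the
create (which a read-only mount refuses with -EROFS), while a dentry that
already has an inode makes an exclusive open fail with -EEXIST.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a dentry; has_inode == false models a negative dentry. */
struct model_dentry {
    bool has_inode;
};

/*
 * Toy model of an O_CREAT|O_EXCL open (hypothetical, not kernel code).
 * `readonly` models a R/O mount. Returns 0 or a negative errno.
 */
int model_open_excl(struct model_dentry *dentry, bool readonly)
{
    /* Negative dentry: just try to create the file. */
    if (!dentry->has_inode) {
        if (readonly)
            return -EROFS;          /* create refused by the R/O filesystem */
        dentry->has_inode = true;   /* create succeeded */
        return 0;
    }

    /* Positive dentry: the file exists, so an exclusive create must fail. */
    return -EEXIST;
}
```

In this model the error returned on a read-only mount depends only on whether
the dentry is negative at the time of the open, which matches the
-EROFS versus -EEXIST difference observed here.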


Comment 3 Jeff Layton 2007-05-15 14:14:52 UTC
This seems to be NFS-specific. ext3 does not show the same behavior (it returns
-EEXIST every time), and doesn't call vfs_create (so it's not falling into the
same if condition above).


Comment 4 Jeff Layton 2007-05-15 14:50:01 UTC
It looks like the issue is this in nfs_lookup:

        /* If we're doing an exclusive create, optimize away the lookup */
        if (nfs_is_exclusive_create(dir, nd))
                goto no_entry;

This seems to make the returned dentry a negative dentry, on the presumption
that the create call will fix everything up. Commenting this out corrects the
problem, but it may actually be better to fix this on the backend so that if
the create fails we just remove the dentry.
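
A toy model may make the failure mode easier to see. The sketch below is
user-space C, not kernel code, and every name in it (model_client,
model_lookup, model_excl_create_ro, drop_on_failed_create) is hypothetical. It
assumes a one-entry cache for a single file name and a server that really
holds the file. The drop_on_failed_create flag models the "remove the dentry
if the create fails" idea from this comment; it is not a description of any
patch that was actually applied.

```c
#include <errno.h>
#include <stdbool.h>

enum cache_state { NOT_CACHED, CACHED_NEGATIVE, CACHED_POSITIVE };

/* Toy NFS client state for one file name (hypothetical model). */
struct model_client {
    enum cache_state cache;      /* cached dentry state for the name */
    bool server_has_file;        /* what the server actually holds */
    bool drop_on_failed_create;  /* models "remove the dentry on failure" */
};

/* Returns 0 if the file is visible to the client, -ENOENT otherwise. */
int model_lookup(struct model_client *c, bool exclusive_create)
{
    if (c->cache == NOT_CACHED) {
        if (exclusive_create)
            /* Lookup optimized away: assume the create will fix things up. */
            c->cache = CACHED_NEGATIVE;
        else
            c->cache = c->server_has_file ? CACHED_POSITIVE
                                          : CACHED_NEGATIVE;
    }
    return c->cache == CACHED_POSITIVE ? 0 : -ENOENT;
}

/* Exclusive create on a read-only mount; returns a negative errno. */
int model_excl_create_ro(struct model_client *c)
{
    if (model_lookup(c, true) == 0)
        return -EEXIST;

    /* The create itself always fails on a R/O mount. */
    if (c->drop_on_failed_create)
        c->cache = NOT_CACHED;   /* forget the bogus negative entry */
    return -EROFS;
}
```

With drop_on_failed_create unset, a failed exclusive create leaves a negative
entry behind, so a later plain lookup reports -ENOENT even though the server
has the file, which is the symptom in this report. With the flag set, the
later lookup goes back to the server and finds the file.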


Comment 5 Jeff Layton 2007-05-15 15:18:11 UTC
Actually, I think we need this patch:

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fd6840714d9cf6e93f1d42b904860a94df316b85

I'll have a closer look.


Comment 6 Andy Isaacson 2007-05-15 17:11:10 UTC
(In reply to comment #5)
> Actually, I think we need this patch:
> 
> NFS: nfs_lookup - don't hash dentry when optimising away the lookup

That looks very promising.  Thanks for tracking it down, and I'll see if I can
verify that it fixes the problem on our systems.

Comment 7 Jeff Layton 2007-05-15 17:51:31 UTC
Created attachment 154754
proposed patch

This patch is a backport of the one I linked to earlier. I haven't tested it,
so I'm not sure whether it will fix this. It looks like it will, though...

Comment 9 RHEL Program Management 2007-05-16 17:26:54 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 10 Andy Isaacson 2007-06-01 21:19:29 UTC
(In reply to comment #7)
> Created an attachment (id=154754)
> proposed patch
> 
> This patch is a backport of the one I linked to earlier. I've not tested it so
> not sure if it will help this. It looks like it will though...

Even without this patch, I cannot reproduce the failure on 2.6.9-55.EL.  (I used
the CentOS RPM because our IT staff hasn't fetched the RHEL4u5 updates yet.)

[root@flatline ~]# uname -a
Linux flatline 2.6.9-55.ELsmp #1 SMP Wed May 2 14:04:42 EDT 2007 x86_64 x86_64
x86_64 GNU/Linux
[root@flatline ~]# dmesg | head -2
Bootdata ok (command line is ro root=/dev/VolGroup00/LogVol00 rhgb quiet vga=0x317)
Linux version 2.6.9-55.ELsmp (mockbuild.org) (gcc version 3.4.6
20060404 (Red Hat 3.4.6-8)) #1 SMP Wed May 2 14:04:42 EDT 2007
[root@flatline ~]# mount -o ro plum:/home/tmp /mnt/tmp
[root@flatline ~]# (cd /mnt/tmp; tar xvf /tmp/issue.tar)
issue
tar: issue: Cannot open: Read-only file system
tar: Error exit delayed from previous errors
[root@flatline ~]# ls -l /mnt/tmp
total 4
-rw-r--r--  1 adi adi 76 May  9 17:58 issue

I've repeated the test 4 times just to be sure; this sequence reliably
reproduces the failure on 2.6.9-42.0.10, and reliably does not fail on 2.6.9-55.EL.

And yet, I cannot see any sign of a patch in -55 that would resolve it.

Comment 11 Jason Baron 2007-06-25 20:22:28 UTC
Committed in stream U6, build 55.12. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 14 errata-xmlrpc 2007-11-15 16:26:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html