Description of problem:
The kernel is crashing at fs/nfs/namespace.c:103

 92 static void * nfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd)
 93 {
 94         struct vfsmount *mnt;
 95         struct nfs_server *server = NFS_SERVER(dentry->d_inode);
 96         struct dentry *parent;
 97         struct nfs_fh fh;
 98         struct nfs_fattr fattr;
 99         int err;
100
101         dprintk("--> nfs_follow_mountpoint()\n");
102
103         BUG_ON(IS_ROOT(dentry));
104         dprintk("%s: enter\n", __FUNCTION__);
105         dput(nd->dentry);
106         nd->dentry = dget(dentry);

Version-Release number of selected component (if applicable):
2.6.18-8.el5 and probably later.

How reproducible:
Frequently

Steps to Reproduce:
1. Mount a directory from a NetApp that has a subdir with another FSID.
2. Two directories get mounted.
3. An attempt to run 'find' or 'ls' on that subdir triggers the bug.

There are two vmcores available at megatron.gsslab.rdu.redhat.com /cores/20080221172001/work.

The first crash backtrace:

nfs: server scratchy OK
SELinux: initialized (dev 0:1a, type nfs), uses genfs_contexts
SELinux: initialized (dev 0:1b, type nfs), uses genfs_contexts
SELinux: initialized (dev 0:1c, type nfs), uses genfs_contexts
audit(1203529272.675:172): avc: denied { execmod } for pid=3168 comm="java" name="libj9jit23.so" dev=dm-1 ino=492399 scontext=user_u:system_r:unconfined_t:s0 tcontext=user_u:object_r:file_t:s0 tclass=file
SELinux: initialized (dev 0:1b, type nfs), uses genfs_contexts
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/nfs/namespace.c:103
invalid opcode: 0000 [1] SMP
last sysfs file: /power/state
CPU 0
Modules linked in: nfs fscache nfsd exportfs lockd nfs_acl ipv6 autofs4 sunrpc xennet parport_pc lp parport pcspkr dm_snapshot dm_zero dm_mirror dm_mod xenblk ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 3945, comm: ls Not tainted 2.6.18-8.el5xen #1
RIP: e030:[<ffffffff882463fc>] [<ffffffff882463fc>] :nfs:nfs_follow_mountpoint+0x2d/0x1d9
RSP: e02b:ffff88000d199b98 EFLAGS: 00010246
RAX: ffff88006ad41c00 RBX: ffff880024219150 RCX: 000000000000000b
RDX: 0000000000000000 RSI: ffff88000d199ec8 RDI: ffff880024219150
RBP: ffff88000d199ec8 R08: ffff88000d199bc8 R09: 0000000000000000
R10: ffff88000d199ca8 R11: 0000000000000048 R12: ffff88000d199ec8
R13: ffff88002c2e00c0 R14: 0000000000000000 R15: ffff880064493009
FS: 00002aaaaaab9bd0(0000) GS:ffffffff8058d000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process ls (pid: 3945, threadinfo ffff88000d198000, task ffff88005c2927e0)
Stack: 0000000000000000 0000000000000002 ffff88000d199ca8 ffffffff803044c3
 ffff88000d199ca8 ffff88007fe04bd8 ffffffffffffffff ffffffff00000000
 0000011000000001 ffff88005c2927e0
Call Trace:
 [<ffffffff803044c3>] avc_has_perm+0x43/0x55
 [<ffffffff80304ffa>] inode_has_perm+0x56/0x63
 [<ffffffff803044c3>] avc_has_perm+0x43/0x55
 [<ffffffff88239ff0>] :nfs:nfs_access_get_cached+0xab/0xfa
 [<ffffffff8030970d>] selinux_inode_follow_link+0x5f/0x6a
 [<ffffffff8020a370>] __link_path_walk+0xb71/0xf42
 [<ffffffff8020e527>] link_path_walk+0x5c/0xe5
 [<ffffffff8022c4f1>] mntput_no_expire+0x19/0x89
 [<ffffffff8020c9ef>] do_path_lookup+0x270/0x2ec
 [<ffffffff802122c7>] getname+0x15b/0x1c1
 [<ffffffff802233eb>] __user_walk_fd+0x37/0x4c
 [<ffffffff802ce9bc>] sys_getxattr+0x2b/0x62
 [<ffffffff8025c432>] system_call+0x86/0x8b
 [<ffffffff8025c3ac>] system_call+0x0/0x8b

Code: 0f 0b 68 d2 e8 25 88 c2 67 00 49 8b 3c 24 e8 29 6c fc f7 8b
RIP [<ffffffff882463fc>] :nfs:nfs_follow_mountpoint+0x2d/0x1d9
 RSP <ffff88000d199b98>

The second one:

 #6 [ffff880079feba50] error_exit at ffffffff8025cb53
    [exception RIP: nfs_follow_mountpoint+45]
    RIP: ffffffff882463fc  RSP: ffff880079febb08  RFLAGS: 00010246
    RAX: ffff88002a759800  RBX: ffff88002a2b03d8  RCX: 0000000000000004
    RDX: 0000000000000000  RSI: ffff880079febea8  RDI: ffff88002a2b03d8
    RBP: ffff880079febea8   R8: ffff880079febb38   R9: 0000000000000000
    R10: ffff880079febc18  R11: 0000000000000048  R12: ffff880079febea8
    R13: ffff8800709edec0  R14: 0000000000000000  R15: ffff880045c5a009
    ORIG_RAX: ffffffffffffffff  CS: e030  SS: e02b
 #7 [ffff880079febb20] avc_has_perm at ffffffff803044c3
 #8 [ffff880079febb90] inode_has_perm at ffffffff80304ffa
 #9 [ffff880079febc10] selinux_inode_follow_link at ffffffff8030970d
#10 [ffff880079febc90] __link_path_walk at ffffffff8020a370
#11 [ffff880079febd00] link_path_walk at ffffffff8020e527
#12 [ffff880079febdc0] do_path_lookup at ffffffff8020c9ef
#13 [ffff880079febe00] __path_lookup_intent_open at ffffffff802231db
#14 [ffff880079febe40] open_namei at ffffffff8021a354
#15 [ffff880079febea0] do_filp_open at ffffffff80226fd4
#16 [ffff880079febf50] do_sys_open at ffffffff802190dd
#17 [ffff880079febf80] system_call at ffffffff8025c432
    RIP: 0000003b9d8bf310  RSP: 00007fffe82fbb38  RFLAGS: 00000246
    RAX: 0000000000000002  RBX: ffffffff8025c432  RCX: ffffffff8025c3ac
    RDX: ff736e67726f606d  RSI: 0000000000010800  RDI: 000000000405581e
    RBP: 0000000000000001   R8: fefefefefefefeff   R9: 00007fffe82fbb60
    R10: 0000000004055828  R11: 0000000000000246  R12: 0000000000000000
    R13: 0000000004055810  R14: 0000000000000003  R15: 0000000000000017

Checking 'mount' in these vmcores shows:

<snipped>
ffff880076e4ca80 ffff88002a560400 nfs bank:/vol/vol2/usr2/hmann /local/mnt/usr2/hmann
ffff880076e4c580 ffff88000dac1000 nfs bank:/vol/vol2/usr2/hmann/.snapshot .snapshot <------------------
ffff880076e4c380 ffff88005f6f1c00 nfs watercooler:/vol/vol0/watercooler/usr2/davidwu /local/mnt/usr2/davidwu
ffff880076e4c480 ffff88002a560800 nfs watercooler:/vol/vol0/watercooler/usr2/davidwu/.snapshot .snapshot <---------
<snipped>

Both crashes are attempts to access '.snapshot' that end up calling nfs_follow_mountpoint(). In this case the path is a/b, where 'a' is the root dentry and 'b' is another root dentry.

The inode operations are assigned at:

struct inode *
nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr)
<snipped>
        /* Deal with crossing mountpoints */
        if (!nfs_fsid_equal(&NFS_SB(sb)->fsid, &fattr->fsid)) {
                if (fattr->valid & NFS_ATTR_FATTR_V4_REFERRAL)
                        inode->i_op = &nfs_referral_inode_operations;
                else
                        inode->i_op = &nfs_mountpoint_inode_operations;
                inode->i_fop = NULL;
<snipped>

It probably works the first time because the subdirectory gets mounted and is not a root dentry yet, but the second access triggers that 'BUG_ON(IS_ROOT(dentry))'.
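
To make the suspected sequence concrete, here is a small user-space C model of the two checks quoted above. It is only a sketch: the types and functions (model_dentry, model_fhget, would_hit_bug) are hypothetical stand-ins, not kernel code, and it assumes that the root of the submount at some point receives attributes whose fsid differs from the fsid recorded for its superblock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structures. */
struct fsid { uint64_t major, minor; };

enum ops_kind { OPS_DIR, OPS_MOUNTPOINT };

struct model_dentry {
    struct model_dentry *d_parent;  /* a root dentry is its own parent */
    struct fsid sb_fsid;            /* fsid recorded for its superblock */
    enum ops_kind i_op;             /* which inode operations were chosen */
};

/* Same comparison nfs_fsid_equal() performs: both fields must match. */
static bool fsid_equal(const struct fsid *a, const struct fsid *b)
{
    return a->major == b->major && a->minor == b->minor;
}

/* Mirrors the kernel macro IS_ROOT(x), i.e. ((x) == (x)->d_parent). */
static bool is_root(const struct model_dentry *d)
{
    assert(d != NULL);
    return d == d->d_parent;
}

/* Models the nfs_fhget() branch: a differing fsid selects mountpoint ops. */
static void model_fhget(struct model_dentry *d, const struct fsid *reported)
{
    d->i_op = fsid_equal(&d->sb_fsid, reported) ? OPS_DIR : OPS_MOUNTPOINT;
}

/* True when following this dentry would reach BUG_ON(IS_ROOT(dentry)). */
static bool would_hit_bug(const struct model_dentry *d)
{
    return d->i_op == OPS_MOUNTPOINT && is_root(d);
}
```

In this model a non-root dentry with mountpoint operations is harmless, and a root dentry whose reported fsid matches keeps directory operations. Only a root dentry that is handed a mismatching fsid makes would_hit_bug() return true, which corresponds to the BUG_ON at namespace.c:103.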
Found the commit that fixes it:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4c1fe2f78a08e2c514a39c91a0eb7b55bbd3c0d2

Commit:     4c1fe2f78a08e2c514a39c91a0eb7b55bbd3c0d2
Parent:     eda4f9b7996e5520934ca2a7310b363463a4e3b0
Author:     Neil Brown <[EMAIL PROTECTED]>
AuthorDate: Thu Nov 1 16:50:20 2007 +1100
Committer:  Trond Myklebust <[EMAIL PROTECTED]>
CommitDate: Sat Nov 17 13:08:48 2007 -0500

    kernel BUG at fs/nfs/namespace.c:108! - can be triggered by bad server

    Hi Trond,

    I have discovered that the BUG_ON in nfs_follow_mountpoint:

        BUG_ON(IS_ROOT(dentry));

    can be triggered by a misbehaving server.

    What happens is the client does a lookup and discoveres that the named
    directory has a different fsid, so it initiates a mount. It then performs
    a GETATTR on the mounted directory and gets a different fsid again (due to
    a bug in the NFS server). This causes nfs_follow_mountpoint to be called
    on the newly mounted root, which triggers the BUG_ON.

    To duplicate this, have a directory which contains some mountpoints, and
    export that directory with the "crossmnt" flag using nfs-utils 1.1.1 (or
    1.1.0 I think)

    The GETATTR on the root of the mounted filesystem will return the
    information for the top exportpoint, while a lookup will return the
    correct information. This difference causes the NFS client to BUG.

    I think the best way to fix this is to trap this possibility early, so
    just before completing the mount in the NFS client, check that it isn't
    going to use nfs_mountpoint_inode_operations. As long as i_op will never
    change once set (is that true?), this should be adequately safe.

    The following patch shows a possible approach, and it works for me. i.e.
    when the NFS server is misbehaving, I get ESTALE on those mountpoints,
    while when the NFS server is working correctly, I get correct behaviour
    on the client.

    NeilBrown

    Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
    Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---
 fs/nfs/super.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index fa517ae..71067d1 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1474,6 +1474,11 @@ static int nfs_xdev_get_sb(struct file_system_type *fs_type, int flags,
 		error = PTR_ERR(mntroot);
 		goto error_splat_super;
 	}
+	if (mntroot->d_inode->i_op != &nfs_dir_inode_operations) {
+		dput(mntroot);
+		error = -ESTALE;
+		goto error_splat_super;
+	}
 
 	s->s_flags |= MS_ACTIVE;
 	mnt->mnt_sb = s;

Flavio
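
For readers who want to see the effect of the added check in isolation, the following user-space C sketch models it. It is not the kernel code: root_ops, model_inode and model_check_mount_root are hypothetical names, and the enum comparison stands in for comparing i_op against &nfs_dir_inode_operations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the i_op pointer of the submount's root inode. */
enum root_ops { ROOT_OPS_DIR, ROOT_OPS_MOUNTPOINT, ROOT_OPS_REFERRAL };

struct model_inode {
    enum root_ops i_op;
};

/*
 * Models the check the patch adds before the mount completes: a root
 * inode that did not end up with plain directory operations is
 * rejected with -ESTALE instead of being mounted.
 */
static int model_check_mount_root(const struct model_inode *root)
{
    assert(root != NULL);
    if (root->i_op != ROOT_OPS_DIR)
        return -ESTALE;
    return 0;
}
```

With this check in place, a root inode that would have received mountpoint (or referral) operations never becomes a mounted root, so nfs_follow_mountpoint() should no longer be reached with a root dentry; the client sees ESTALE instead, as the commit message describes.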
*** This bug has been marked as a duplicate of bug 427424 ***