Bug 810828

Summary: nfs: rm -rf fails on Solaris
Product: [Community] GlusterFS Reporter: Sachidananda Urs <sac>
Component: nfs Assignee: Vinayaga Raman <vraman>
Status: CLOSED CURRENTRELEASE QA Contact: Sachidananda Urs <sac>
Severity: urgent Docs Contact:
Priority: high    
Version: pre-release CC: aavati, amarts, gluster-bugs, rwheeler, vagarwal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Solaris   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-24 17:53:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 817967    

Description Sachidananda Urs 2012-04-09 10:28:03 UTC
Corresponding errors in nfs.log:
================================

[2012-04-09 15:39:42.300329] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:39:42.300385] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 808f2c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:40:23.046467] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:40:23.046534] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: a18f2c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:40:23.058266] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:40:23.058297] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: a28f2c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:43:04.462671] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:43:04.462751] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 89902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:43:04.463603] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:43:04.463631] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 8a902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:43:04.464447] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:43:04.464472] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 8b902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:43:04.465016] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:43:04.465039] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 8c902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:43:14.678031] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:43:14.678086] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 92902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:43:15.150295] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:43:15.150349] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 93902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:52:25.815029] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:52:25.815125] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: af902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:52:25.816981] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error
[2012-04-09 15:52:25.817008] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: b2902c24, LOOKUP: NFS: 10006(Error occurred on the server or IO Error), POSIX: 14(Bad address)
[2012-04-09 15:55:54.896980] E [nfs3.c:1321:nfs3_lookup_parentdir_resume] 0-nfs-nfsv3: nfs_inode_loc_fill error


=================

Comment 1 Krishna Srinivas 2012-04-11 12:56:56 UTC
After the gnfs server is restarted, NFS clients sending lookup(<FH>, "..") trigger this problem. Previously, before the GFID-based backend changes, the NFS server would resolve the GFID by crawling the directory tree, which left the dentry cache and inode table updated, so a lookup on the parent inode could be sent. Now, with the GFID-based backend, after "hard resolution" on <FH> we have:
(gdb) p cs->resolvedloc
$8 = {path = 0x12791620 "<gfid:6786fdb3-62ba-4c0f-a262-56f4c61a2d07>", name = 0x0, inode = 0x2aaaabb721e8, parent = 0x0, gfid = "g\206\375\263b\272L\017\242bV\364\306\032-\a", 
  pargfid = '\000' <repeats 15 times>}
(gdb)

i.e. resolvedloc->parent is NULL, which causes this bug.

To fix this we need to be able to send the FOP: lookup(parent_GFID, "..") and fix any side effects.
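
A minimal sketch of the failure mode described above, not the actual GlusterFS code: the struct and function names are hypothetical stand-ins, and returning EFAULT for a missing parent is an assumption chosen only because it matches the "POSIX: 14(Bad address)" seen in nfs.log.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the loc printed by gdb above. */
struct sketch_loc {
    const char *path;   /* e.g. "<gfid:...>" after hard resolution */
    const char *name;   /* NULL after hard resolution */
    void *inode;        /* resolved inode */
    void *parent;       /* NULL when only the GFID was resolved */
};

/*
 * Hypothetical check before building a lookup on the parent directory.
 * Returns 0 when a parent inode is available, -EINVAL when the loc itself
 * is unusable, and -EFAULT (assumed) when the parent was never resolved.
 */
static int sketch_parent_lookup_prepare(const struct sketch_loc *loc)
{
    if (loc == NULL || loc->inode == NULL)
        return -EINVAL;
    if (loc->parent == NULL)
        return -EFAULT;
    return 0;
}
```

With a loc like the one in the gdb output (inode set, parent NULL), this sketch fails with -EFAULT, which is why a lookup keyed on the GFID itself plus ".." is needed instead of relying on a cached parent.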

Comment 2 Anand Avati 2012-04-11 22:08:26 UTC
Krishna,
 Your analysis is right. We need to make sure lookup(gfid, "..") works smoothly. One problem that comes to mind offhand is that inode_link() must either be avoided, or inside inode_link() we should take care not to perform dentry linking of "..", as it would get caught in the loop formation check. This is also necessary if we decide to use kNFS in the future for NFSv4, where kNFS issues lookup(inode, "..") to satisfy the LOOKUPP NFSv4 procedure.

Avati

Comment 3 Anand Avati 2012-05-05 19:42:38 UTC
CHANGE: http://review.gluster.com/3220 (libglusterfs/inode.c: do not link the inode in the dentry cache for "." and "..") merged in master by Anand Avati (avati)
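
The idea named in the change title (do not link the inode in the dentry cache for "." and "..") can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged patch: every sketch_* name and the dentry_count field are hypothetical, and the real inode_link() in libglusterfs/inode.c has a different signature and does more work.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified inode: only counts how many dentries point at it. */
struct sketch_inode {
    int dentry_count;
};

/* True for the two names that must never become dentries. */
static bool sketch_is_dot_or_dotdot(const char *name)
{
    return name != NULL &&
           (strcmp(name, ".") == 0 || strcmp(name, "..") == 0);
}

/*
 * Hypothetical link routine: the inode is always returned as "linked",
 * but a dentry (parent, name) -> inode is recorded only for real names.
 * Skipping "." and ".." keeps them out of any loop-formation check.
 */
static struct sketch_inode *sketch_inode_link(struct sketch_inode *inode,
                                              struct sketch_inode *parent,
                                              const char *name)
{
    if (inode == NULL)
        return NULL;
    if (parent != NULL && name != NULL && !sketch_is_dot_or_dotdot(name))
        inode->dentry_count++;  /* stands in for creating the dentry */
    return inode;
}
```

In this sketch, linking the result of lookup(gfid, "..") still yields a usable inode, but no ".." entry is added to the dentry cache.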