Bug 965434 - nfs+dht: "invalid argument: inode"
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: distribute
mainline
x86_64 Linux
high Severity high
Assigned To: Pranith Kumar K
Depends On: 960834
Blocks:
Reported: 2013-05-21 04:56 EDT by Pranith Kumar K
Modified: 2013-07-24 13:19 EDT (History)
5 users

See Also:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 960834
Environment:
Last Closed: 2013-07-24 13:19:19 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Pranith Kumar K 2013-05-21 04:56:50 EDT
Description of problem:
Volume type: 6x2 (distributed-replicate)

DHT throws errors in nfs.log when "rm -rf" is run concurrently from mount points on two different clients. The mount points are served by different servers of the RHS cluster.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.4rhs-1.el6rhs.x86_64

How reproducible:
The errors appear in the logs frequently.

Steps to Reproduce:
1. Create a volume on nodes [a, b, c, d] and start it.
2. Mount the volume from node a on client c1 and from node b on client c2.
3. Create a large amount of data through one mount point only.

function for creating data (imports added; the Python 2 "commands" shell-out to touch is replaced with an equivalent empty-file create):

    import os

    mount_path_nfs = "/mnt/nfs"  # NFS mount point on the client (placeholder path)

    # Build a large namespace: 10000 directories, each with 100
    # subdirectories containing a single file.
    for i in range(10000):
        os.mkdir(os.path.join(mount_path_nfs, "%d" % i))
        for j in range(100):
            subdir = os.path.join(mount_path_nfs, "%d" % i, "%d" % j)
            os.mkdir(subdir)
            # create an empty file (equivalent of shelling out to "touch")
            open(os.path.join(subdir, "%d.file" % j), "w").close()

4. Now run "rm -rf *" simultaneously on both mount points from step 2.
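Step 4's concurrent deletion can be sketched with two local processes tearing down the same tree; the temporary directory here is a stand-in for the shared volume seen through the two NFS mounts (the paths and process count are assumptions for illustration, nothing here is gluster-specific):

```python
import errno
import multiprocessing
import os
import shutil
import tempfile

def rm_rf(path):
    # Remove every entry under path, tolerating ENOENT races: the
    # other "client" may delete an entry between listing and removal,
    # which is exactly the window where the DHT errors appeared.
    for name in os.listdir(path):
        try:
            shutil.rmtree(os.path.join(path, name))
        except OSError as e:
            if e.errno != errno.ENOENT:
                raise

if __name__ == "__main__":
    # Stand-in for the shared volume as seen through two mounts.
    root = tempfile.mkdtemp()
    for i in range(50):
        os.makedirs(os.path.join(root, "%d" % i, "sub"))
    procs = [multiprocessing.Process(target=rm_rf, args=(root,))
             for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(os.listdir(root))
```

Running the same logic against two real NFS mounts of one volume (one path per process, one process per client) reproduces the race this bug describes: each listing can name entries the other client has already removed.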
 
Actual results:

[2013-05-07 22:46:36.514497] W [nfs3-helpers.c:3475:nfs3_log_readdir_res] 0-nfs-nfsv3: XID: adaf0719, READDIR: NFS: 2(No such file or directory), POSIX: 2(No such file or directory), count: 32768, cverf: 36506500, is_eof: 0
[2013-05-07 22:46:36.515576] W [client-rpc-fops.c:1369:client3_3_access_cbk] 0-dist-rep-client-2: remote operation failed: No such file or directory
[2013-05-07 22:46:36.516238] W [client-rpc-fops.c:1369:client3_3_access_cbk] 0-dist-rep-client-3: remote operation failed: No such file or directory
[2013-05-07 22:46:36.519404] E [dht-helper.c:1065:dht_inode_ctx_get] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_discover_complete+0x421) [0x7f7de3b6f721] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_layout_set+0x4e) [0x7f7de3b5203e] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_inode_ctx_layout_get+0x1b) [0x7f7de3b60cfb]))) 0-dist-rep-dht: invalid argument: inode
[2013-05-07 22:46:36.519434] E [dht-helper.c:1065:dht_inode_ctx_get] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_discover_complete+0x421) [0x7f7de3b6f721] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_layout_set+0x63) [0x7f7de3b52053] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x34) [0x7f7de3b52544]))) 0-dist-rep-dht: invalid argument: inode
[2013-05-07 22:46:36.519524] E [dht-helper.c:1084:dht_inode_ctx_set] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_discover_complete+0x421) [0x7f7de3b6f721] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_layout_set+0x63) [0x7f7de3b52053] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x52) [0x7f7de3b52562]))) 0-dist-rep-dht: invalid argument: inode
[2013-05-07 22:46:36.519570] W [nfs3.c:1522:nfs3svc_access_cbk] 0-nfs: aeaf0719: /6863/6 => -1 (Structure needs cleaning)
[2013-05-07 22:46:36.519603] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: aeaf0719, ACCESS: NFS: 10006(Error occurred on the server or IO Error), POSIX: 117(Structure needs cleaning)


Expected results:
In this scenario, where files are being deleted through both mount points, DHT should not log errors. A warning such as "No such file or directory" would be appropriate instead, since files may be removed through either mount point at any time.
Comment 1 Anand Avati 2013-05-21 05:10:09 EDT
REVIEW: http://review.gluster.org/5055 (cluster/dht: Set layout when inode is present) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu@redhat.com)
Comment 2 Anand Avati 2013-06-04 10:50:01 EDT
COMMIT: http://review.gluster.org/5055 committed in master by Anand Avati (avati@redhat.com) 
------
commit a72e77f7bc5abfa739f19f6d02e7cf94b138c477
Author: Pranith Kumar K <pkarampu@redhat.com>
Date:   Tue May 21 14:21:31 2013 +0530

    cluster/dht: Set layout when inode is present
    
    Problem:
    Lookups in discovery fail with ENOENT so local->inode
    is never set. dht_layout_set logs the callstack when
    the function is called in that state.
    
    Fix:
    Don't set layout when lookups fail in discovery.
    
    Change-Id: I5d588314c89e3575fcf7796d57847e35fd20f89a
    BUG: 965434
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5055
    Reviewed-by: Shishir Gowda <sgowda@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
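The actual patch is C inside the distribute translator; as a rough Python sketch of the control flow the commit message describes (illustrative names and signatures, not the real functions):

```python
def dht_discover_complete_sketch(op_ret, inode, layout, inode_ctx):
    # Sketch only: when every discovery lookup fails (e.g. ENOENT
    # because the other client already removed the entry), there is
    # no inode to attach a layout to, so the fix skips the set.
    if op_ret == -1 or inode is None:
        # Before the fix, the layout-set path was still reached here,
        # and the ctx helpers logged "invalid argument: inode" with a
        # full callstack.
        return -1
    # Equivalent of dht_inode_ctx_layout_set on a live inode.
    inode_ctx[inode] = layout
    return 0
```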
