Description of problem:
Volume type: 6x2. DHT throws errors in nfs.log when "rm -rf" is executed from different mount points on two different clients. The mount points are served by different servers of the RHS cluster.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.4rhs-1.el6rhs.x86_64

How reproducible:
The log messages are seen frequently.

Steps to Reproduce:
1. Create a volume using nodes [a, b, c, d] and start it.
2. Mount the volume from node a on client c1 and from node b on client c2.
3. Create a large amount of data in the mount point (use only one mount point for creating data). Function for creating data:

    for i in range(10000):
        os.mkdir(mount_path_nfs + "/" + "%d"%(i))
        for j in range(100):
            os.mkdir(mount_path_nfs + "/" + "%d"%(i) + "/" + "%d"%(j))
            commands.getoutput("touch" + " " + mount_path_nfs + "/" + "%d"%(i) + "/" + "%d"%(j) + "/" + "%d"%(j) + ".file")

4. Start "rm -rf *" on both mount points from step 2.

Actual results:
[2013-05-07 22:46:36.514497] W [nfs3-helpers.c:3475:nfs3_log_readdir_res] 0-nfs-nfsv3: XID: adaf0719, READDIR: NFS: 2(No such file or directory), POSIX: 2(No such file or directory), count: 32768, cverf: 36506500, is_eof: 0
[2013-05-07 22:46:36.515576] W [client-rpc-fops.c:1369:client3_3_access_cbk] 0-dist-rep-client-2: remote operation failed: No such file or directory
[2013-05-07 22:46:36.516238] W [client-rpc-fops.c:1369:client3_3_access_cbk] 0-dist-rep-client-3: remote operation failed: No such file or directory
[2013-05-07 22:46:36.519404] E [dht-helper.c:1065:dht_inode_ctx_get] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_discover_complete+0x421) [0x7f7de3b6f721] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_layout_set+0x4e) [0x7f7de3b5203e] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_inode_ctx_layout_get+0x1b) [0x7f7de3b60cfb]))) 0-dist-rep-dht: invalid argument: inode
[2013-05-07 22:46:36.519434] E [dht-helper.c:1065:dht_inode_ctx_get] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_discover_complete+0x421) [0x7f7de3b6f721] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_layout_set+0x63) [0x7f7de3b52053] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x34) [0x7f7de3b52544]))) 0-dist-rep-dht: invalid argument: inode
[2013-05-07 22:46:36.519524] E [dht-helper.c:1084:dht_inode_ctx_set] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_discover_complete+0x421) [0x7f7de3b6f721] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_layout_set+0x63) [0x7f7de3b52053] (-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x52) [0x7f7de3b52562]))) 0-dist-rep-dht: invalid argument: inode
[2013-05-07 22:46:36.519570] W [nfs3.c:1522:nfs3svc_access_cbk] 0-nfs: aeaf0719: /6863/6 => -1 (Structure needs cleaning)
[2013-05-07 22:46:36.519603] W [nfs3-helpers.c:3391:nfs3_log_common_res] 0-nfs-nfsv3: XID: aeaf0719, ACCESS: NFS: 10006(Error occurred on the server or IO Error), POSIX: 117(Structure needs cleaning)

Expected results:
When files are being deleted from both mount points at the same time, DHT-related errors should not be logged. At most, warnings such as "No such file or directory" are expected, because a file may already have been deleted through the other mount point.

Additional info:
In some cases nfs.log also shows file handle (FH) resolution failures:

[2013-05-07 22:50:05.984856] W [nfs3.c:4080:nfs3svc_readdir_fstat_cbk] 0-nfs: bc310819: <gfid:2a01dbe1-c740-4a10-8209-15ebc48db2e7> => -1 (No such file or directory)
[2013-05-07 22:50:05.984900] W [nfs3-helpers.c:3475:nfs3_log_readdir_res] 0-nfs-nfsv3: XID: bc310819, READDIR: NFS: 2(No such file or directory), POSIX: 2(No such file or directory), count: 32768, cverf: 36506500, is_eof: 0
[2013-05-07 22:50:05.988889] E [nfs3.c:3536:nfs3_rmdir_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.70.35.135:827) dist-rep : 6d2aabe8-93e5-4583-b049-406fd826776c
[2013-05-07 22:50:06.016279] E [nfs3.c:3393:nfs3_remove_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.70.35.135:827) dist-rep : 7faf1da8-3608-4d2e-98ed-06b511588045

I am filing a separate bug for the FH resolution failures, but they are seen during the same kind of operations.
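
For anyone re-running step 3 on a current interpreter, the data-creation loop from the description can be sketched in Python 3 as below. This is only a sketch: the function name create_tree and its parameters are hypothetical, the "touch" call is replaced by opening the file for writing, and the demo writes to a temporary directory instead of an NFS mount.

```python
import os
import tempfile


def create_tree(mount_path, top_dirs=10000, sub_dirs=100):
    """Create top_dirs x sub_dirs directories, each holding one empty file.

    Mirrors the loop in the report; the name and parameters are hypothetical.
    Returns the number of files created.
    """
    created = 0
    for i in range(top_dirs):
        top = os.path.join(mount_path, str(i))
        os.mkdir(top)
        for j in range(sub_dirs):
            leaf = os.path.join(top, str(j))
            os.mkdir(leaf)
            # Stand-in for `touch`: create an empty file named <j>.file.
            with open(os.path.join(leaf, "%d.file" % j), "w"):
                pass
            created += 1
    return created


if __name__ == "__main__":
    # Small demo on a scratch directory; on a real setup, pass the NFS
    # mount path and keep the default counts from the report.
    with tempfile.TemporaryDirectory() as scratch:
        print(create_tree(scratch, top_dirs=3, sub_dirs=2))
```

The defaults match the counts in the report (10000 top-level directories with 100 subdirectories each), so a full run creates one million files and can take a long time over NFS.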
I was able to reproduce the bug with the given steps (1 out of 1 attempts).
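
When re-running the steps, a quick tally of nfs.log entries by severity and logging function may help to separate the DHT errors from the "No such file or directory" warnings. The sketch below assumes the line layout seen in the excerpts in this report ("[timestamp] LEVEL [file:line:function] ..."); the names LOG_RE and tally are hypothetical, and the sample lines are copied from the report.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpts in this report:
# [timestamp] LEVEL [source-file:line:function] rest-of-message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)


def tally(lines):
    """Count log entries per (level, function); unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if match:
            counts[(match.group("level"), match.group("func"))] += 1
    return counts


# Sample entries copied from the report.
SAMPLE = [
    "[2013-05-07 22:46:36.515576] W [client-rpc-fops.c:1369:client3_3_access_cbk] "
    "0-dist-rep-client-2: remote operation failed: No such file or directory",
    "[2013-05-07 22:46:36.516238] W [client-rpc-fops.c:1369:client3_3_access_cbk] "
    "0-dist-rep-client-3: remote operation failed: No such file or directory",
    "[2013-05-07 22:46:36.519404] E [dht-helper.c:1065:dht_inode_ctx_get] "
    "(-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so"
    "(dht_discover_complete+0x421) [0x7f7de3b6f721] "
    "(-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so"
    "(dht_layout_set+0x4e) [0x7f7de3b5203e] "
    "(-->/usr/lib64/glusterfs/3.4.0.4rhs/xlator/cluster/distribute.so"
    "(dht_inode_ctx_layout_get+0x1b) [0x7f7de3b60cfb]))) "
    "0-dist-rep-dht: invalid argument: inode",
    "[2013-05-07 22:46:36.519570] W [nfs3.c:1522:nfs3svc_access_cbk] "
    "0-nfs: aeaf0719: /6863/6 => -1 (Structure needs cleaning)",
]

if __name__ == "__main__":
    for (level, func), count in sorted(tally(SAMPLE).items()):
        print(level, func, count)
```

To scan a real log, pass an open file object to tally(); the log path depends on the installation and is not given in this report.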
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1262.html