Description of problem:
When one of the nodes becomes read-only and then goes back to read-write, some of the directories become inaccessible. An `rm -rf' on them neither removes the directory nor shows any error on the mount point. The log shows the following warnings:

[2012-03-25 21:02:54.151641] W [nfs3.c:3524:nfs3svc_rmdir_cbk] 0-nfs: 57a38c4d: /foomati/linux-3.2.11 => -1 (No such file or directory)
[2012-03-25 21:03:48.894359] W [client3_1-fops.c:423:client3_1_stat_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:03:48.896717] W [client3_1-fops.c:423:client3_1_stat_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:03:48.897653] W [client3_1-fops.c:423:client3_1_stat_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:03:52.239541] W [client3_1-fops.c:423:client3_1_stat_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:12:23.150555] W [client3_1-fops.c:423:client3_1_stat_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:12:24.178276] W [client3_1-fops.c:423:client3_1_stat_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:12:24.179653] W [client3_1-fops.c:1097:client3_1_access_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:12:24.180271] W [client3_1-fops.c:879:client3_1_getxattr_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory. Path: (null)
[2012-03-25 21:12:24.180521] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-0: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:24.180701] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-1: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:24.180744] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-3: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:24.180773] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-2: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:24.180787] I [dht-layout.c:600:dht_layout_normalize] 0-nfs-test-2-dht: found anomalies in /linux-1. holes=1 overlaps=0
[2012-03-25 21:12:24.180873] W [nfs3.c:1492:nfs3svc_access_cbk] 0-nfs: 8964974d: /linux-1 => -1 (Structure needs cleaning)
[2012-03-25 21:12:24.180905] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 8964974d, ACCESS: NFS: 10006(Error occurred on the server or IO Error), POSIX: 117(Structure needs cleaning)
[2012-03-25 21:12:24.181426] W [client3_1-fops.c:2081:client3_1_opendir_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory. Path: /linux-1
[2012-03-25 21:12:24.182678] W [nfs3.c:3524:nfs3svc_rmdir_cbk] 0-nfs: 8d64974d: /linux-1 => -1 (No such file or directory)
[2012-03-25 21:12:32.189148] W [client3_1-fops.c:423:client3_1_stat_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:12:32.190011] W [client3_1-fops.c:1097:client3_1_access_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory
[2012-03-25 21:12:32.190687] W [client3_1-fops.c:879:client3_1_getxattr_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory. Path: (null)
[2012-03-25 21:12:32.191005] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-0: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:32.191125] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-1: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:32.191215] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-3: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:32.191243] W [client3_1-fops.c:2157:client3_1_lookup_cbk] 0-nfs-test-2-client-2: remote operation failed: Invalid argument. Path: /linux-1
[2012-03-25 21:12:32.191255] I [dht-layout.c:600:dht_layout_normalize] 0-nfs-test-2-dht: found anomalies in /linux-1. holes=1 overlaps=0
[2012-03-25 21:12:32.191321] W [nfs3.c:1492:nfs3svc_access_cbk] 0-nfs: c88b974d: /linux-1 => -1 (Structure needs cleaning)
[2012-03-25 21:12:32.191354] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: c88b974d, ACCESS: NFS: 10006(Error occurred on the server or IO Error), POSIX: 117(Structure needs cleaning)
[2012-03-25 21:12:32.191863] W [client3_1-fops.c:2081:client3_1_opendir_cbk] 0-nfs-test-2-client-0: remote operation failed: No such file or directory. Path: /linux-1
[2012-03-25 21:12:32.193125] W [nfs3.c:3524:nfs3svc_rmdir_cbk] 0-nfs: cc8b974d: /linux-1 => -1 (No such file or directory)
[2012-03-25 21:13:31.605355] I [dht-layout.c:600:dht_layout_normalize] 0-nfs-test-2-dht: found anomalies in <gfid:e82d0e2c-e8d6-4596-97b7-5d2669a79f2e>. holes=1 overlaps=0
[2012-03-25 21:13:31.605904] E [nfs3-helpers.c:3603:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:e82d0e2c-e8d6-4596-97b7-5d2669a79f2e>: Invalid argument

Steps to Reproduce:
1. Create a volume, set up two or more NFS mounts, and keep running I/O on them (for example kernel extraction, fsx tests, etc.).
2. Remount the backend FS read-only. The mount starts throwing I/O errors (note: not read-only FS errors). Let the extraction continue. Now try to extract into another directory, and ignore the errors for a while.
3. Remount the backend FS read-write and retry the above operations; they still fail. Then try to rm -rf the directories; the behavior described above is seen.

Additional info:
While the FS is read-only, do a FUSE mount and extract the same tar file. No errors are thrown and the extraction appears to proceed, but no files or directories are created.

================ FUSE BEHAVIOR ======================
[root@gqac009 fuse-0]# ls -l -a foomati
total 156
drwxr-xr-x. 5 root root 129 Mar 25 2012 .
drwxr-xr-x. 21 root root 159744 Mar 25 2012 ..
[root@gqac009 fuse-0]# rm -rf foomati
rm: cannot remove `foomati': Directory not empty
[root@gqac009 fuse-0]#
================ FUSE BEHAVIOR ======================

Re-mounting the NFS/FUSE client does not solve the issue.
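For triage, it can help to tally which callbacks are failing and on which translator rather than reading the raw log. The following is a minimal sketch in Python; the line format it assumes is inferred only from the entries quoted in this report, and the names `LOG_LINE`, `parse_line`, `tally`, and `SAMPLE` are illustrative, not part of any GlusterFS tooling.

```python
import re
from collections import Counter

# Assumed line shape (inferred from the log entries quoted in this report):
#   [timestamp] LEVEL [file:line:function] translator: message
LOG_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<xlator>[^:]+):\s+(?P<msg>.*)"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def tally(lines):
    """Count entries per (level, function, translator), skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[(entry["level"], entry["func"], entry["xlator"])] += 1
    return counts


# Three entries copied from the log in this report.
SAMPLE = [
    "[2012-03-25 21:12:24.180521] W [client3_1-fops.c:2157:client3_1_lookup_cbk] "
    "0-nfs-test-2-client-0: remote operation failed: Invalid argument. Path: /linux-1",
    "[2012-03-25 21:12:24.180787] I [dht-layout.c:600:dht_layout_normalize] "
    "0-nfs-test-2-dht: found anomalies in /linux-1. holes=1 overlaps=0",
    "[2012-03-25 21:12:24.182678] W [nfs3.c:3524:nfs3svc_rmdir_cbk] "
    "0-nfs: 8d64974d: /linux-1 => -1 (No such file or directory)",
]

if __name__ == "__main__":
    for key, count in tally(SAMPLE).most_common():
        print(count, *key)
```

Feeding the full NFS log through `tally` groups the repeated stat/lookup failures per client translator, which makes it easier to see whether only one brick's client is affected.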
Created attachment 572688
Contains the FUSE log, NFS log, and brick log, bzipped.
[2012-03-25 21:12:32.191255] I [dht-layout.c:600:dht_layout_normalize] 0-nfs-test-2-dht: found anomalies in /linux-1. holes=1 overlaps=0

It looks like distribute (DHT) is seeing holes in the directory layout.
Looked at the logs: NFS is getting errors from DHT because the layout has holes. Reassigning the bug to DHT.
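To illustrate what "holes=1 overlaps=0" means, here is a simplified Python sketch of counting gaps and overlaps in a set of hash ranges that are supposed to cover a 32-bit hash space. It is not the actual `dht_layout_normalize` code; the function name `find_anomalies`, the `(start, stop)` tuple representation, and the use of `None` for a subvolume that contributes no range are all assumptions made for the example.

```python
HASH_MAX = 0xFFFFFFFF  # assumed 32-bit hash space


def find_anomalies(ranges):
    """Count holes and overlaps in a list of inclusive (start, stop) ranges.

    A subvolume that contributes no range is passed as None (hypothetical
    convention for this sketch). Returns (holes, overlaps).
    """
    present = sorted(r for r in ranges if r is not None)
    holes = 0
    overlaps = 0
    expected = 0  # next hash value that should be covered
    for start, stop in present:
        if start > expected:
            holes += 1      # gap between the previous range and this one
        elif start < expected:
            overlaps += 1   # this range begins inside an earlier one
        expected = max(expected, stop + 1)
    if expected <= HASH_MAX:
        holes += 1          # tail of the hash space is uncovered
    return holes, overlaps


if __name__ == "__main__":
    full = [
        (0x00000000, 0x3FFFFFFF),
        (0x40000000, 0x7FFFFFFF),
        (0x80000000, 0xBFFFFFFF),
        (0xC0000000, 0xFFFFFFFF),
    ]
    print(find_anomalies(full))               # complete layout
    print(find_anomalies([None] + full[1:]))  # one subvolume missing its range
```

In this model, a four-way layout in which one subvolume reports no range yields one hole and no overlaps, which is the same shape of result as the log line above. Whether that is what happened on this volume is not established by the report.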
CHANGE: http://review.gluster.com/3327 (cluster/dht: Handle ENOENT failure in dht_rmdir_opendir_cbk) merged in master by Anand Avati (avati)
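The patch itself is not quoted in this report, so the following Python sketch only illustrates the general idea its title suggests: when a directory is opened on every subvolume as part of rmdir, an ENOENT from one subvolume could be treated as "already absent there" instead of failing the whole operation. The function name `rmdir_should_proceed` and the `(ok, err)` result shape are hypothetical and do not reflect the real `dht_rmdir_opendir_cbk` code.

```python
import errno


def rmdir_should_proceed(opendir_results):
    """Decide whether an rmdir can continue after opendir on each subvolume.

    opendir_results: list of (ok, err) pairs, one per subvolume
    (hypothetical shape for this sketch).
    Returns (proceed, errno_to_report).
    """
    for ok, err in opendir_results:
        if ok:
            continue
        if err == errno.ENOENT:
            # Assumption: the directory is already missing on this
            # subvolume, so there is nothing to remove there.
            continue
        # Any other failure is treated as fatal for the rmdir.
        return False, err
    return True, 0


if __name__ == "__main__":
    print(rmdir_should_proceed([(True, 0), (False, errno.ENOENT)]))
    print(rmdir_should_proceed([(True, 0), (False, errno.EROFS)]))
```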