Description of problem:
Was running dbench on an nfs mount of a 2*2 dist-rep volume. Brought down one leg of each replicate pair. dbench failed.

Version-Release number of selected component (if applicable):
glusterfs-3.3.0qa27

How reproducible:
1/1

Steps to Reproduce:
1. Create and start a 2*2 dist-rep volume.
2. From a fuse mount, run fs_mark in a loop over each sync type (0 through 6):
   for i in `seq 0 6`; do /opt/qa/tools/fs_mark-3.3/fs_mark -d /mnt/dir1 -D 16 -t 32 -S $i; done
3. Now do an nfs mount and start dbench for 600 secs with 30 clients.
4. While dbench is running, bring down one leg of each replicate pair.

Actual results:
dbench failed, and the nfs log shows exportid=00000000-0000-0000-0000-000000000000:

[2012-03-13 06:04:00.896374] E [nfs3-helpers.c:3768:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:00000000-0000-0000-0000-000000000000>: Invalid argument
[2012-03-13 06:04:00.896393] E [nfs3.c:1357:nfs3_lookup_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.1.11.112:1014) hosdu : 87ebc8a2-aaed-48a7-89cc-5b188cb3e747
[2012-03-13 06:04:00.896431] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 411e787b, LOOKUP: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2012-03-13 06:04:00.936720] I [dht-layout.c:600:dht_layout_normalize] 0-hosdu-dht: found anomalies in <gfid:00000000-0000-0000-0000-000000000000>. holes=1 overlaps=0
[2012-03-13 06:04:00.936768] E [nfs3-helpers.c:3768:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:00000000-0000-0000-0000-000000000000>: Invalid argument
[2012-03-13 06:04:00.936787] E [nfs3.c:752:nfs3_getattr_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.1.11.112:1014) hosdu : 87ebc8a2-aaed-48a7-89cc-5b188cb3e747
[2012-03-13 06:04:00.936803] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 421e787b, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2012-03-13 06:04:00.939235] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Failed to map FH to vol: client=10.1.11.112:1014, exportid=00000000-0000-0000-0000-000000000000, gfid=3e2035ac-7dab-43d4-9f53-dd9b93083f2d
[2012-03-13 06:04:00.939268] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Stale nfs client 10.1.11.112:1014 must be trying to connect to a deleted volume, please unmount it.
[2012-03-13 06:04:00.939289] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 481e787b, ACCESS: NFS: 70(Invalid file handle), POSIX: 14(Bad address)
[2012-03-13 06:04:00.940758] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Failed to map FH to vol: client=10.1.11.112:1014, exportid=00000000-0000-0000-0000-000000000000, gfid=3e2035ac-7dab-43d4-9f53-dd9b93083f2d
[2012-03-13 06:04:00.940786] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Stale nfs client 10.1.11.112:1014 must be trying to connect to a deleted volume, please unmount it.
[2012-03-13 06:04:00.940808] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 4c1e787b, ACCESS: NFS: 70(Invalid file handle), POSIX: 14(Bad address)
[2012-03-13 06:04:00.945106] E [nfs3.c:810:nfs3_getattr] 0-nfs-nfsv3: Failed to map FH to vol: client=10.1.11.112:1014, exportid=00000000-0000-0000-0000-000000000000, gfid=3e2035ac-7dab-43d4-9f53-dd9b93083f2d
[2012-03-13 06:04:00.945136] E [nfs3.c:810:nfs3_getattr] 0-nfs-nfsv3: Stale nfs client 10.1.11.112:1014 must be trying to connect to a deleted volume, please unmount it.
[2012-03-13 06:04:00.945158] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 541e787b, GETATTR: NFS: 70(Invalid file handle), POSIX: 14(Bad address)
[2012-03-13 06:04:00.945399] E [nfs3.c:305:__nfs3_get_volume_id] (-->/usr/local/lib/glusterfs/3.3.0qa27/xlator/nfs/server.so(nfs3_getattr+0x55b) [0x7f2af3f2a6a9] (-->/usr/local/lib/glusterfs/3.3.0qa27/xlator/nfs/server.so(nfs3_getattr_reply+0x37) [0x7f2af3f29bee] (-->/usr/local/lib/glusterfs/3.3.0qa27/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0xb1) [0x7f2af3f29b35]))) 0-nfs-nfsv3: invalid argument: xl

I have attached the nfs log.
Created attachment 569688: nfs log
Tested the patch provided by Rajesh in the same way the problem was originally found, and dbench finished properly.
CHANGE: http://review.gluster.com/3150 (nfs/server: hard resolve fh on restart) merged in master by Vijay Bellur (vijay)
With glusterfs-3.3.0qa35, dbench still exited with an error, but the zero-exportid error was not found in the nfs log.
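For anyone re-verifying, the check for the zero-exportid error can be scripted rather than done by eye. The following is only a minimal sketch: it assumes the `exportid=<uuid>` field appears in the log exactly as in the "Failed to map FH to vol" messages quoted in this report, and the function names `find_zero_exportid` and `scan_file` are hypothetical helpers, not part of glusterfs.

```python
import re

# All-zero UUID, as seen in the failing log entries of this report.
ZERO_UUID = "00000000-0000-0000-0000-000000000000"

# Assumption: the field is written as "exportid=<36-char uuid>", matching
# the "Failed to map FH to vol" lines quoted in this bug.
EXPORTID_RE = re.compile(r"exportid=([0-9a-fA-F-]{36})")


def find_zero_exportid(lines):
    """Return the log lines whose exportid field is the all-zero UUID."""
    hits = []
    for line in lines:
        match = EXPORTID_RE.search(line)
        if match and match.group(1) == ZERO_UUID:
            hits.append(line.rstrip("\n"))
    return hits


def scan_file(path):
    """Scan one log file; the path is supplied by the caller."""
    with open(path, errors="replace") as fh:
        return find_zero_exportid(fh)
```

Calling `scan_file()` on the nfs log of a test run returns an empty list when no zero-exportid entries were logged. The log location depends on the install, so it is left to the caller.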
Can you post the logs relevant to the dbench errors? When I tested, dbench completed successfully, and Saurabh also had a successful result.
Created attachment 578632: nfs log
Attaching the nfs log...
The same behavior is seen on the fuse mount as well. I'm closing the bug since the zero-exportid error in nfs is fixed and verified.