Bug 802779

Summary: [glusterfs-3.3.0qa27] - Invalid argument error because of zero exportid
Product: [Community] GlusterFS Reporter: M S Vishwanath Bhat <vbhat>
Component: nfs Assignee: Rajesh <rajesh>
Status: CLOSED UPSTREAM QA Contact:
Severity: medium Docs Contact:
Priority: urgent    
Version: pre-release CC: gluster-bugs, mzywusko, saujain, vagarwal, vbellur, vinaraya
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-04-23 07:46:24 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
Description Flags
nfs log none
nfs log none

Description M S Vishwanath Bhat 2012-03-13 09:44:42 EDT
Description of problem:
I was running dbench on an NFS mount of a 2*2 dist-rep volume and brought down one leg of each replicate pair. dbench failed.

Version-Release number of selected component (if applicable):
glusterfs-3.3.0qa27

How reproducible:
1/1

Steps to Reproduce:
1. Create and start a 2*2 dist-rep volume.
2. From a FUSE mount, run fs_mark in a loop over sync methods 0 through 6.
   
for i in `seq 0 6`; do /opt/qa/tools/fs_mark-3.3/fs_mark -d /mnt/dir1 -D 16 -t 32 -S $i; done

3. Now do an NFS mount and start dbench for 600 seconds with 30 clients.
  
Actual results:
dbench failed, and the logs report exportid=00000000-0000-0000-0000-000000000000:

[2012-03-13 06:04:00.896374] E [nfs3-helpers.c:3768:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:00000000-0000-0000-0000-000000000000>: Invalid argument
[2012-03-13 06:04:00.896393] E [nfs3.c:1357:nfs3_lookup_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.1.11.112:1014) hosdu : 87ebc8a2-aaed-48a7-89cc-5b188cb3e747
[2012-03-13 06:04:00.896431] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 411e787b, LOOKUP: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2012-03-13 06:04:00.936720] I [dht-layout.c:600:dht_layout_normalize] 0-hosdu-dht: found anomalies in <gfid:00000000-0000-0000-0000-000000000000>. holes=1 overlaps=0
[2012-03-13 06:04:00.936768] E [nfs3-helpers.c:3768:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:00000000-0000-0000-0000-000000000000>: Invalid argument
[2012-03-13 06:04:00.936787] E [nfs3.c:752:nfs3_getattr_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.1.11.112:1014) hosdu : 87ebc8a2-aaed-48a7-89cc-5b188cb3e747
[2012-03-13 06:04:00.936803] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 421e787b, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2012-03-13 06:04:00.939235] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Failed to map FH to vol: client=10.1.11.112:1014, exportid=00000000-0000-0000-0000-000000000000, gfid=3e2035ac-7dab-43d4-9f53-dd9b93083f2d
[2012-03-13 06:04:00.939268] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Stale nfs client 10.1.11.112:1014 must be trying to connect to a deleted volume, please unmount it.
[2012-03-13 06:04:00.939289] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 481e787b, ACCESS: NFS: 70(Invalid file handle), POSIX: 14(Bad address)
[2012-03-13 06:04:00.940758] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Failed to map FH to vol: client=10.1.11.112:1014, exportid=00000000-0000-0000-0000-000000000000, gfid=3e2035ac-7dab-43d4-9f53-dd9b93083f2d
[2012-03-13 06:04:00.940786] E [nfs3.c:1549:nfs3_access] 0-nfs-nfsv3: Stale nfs client 10.1.11.112:1014 must be trying to connect to a deleted volume, please unmount it.
[2012-03-13 06:04:00.940808] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 4c1e787b, ACCESS: NFS: 70(Invalid file handle), POSIX: 14(Bad address)
[2012-03-13 06:04:00.945106] E [nfs3.c:810:nfs3_getattr] 0-nfs-nfsv3: Failed to map FH to vol: client=10.1.11.112:1014, exportid=00000000-0000-0000-0000-000000000000, gfid=3e2035ac-7dab-43d4-9f53-dd9b93083f2d
[2012-03-13 06:04:00.945136] E [nfs3.c:810:nfs3_getattr] 0-nfs-nfsv3: Stale nfs client 10.1.11.112:1014 must be trying to connect to a deleted volume, please unmount it.
[2012-03-13 06:04:00.945158] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 541e787b, GETATTR: NFS: 70(Invalid file handle), POSIX: 14(Bad address)
[2012-03-13 06:04:00.945399] E [nfs3.c:305:__nfs3_get_volume_id] (-->/usr/local/lib/glusterfs/3.3.0qa27/xlator/nfs/server.so(nfs3_getattr+0x55b) [0x7f2af3f2a6a9] (-->/usr/local/lib/glusterfs/3.3.0qa27/xlator/nfs/server.so(nfs3_getattr_reply+0x37) [0x7f2af3f29bee] (-->/usr/local/lib/glusterfs/3.3.0qa27/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0xb1) [0x7f2af3f29b35]))) 0-nfs-nfsv3: invalid argument: xl

I have attached the nfs log.
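
For anyone re-checking this symptom, here is a minimal sketch that counts the lines in an nfs log carrying the all-zero exportid seen above. The function name count_zero_exportid and the log path passed to it are hypothetical and not part of GlusterFS; only the exportid=... pattern is taken from the log excerpt in this report.

```shell
#!/bin/sh
# Sketch only: count log lines that report the all-zero exportid.
# The function name and its argument are illustrative assumptions.
ZERO_UUID='00000000-0000-0000-0000-000000000000'

count_zero_exportid() {
    # $1: path to the nfs log file to scan (supplied by the caller).
    if [ ! -r "$1" ]; then
        echo "cannot read log file: $1" >&2
        return 2
    fi
    # grep -c prints the number of matching lines (0 if none) but exits
    # non-zero on zero matches; "|| true" keeps that from being an error.
    grep -c "exportid=${ZERO_UUID}" "$1" || true
}
```

Calling count_zero_exportid with the path to the nfs log prints the number of matching lines. A count of zero after a rerun would suggest the zero-exportid symptom is gone, though it says nothing about other dbench failures.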
Comment 1 M S Vishwanath Bhat 2012-03-13 10:20:23 EDT
Created attachment 569688 [details]
nfs log
Comment 2 Saurabh 2012-04-16 08:54:43 EDT
Tested the patch provided by Rajesh, in the same fashion as the problem was found, and dbench finished properly.
Comment 3 Anand Avati 2012-04-16 14:19:11 EDT
CHANGE: http://review.gluster.com/3150 (nfs/server: hard resolve fh on restart) merged in master by Vijay Bellur (vijay@gluster.com)
Comment 4 M S Vishwanath Bhat 2012-04-19 06:56:12 EDT
With glusterfs-3.3.0qa35, dbench still exited with an error, but the zero exportid error was not found in the logs.
Comment 5 Rajesh 2012-04-19 07:09:14 EDT
Can you post the logs relevant to the dbench errors? When I tested, dbench completed successfully, and Saurabh also had a successful result.
Comment 6 M S Vishwanath Bhat 2012-04-19 08:39:18 EDT
Created attachment 578632 [details]
nfs log

Attaching the nfs log...
Comment 7 M S Vishwanath Bhat 2012-04-23 07:46:24 EDT
The same behavior is seen even on a FUSE mount. I'm closing the bug since the zero exportid error in NFS is fixed and verified.