Bug 901723
Summary: | gnfs: E [nfs3.c:1545:nfs3_access_resume] 0-nfs-nfsv3: Unable to resolve FH: seen during ltp on 6.3 client. | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Ben Turner <bturner>
Component: | glusterd | Assignee: | santosh pradhan <spradhan>
Status: | CLOSED ERRATA | QA Contact: | Ben Turner <bturner>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 2.0 | CC: | bturner, grajaiya, kkeithle, rhs-bugs, saujain, shaines, vagarwal, vbellur
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2013-09-23 22:39:24 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Ben Turner
2013-01-18 21:44:56 UTC
Created attachment 682764
Sosreport from storage1
Created attachment 682765
Sosreport from storage2
Created attachment 682766
Sosreport from storage3(client)
I tested this on every client version from 5.6 to 6.4 and saw this behavior on all of them.

Today I reran ltp to see if I could get to the bottom of which test was causing the errors. I haven't found which test produces the "Unable to resolve FH" error, but I am seeing some really strange behavior with:

    time $LTP_DIR/fsstress/fsstress -d /gluster-mount -l 22 -n 22 -p 22

When I run it, warnings spam the logs:

    [2013-01-29 16:40:50.485398] W [client3_1-fops.c:187:client3_1_symlink_cbk] 0-DISTRIBUTED-client-1: remote operation failed: File name too long. Path: /p8/d3/l9 (00000000-0000-0000-0000-000000000000)
    [2013-01-29 16:40:50.485436] W [nfs3.c:2939:nfs3svc_symlink_cbk] 0-nfs: 9646c2ba: /p8/d3/l9 => -1 (File name too long)
    [2013-01-29 16:40:50.486733] W [client3_1-fops.c:187:client3_1_symlink_cbk] 0-DISTRIBUTED-client-1: remote operation failed: File name too long. Path: /p8/d3/l9 (00000000-0000-0000-0000-000000000000)
    [2013-01-29 16:40:50.486765] W [nfs3.c:2939:nfs3svc_symlink_cbk] 0-nfs: 9746c2ba: /p8/d3/l9 => -1 (File name too long)

I picked one example and looked at it:

    [2013-01-29 16:40:50.197716] W [nfs3.c:3391:nfs3svc_remove_cbk] 0-nfs: 1f42c2ba: /run1089/p7/d3/f5 => -1 (No such file or directory)

On the gluster mount I cd to the directory:

    [root@storage-qe04 d3]# pwd
    /gluster-mount/run1089/p7/d3

And I try to remove the file:

    [root@storage-qe04 d3]# rm f5
    rm: remove regular file `f5'? y
    rm: cannot remove `f5': No such file or directory

Now I check ll and I still see the file:

    [root@storage-qe04 d3]# ll
    total 0
    -rw-rw-rw-. 1 root root 579411 Jan 29 16:24 f5

I tried unmounting and remounting the FS and still saw the same thing:

    [root@storage-qe04 gluster-mount]# cd /gluster-mount/run1089/p7/d3
    [root@storage-qe04 d3]# ls
    f5
    [root@storage-qe04 d3]# rm f5
    rm: remove regular file `f5'? y
    rm: cannot remove `f5': No such file or directory

So I went to the backend bricks and looked:

    [root@storage-qe01 d3]# pwd
    /brick1/run1089/p7/d3
    [root@storage-qe01 d3]# ll
    total 0

    [root@storage-qe02 d3]# pwd
    /brick1/run1089/p7/d3
    [root@storage-qe02 d3]# ll
    total 0

The file was not on either brick but was still showing on the client. I went ahead and mounted from a different client:

    [root@storage-qe12 ~]# mount -t nfs -o mountproto=tcp,vers=3 storage-qe01.lab.eng.rdu2.redhat.com:/DISTRIBUTED $(mkdir /test-mount; echo /test-mount)
    [root@storage-qe12 ~]# cd /test-mount/run1089/p7/d3
    [root@storage-qe12 d3]# ll
    total 0
    -rw-rw-rw-. 1 root root 579411 Jan 29 16:24 f5

The file exists even on a client that is mounting for the first time. I am pretty sure that the ltp testcase that is causing the FH error is the same one I am running, but after executing the whole testsuite I don't see the FH error again. I will try running just fsstress tomorrow to see if I hit the FH error.

Hi Ben,

1. The "Unable to resolve FH" error is addressed as part of BZ 960835. The fix is available in the latest RHS-2.1 build (bigbend).

2. The "File name too long" message in the log is expected because the underlying file system (XFS or ext2/3/4) does not support file names longer than 256 chars. The tool is trying to create a symlink of 1024 chars, which is rejected by the symlink() syscall. That is OK.

I could not reproduce the issue in the 3.4.0.13rhs-1 build. Could you confirm?

Thanks,
Santosh

Verified that the FH issue is resolved on glusterfs-3.4.0.18rhs-1.el6rhs.x86_64.

Thanks Ben

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
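
Triage note: when fsstress floods the NFS log with warnings like the ones quoted in this report, it can help to tally the failed callbacks by function and error string before picking one to investigate. The following is a minimal Python sketch of that idea. The function name `tally_failures` and the regular expression are hypothetical and were inferred only from the `nfs3svc_*_cbk` lines quoted in this bug, not from any GlusterFS log format specification, so other log lines may need a different pattern.

```python
import re
from collections import Counter

# Assumption: matches only gNFS callback warnings shaped like the ones quoted
# in this report, e.g.
# [2013-01-29 16:40:50.485436] W [nfs3.c:2939:nfs3svc_symlink_cbk] 0-nfs: 9646c2ba: /p8/d3/l9 => -1 (File name too long)
NFS_CBK_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"\S+:\s+(?P<xid>[0-9a-f]+):\s+(?P<path>\S+)\s+=>\s+-1\s+\((?P<err>[^)]+)\)"
)


def tally_failures(log_text):
    """Count failed NFS callbacks by (callback function, error string)."""
    counts = Counter()
    for match in NFS_CBK_RE.finditer(log_text):
        counts[(match.group("func"), match.group("err"))] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from this report; the client3_1 line is skipped
    # because it does not follow the "xid: path => -1 (error)" shape.
    sample = (
        "[2013-01-29 16:40:50.485398] W [client3_1-fops.c:187:client3_1_symlink_cbk] "
        "0-DISTRIBUTED-client-1: remote operation failed: File name too long. "
        "Path: /p8/d3/l9 (00000000-0000-0000-0000-000000000000)\n"
        "[2013-01-29 16:40:50.485436] W [nfs3.c:2939:nfs3svc_symlink_cbk] "
        "0-nfs: 9646c2ba: /p8/d3/l9 => -1 (File name too long)\n"
        "[2013-01-29 16:40:50.486765] W [nfs3.c:2939:nfs3svc_symlink_cbk] "
        "0-nfs: 9746c2ba: /p8/d3/l9 => -1 (File name too long)\n"
        "[2013-01-29 16:40:50.197716] W [nfs3.c:3391:nfs3svc_remove_cbk] "
        "0-nfs: 1f42c2ba: /run1089/p7/d3/f5 => -1 (No such file or directory)\n"
    )
    for (func, err), n in tally_failures(sample).most_common():
        print(f"{n:4d}  {func}: {err}")
```

Grouping this way separates the expected symlink rejections from the less common failures, such as the `nfs3svc_remove_cbk` "No such file or directory" case examined above.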