960835 – nfs:Unable to resolve FH(READDIR issue)

Summary: nfs:Unable to resolve FH(READDIR issue)

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterd
Sub Component:
Version:	2.1
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	urgent
Target Milestone:	---
Target Release:	---
Assignee:	rjoseph
QA Contact:	Saurabh
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	965435

Reported:	2013-05-08 06:01 UTC by Saurabh
Modified:	2016-01-19 06:11 UTC
CC List:	6 users
Fixed In Version:	glusterfs-3.4.0.10rhs
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	965435
Environment:
Last Closed:	2013-09-23 22:39:41 UTC
Embargoed:
Dependent Products:

Description Saurabh 2013-05-08 06:01:01 UTC

Description of problem:
Volume type: 6x2

The errors appear when rm -rf is run at the same time from the mount points on two different clients.
The two mount points are served by different servers of the RHS cluster.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.4rhs-1.el6rhs.x86_64

How reproducible:
The messages are logged many times.

Steps to Reproduce:
1. Create a volume using nodes [a, b, c, d] and start it.
2. Mount the volume from node a on client c1 and from node b on client c2.
3. Create a large amount of data in the mount point (use only one mount point for creating data).

function for creating data (Python):
    import os

    def create_data(mount_path_nfs):
        # 10000 directories, each with 100 subdirectories holding one empty file.
        for i in range(10000):
            os.mkdir(os.path.join(mount_path_nfs, str(i)))
            for j in range(100):
                subdir = os.path.join(mount_path_nfs, str(i), str(j))
                os.mkdir(subdir)
                open(os.path.join(subdir, "%d.file" % j), "a").close()  # like "touch"

4. Now start "rm -rf *" on both mount points mentioned in step 2.

Actual results:

Sometimes the following errors appear in nfs.log:
[2013-05-07 22:50:05.984856] W [nfs3.c:4080:nfs3svc_readdir_fstat_cbk] 0-nfs: bc310819: <gfid:2a01dbe1-c740-4a10-8209-15ebc48db2e7> => -1 (No such file or directory)
[2013-05-07 22:50:05.984900] W [nfs3-helpers.c:3475:nfs3_log_readdir_res] 0-nfs-nfsv3: XID: bc310819, READDIR: NFS: 2(No such file or directory), POSIX: 2(No such file or directory), count: 32768, cverf: 36506500, is_eof: 0
[2013-05-07 22:50:05.988889] E [nfs3.c:3536:nfs3_rmdir_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.70.35.135:827) dist-rep : 6d2aabe8-93e5-4583-b049-406fd826776c
[2013-05-07 22:50:06.016279] E [nfs3.c:3393:nfs3_remove_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.70.35.135:827) dist-rep : 7faf1da8-3608-4d2e-98ed-06b511588045
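
When triaging an nfs.log for this message, it can help to count which NFSv3 handlers report it. The following is a minimal sketch, assuming every log line follows the layout of the samples above. The regular expression and the name count_unresolved_fh are illustrative only and are not part of GlusterFS.

```python
import re
from collections import Counter

# Assumed layout, taken from the sample lines in this report:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def count_unresolved_fh(lines):
    """Count "Unable to resolve FH" messages per reporting function."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match and match.group("msg").startswith("Unable to resolve FH"):
            counts[match.group("func")] += 1
    return counts


# Two of the lines quoted in this report, used as sample input.
SAMPLE = [
    "[2013-05-07 22:50:05.988889] E [nfs3.c:3536:nfs3_rmdir_resume] "
    "0-nfs-nfsv3: Unable to resolve FH: (10.70.35.135:827) dist-rep : "
    "6d2aabe8-93e5-4583-b049-406fd826776c",
    "[2013-05-07 22:50:06.016279] E [nfs3.c:3393:nfs3_remove_resume] "
    "0-nfs-nfsv3: Unable to resolve FH: (10.70.35.135:827) dist-rep : "
    "7faf1da8-3608-4d2e-98ed-06b511588045",
]

if __name__ == "__main__":
    for func, count in sorted(count_unresolved_fh(SAMPLE).items()):
        print(func, count)
```

Lines that do not match the assumed layout are skipped rather than reported, so a count of zero may also mean the log format differs from the samples.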

Expected results:

FH resolution failures are not expected.

Additional info:
One more issue was found during these operations; it is filed as BZ 960834.

Comment 3 rjoseph 2013-05-20 11:39:42 UTC

The problem is reproduced.

The error is logged whenever the NFS server gets an rmdir or remove call for a file or directory that has already been deleted.

Comment 4 Ben Turner 2013-05-20 17:53:32 UTC

I opened https://bugzilla.redhat.com/show_bug.cgi?id=901723 during Anshi testing and it appears to be the same issue as this.  Should we close 901723 as a DUP of this?

Comment 5 rjoseph 2013-05-21 11:38:53 UTC

Ben:

The "Unable to resolve FH" error is logged whenever the server fails to resolve the FH (file handle) for a file. In this bug, rmdir is called from two machines. While deleting a file or directory, one of the machines may therefore see this error because the other machine may already have deleted that entry.

But in your bug comment I see that you are trying to delete a file (f5) from the client machine, and the file is not present on any of the bricks. I think that should be looked into separately.
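
The race described in this comment can be sketched on a local filesystem. This is only an illustration under stated assumptions: it uses a temporary local directory instead of an NFS mount of a Gluster volume, the two "clients" are sequential steps in one process, and the helper names remove_listed_entries and simulate_race are made up for the example. It shows one remover working from a directory listing that the other remover has already invalidated.

```python
import errno
import os
import tempfile


def remove_listed_entries(directory, names):
    """Remove each name from an earlier listing.

    Returns the names that were already gone (ENOENT), which is the
    situation a second remover runs into.
    """
    already_gone = []
    for name in names:
        try:
            os.remove(os.path.join(directory, name))
        except OSError as exc:
            if exc.errno != errno.ENOENT:
                raise
            already_gone.append(name)
    return already_gone


def simulate_race():
    with tempfile.TemporaryDirectory() as directory:
        for name in ("0.file", "1.file", "2.file"):
            open(os.path.join(directory, name), "w").close()
        # "Client 1" reads the directory listing first.
        listing = sorted(os.listdir(directory))
        # "Client 2" deletes one entry before client 1 reaches it.
        os.remove(os.path.join(directory, "1.file"))
        # Client 1 now removes entries from its stale listing.
        return remove_listed_entries(directory, listing)


if __name__ == "__main__":
    print(simulate_race())
```

In this sketch the second remover treats ENOENT as a benign outcome and carries on, which mirrors what "rm -rf" does on the client side when an entry disappears underneath it.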

Comment 9 Scott Haines 2013-09-23 22:39:41 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

