Bug 763129 (GLUSTER-1397) - Cached dir fd_ts are a leakin'
Summary: Cached dir fd_ts are a leakin'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-1397
Product: GlusterFS
Classification: Community
Component: nfs
Version: 3.1-alpha
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Shehjar Tikoo
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-08-19 07:09 UTC by Shehjar Tikoo
Modified: 2015-12-01 16:45 UTC (History)
3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Shehjar Tikoo 2010-08-19 07:09:49 UTC
Problem description from Krishna:
If a directory contains more than 9-10 entries, the NFS server does not
close the directory fd_t. Because of this, the backend FS cannot be
unmounted (it fails with EBUSY).


Confirmed through the log below, which shows successive NFSv3 READDIR calls taking an increasing number of references on the directory fd:

shehjart@indus:~$ grep "call_state_wipe.*fd ref" /tmp/dirfd2.log 
[2010-08-19 12:22:59.590367] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 3
[2010-08-19 12:22:59.591897] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 3
[2010-08-19 12:22:59.593443] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 4
[2010-08-19 12:22:59.594955] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 5
[2010-08-19 12:22:59.596459] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 6
[2010-08-19 12:22:59.597946] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 7
[2010-08-19 12:22:59.599418] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 8
[2010-08-19 12:22:59.600871] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 9
[2010-08-19 12:22:59.602312] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 10
[2010-08-19 12:22:59.603772] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 11
[2010-08-19 12:22:59.604958] T [nfs3.c:202:nfs3_call_state_wipe] nfs-nfsv3: fd ref: 11


These are the refcounts on the fd just before it is unreffed as part of freeing the per-NFS-op state.

Comment 1 Anand Avati 2010-08-19 07:46:44 UTC
PATCH: http://patches.gluster.com/patch/4204 in master (protocol/client: fix ESTALE in statfs on root inode)

Comment 2 Amar Tumballi 2010-08-19 08:12:24 UTC
(In reply to comment #1)
> PATCH: http://patches.gluster.com/patch/4204 in master (protocol/client: fix
> ESTALE in statfs on root inode)

Sorry about the confusion.. this patch should have been for bug 763130.. This bug is not yet fixed.

Comment 3 Shehjar Tikoo 2010-08-20 10:22:51 UTC
The leaks are also present in the hard fh resolution code, where directories are opened and read.

Comment 4 Vijay Bellur 2010-08-31 11:44:15 UTC
PATCH: http://patches.gluster.com/patch/4416 in master (nfs3: Dont ref cached fd after fd_lookup)

Comment 5 Vijay Bellur 2010-08-31 11:44:20 UTC
PATCH: http://patches.gluster.com/patch/4417 in master (nfs3: Dont ref dir fd_t used in hard fh resolution)

Comment 6 Vijay Bellur 2010-08-31 11:44:26 UTC
PATCH: http://patches.gluster.com/patch/4418 in master (nfs3: Unref dir fd once usage ends in hard fh resolution)

Comment 7 Shehjar Tikoo 2010-09-01 02:37:05 UTC
Regression Test:

1. Start nfs export as:

posix->proto/server->proto/client->nfs/server

in the same volume file.

2. At the client, mount as:

mount -o soft,intr,actimeo=3600 <server>:/posix /mnt

3. Run the following command:
$ mkdir -p /mnt//2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/

4. Restart the nfs server, without remounting at the client.

5. At the nfs client:
$ touch /mnt//2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19

6. Now at the nfs server:
$ kill -USR1 <pid of gnfs>

7. Inspect the glusterfsdump file. Ensure that no more than one file descriptor is open. If there is more than one, there is a regression.

