Bug 155929 - Linux NFSv4 client doesn't return a delegation before removing a file
Summary: Linux NFSv4 client doesn't return a delegation before removing a file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Steve Dickson
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-04-25 20:17 UTC by Chuck Lever
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHBA-2007-0304
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-05-01 22:53:28 UTC
Target Upstream Version:
Embargoed:


Attachments
Return NFSv4 delegation during file removal (3.63 KB, patch)
2006-12-12 22:31 UTC, Chuck Lever
Return NFSv4 delegation during silly rename (1.03 KB, patch)
2006-12-12 22:31 UTC, Chuck Lever
Return NFSv4 delegation during rename (888 bytes, patch)
2006-12-12 22:32 UTC, Chuck Lever


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2007:0304 | 0 | normal | SHIPPED_LIVE | Updated kernel packages available for Red Hat Enterprise Linux 4 Update 5 | 2007-04-28 18:58:50 UTC

Description Chuck Lever 2005-04-25 20:17:05 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2

Description of problem:
Certain workloads that remove lots of files run slowly on Linux clients using
NFSv4 when delegation is enabled on the server.  A network trace shows that
the Linux NFSv4 client does not send a DELEG_RETURN before deleting a file on
which it holds an outstanding delegation.  Instead, the server delays the
remove operation and does a callback to recall the delegation, and only then
does the client send the DELEG_RETURN.  The client redrives the REMOVE request
a second later.

Customers using NFSv4 against an NFSv4 server with delegation enabled will see
terrible performance on workloads that remove lots of files after reading or
writing to them.
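
The exchange described above can be modeled in a few lines of C. This is only
a sketch of the behavior seen in the trace, not kernel code: the structs and
function names (server_file, stats, server_remove, client_delegreturn,
client_remove) are hypothetical, and the one-second retry wait is an assumption
taken from the observation above. The status values are the ones defined in
RFC 3530.

```c
#include <assert.h>
#include <stdbool.h>

/* Status codes as defined in RFC 3530. */
enum { NFS4_OK = 0, NFS4ERR_DELAY = 10008 };

/* Hypothetical model of the server's view of one file. */
struct server_file {
    bool delegated;            /* a delegation is outstanding on this file */
};

/* Counters for what ends up on the wire. */
struct stats {
    unsigned rpcs;             /* client-to-server requests sent */
    unsigned callbacks;        /* server-to-client recalls */
    unsigned wait_seconds;     /* time the client spends waiting to retry */
};

/* Server side: refuse the REMOVE and recall the delegation if one is held. */
static int server_remove(struct server_file *f, struct stats *s)
{
    s->rpcs++;
    if (f->delegated) {
        s->callbacks++;
        return NFS4ERR_DELAY;
    }
    return NFS4_OK;
}

/* Client side: return the delegation in its own request. */
static void client_delegreturn(struct server_file *f, struct stats *s)
{
    s->rpcs++;
    f->delegated = false;
}

/* Client behavior as reported: send REMOVE first, recover only after a delay. */
static int client_remove(struct server_file *f, struct stats *s)
{
    int status = server_remove(f, s);

    assert(status == NFS4_OK || status == NFS4ERR_DELAY);
    while (status == NFS4ERR_DELAY) {
        client_delegreturn(f, s);   /* triggered by the server's recall */
        s->wait_seconds += 1;       /* assumed one-second retry wait */
        status = server_remove(f, s);
    }
    return status;
}
```

In this model, calling client_remove on a delegated file costs three requests,
one callback and one second of waiting, where a file without a delegation
costs a single request and no wait.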

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Enable delegation on an NFSv4 server
2. Mount one of the server's exports from Linux client using NFSv4
3. Try a kernel build on that share
  

Actual Results:  Every on-the-wire remove of a file with an outstanding delegation results in
an NFS4ERR_DELAY and a server callback.  The client takes a second to recover and redrive the
remove operation.

Expected Results:  The client should do a DELEG_RETURN before removing a file on which it
holds a delegation.

Additional info:

There is a policy question about just when the client should return a
delegation: just after closing the file, or as late as possible (when the
file's cached pages are reclaimed or the client deletes the file itself).

In addition, we don't want to add a second round trip by doing the
DELEG_RETURN in a separate compound, so some cleverness will be needed to
make nfs4_proc_remove check whether it needs to do the return and, if so,
build the DELEG_RETURN op into the same compound as the REMOVE.
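
As a rough illustration of the single-compound idea, the sketch below builds
the list of operations for a remove and puts the delegation return in front of
the REMOVE only when a delegation is held. It is a simplified model under
stated assumptions, not the nfs4_proc_remove implementation and not what the
eventual patches do: the types and helpers (compound, file_state, add_op,
build_remove) are hypothetical. The op ordering assumes the RFC 3530 rules
that DELEGRETURN (written DELEG_RETURN above) acts on the file's filehandle
and REMOVE acts on the parent directory's filehandle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical op list; real compounds also carry arguments for each op. */
enum op { OP_PUTFH, OP_DELEGRETURN, OP_REMOVE };

#define MAX_OPS 8

struct compound {
    enum op ops[MAX_OPS];
    size_t nops;
};

/* Hypothetical client-side state for the file being removed. */
struct file_state {
    bool has_delegation;   /* the client holds a delegation on the file */
};

static void add_op(struct compound *c, enum op op)
{
    assert(c->nops < MAX_OPS);
    c->ops[c->nops++] = op;
}

/* Build one compound that returns the delegation (if any) and then removes. */
static void build_remove(struct compound *c, const struct file_state *f)
{
    c->nops = 0;
    if (f->has_delegation) {
        add_op(c, OP_PUTFH);         /* current filehandle = the file */
        add_op(c, OP_DELEGRETURN);   /* give the delegation back first */
    }
    add_op(c, OP_PUTFH);             /* current filehandle = parent directory */
    add_op(c, OP_REMOVE);            /* remove the name from the directory */
}
```

With a delegation held, build_remove produces four ops in one request; without
one, it produces the usual two. Either way the client makes a single round
trip.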

Trond knows about this issue and is working on a solution for 2.6 kernels.

Comment 1 Chuck Lever 2005-04-25 20:19:24 UTC
This is a problem for Fedora Core kernels as well.

Comment 2 Chuck Lever 2006-12-12 22:31:09 UTC
Created attachment 143462
Return NFSv4 delegation during file removal

Comment 3 Chuck Lever 2006-12-12 22:31:58 UTC
Created attachment 143464
Return NFSv4 delegation during silly rename

Comment 4 Chuck Lever 2006-12-12 22:32:58 UTC
Created attachment 143467
Return NFSv4 delegation during rename

Comment 5 Chuck Lever 2006-12-12 22:34:42 UTC
Attached three patches from mainline that implement delegation return during
file removal.

Comment 6 RHEL Program Management 2006-12-21 01:44:18 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 7 Jay Turner 2007-01-02 13:41:46 UTC
QE ack for 4.5.

Comment 8 Jason Baron 2007-01-10 19:28:58 UTC
Committed in stream U5, build 42.40. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 11 Red Hat Bugzilla 2007-05-01 22:53:29 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html


