Bug 155929 - Linux NFSv4 client doesn't return a delegation before removing a file
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Steve Dickson
Brian Brock
Reported: 2005-04-25 16:17 EDT by Chuck Lever
Modified: 2007-11-30 17:07 EST
Fixed In Version: RHBA-2007-0304
Doc Type: Bug Fix
Last Closed: 2007-05-01 18:53:28 EDT

Attachments
Return NFSv4 delegation during file removal (3.63 KB, patch)
2006-12-12 17:31 EST, Chuck Lever
Return NFSv4 delegation during silly rename (1.03 KB, patch)
2006-12-12 17:31 EST, Chuck Lever
Return NFSv4 delegation during rename (888 bytes, patch)
2006-12-12 17:32 EST, Chuck Lever

Description Chuck Lever 2005-04-25 16:17:05 EDT

Description of problem:
Certain workloads that remove lots of files run slowly on Linux clients using
NFSv4 and with delegation enabled on the server.  By examining a network trace,
we saw that the Linux NFSv4 client does not send a DELEG_RETURN before deleting
a file when there is an outstanding delegation on that file.  Instead, the
server delays the remove operation and issues a callback to recall the
delegation, and only then does the client send the DELEG_RETURN.  The client
redrives the REMOVE request a second later.

Customers using NFSv4 against an NFSv4 server with delegation enabled will see
terrible performance on workloads that remove lots of files after reading or
writing to them.
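
To give a rough sense of scale, here is a minimal back-of-the-envelope model of
the two paths. It is only a sketch: the one-second retry wait comes from the
trace described above, while the function name, the 0.5 ms round-trip time and
the file count are illustrative assumptions, and the cost of the server's
callback itself is ignored.

```python
def removal_time_s(num_files, rtt_s=0.0005, retry_delay_s=1.0, return_first=False):
    """Estimate wall-clock seconds needed to remove num_files delegated files.

    rtt_s and num_files are assumed values; retry_delay_s is the roughly
    one-second client retry wait seen in the network trace.
    """
    if return_first:
        # DELEG_RETURN, then REMOVE: two round trips, no recall by the server.
        per_file = 2 * rtt_s
    else:
        # REMOVE rejected with NFS4ERR_DELAY, DELEG_RETURN after the recall,
        # the retry wait, and finally the redriven REMOVE: three round trips.
        per_file = 3 * rtt_s + retry_delay_s
    return num_files * per_file


if __name__ == "__main__":
    slow = removal_time_s(10_000)
    fast = removal_time_s(10_000, return_first=True)
    print(f"recall path: {slow:.0f} s, return-first path: {fast:.0f} s")
```

Under these assumptions the retry wait dominates everything else, which is why
the slowdown scales with the number of files removed rather than with their
size.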

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Enable delegation on an NFSv4 server
2. Mount one of the server's exports from Linux client using NFSv4
3. Try a kernel build on that share

Actual Results:  Every on-the-wire remove operation results in an
NFS4ERR_DELAY and a server callback.  The client takes a second to recover
and redrive the remove operation.

Expected Results:  The client should do a DELEG_RETURN before removing a
file that has been granted a delegation.
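
The difference between the actual and expected results can be illustrated with
a toy model. This is a sketch only, not the kernel or any real server: the
class and function names are hypothetical, and the server is reduced to a set
of delegated file names plus a recall counter.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_DELAY = "NFS4ERR_DELAY"


class ToyServer:
    """Hypothetical stand-in for an NFSv4 server with delegation enabled."""

    def __init__(self):
        self.delegated = set()  # names with an outstanding delegation
        self.recalls = 0        # number of callbacks the server had to issue

    def grant(self, name):
        self.delegated.add(name)

    def deleg_return(self, name):
        self.delegated.discard(name)
        return NFS4_OK

    def remove(self, name):
        if name in self.delegated:
            # Delegation still outstanding: recall it and ask the client to retry.
            self.recalls += 1
            return NFS4ERR_DELAY
        return NFS4_OK


def remove_observed(server, name):
    """Actual results: REMOVE first, DELEG_RETURN only after the recall."""
    replies = [server.remove(name)]
    if replies[0] == NFS4ERR_DELAY:
        server.deleg_return(name)
        # The real client waits about a second here before redriving.
        replies.append(server.remove(name))
    return replies


def remove_expected(server, name):
    """Expected results: return the delegation, then REMOVE once."""
    server.deleg_return(name)
    return [server.remove(name)]


if __name__ == "__main__":
    srv = ToyServer()
    srv.grant("a.o")
    print(remove_observed(srv, "a.o"))
    srv.grant("b.o")
    print(remove_expected(srv, "b.o"))
    print("recalls issued:", srv.recalls)
```

In this model only the observed path ever triggers a recall; the expected path
completes with a single successful REMOVE reply.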

Additional info:

There is a policy question about just when the client should return a
delegation: immediately after closing the file, or as late as possible
(when the file's cached pages are reclaimed or the client deletes the
file itself).

In addition, we don't want to add a second round trip by doing the
DELEG_RETURN in a separate compound, so some cleverness will be needed to
make nfs4_proc_remove check whether it needs to do the return and build the
DELEG_RETURN op into the same compound as the REMOVE.
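
As a rough illustration of the single-compound idea, the sketch below builds
the operation list for one request. It is not the kernel code and not
necessarily how the eventual fix is structured: build_remove_compound and its
parameters are hypothetical, filehandles and stateids are plain strings, and
the op is spelled DELEGRETURN as in the NFSv4 specification. The ordering
assumes that DELEGRETURN acts on the current filehandle (the file) and REMOVE
acts on the current filehandle (the directory) plus a name.

```python
from typing import List, Optional, Tuple

Op = Tuple[str, str]  # (operation name, argument); a deliberately simplified form


def build_remove_compound(dir_fh: str, name: str,
                          file_fh: Optional[str] = None,
                          deleg_stateid: Optional[str] = None) -> List[Op]:
    """Return the ops for one compound that removes `name` from `dir_fh`.

    If the client holds a delegation (deleg_stateid is set), the return is
    folded into the same compound so no extra round trip is needed.
    """
    ops: List[Op] = []
    if deleg_stateid is not None:
        if file_fh is None:
            raise ValueError("file_fh is required to return a delegation")
        ops.append(("PUTFH", file_fh))              # current FH = the file
        ops.append(("DELEGRETURN", deleg_stateid))  # hand the delegation back
    ops.append(("PUTFH", dir_fh))                   # current FH = parent directory
    ops.append(("REMOVE", name))                    # remove the entry by name
    return ops


if __name__ == "__main__":
    for op in build_remove_compound("dirfh", "foo.o", "filefh", "stateid-1"):
        print(op)
```

When no delegation is held, the same helper degenerates to the plain
PUTFH + REMOVE pair, so the common case pays nothing extra.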

Trond knows about this issue and is working on a solution for 2.6 kernels.
Comment 1 Chuck Lever 2005-04-25 16:19:24 EDT
This is a problem for Fedora Core kernels as well.
Comment 2 Chuck Lever 2006-12-12 17:31:09 EST
Created attachment 143462 [details]
Return NFSv4 delegation during file removal
Comment 3 Chuck Lever 2006-12-12 17:31:58 EST
Created attachment 143464 [details]
Return NFSv4 delegation during silly rename
Comment 4 Chuck Lever 2006-12-12 17:32:58 EST
Created attachment 143467 [details]
Return NFSv4 delegation during rename
Comment 5 Chuck Lever 2006-12-12 17:34:42 EST
Attached three patches from mainline that implement delegation return during
file removal.
Comment 6 RHEL Product and Program Management 2006-12-20 20:44:18 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 Jay Turner 2007-01-02 08:41:46 EST
QE ack for 4.5.
Comment 8 Jason Baron 2007-01-10 14:28:58 EST
Committed in stream U5 build 42.40. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 11 Red Hat Bugzilla 2007-05-01 18:53:29 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

