Bug 626977

Summary: [nfs] make close(2) asynchronous when closing nfs o_direct files
Product: Red Hat Enterprise Linux 5 Reporter: Herbert van den Bergh <herbert.van.den.bergh>
Component: kernel Assignee: Jeff Layton <jlayton>
Status: CLOSED ERRATA QA Contact: yanfu,wang <yanwang>
Severity: high Docs Contact:
Priority: low    
Version: 5.5 CC: bfields, chuck.lever, herbert.van.den.bergh, jlayton, qcai, sardella, steved
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-07-21 09:47:34 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Attachments:
Description                                                        Flags
[nfs] make close(2) asynchronous when closing nfs o_direct files   none
tarball of patches split out into individual files                 none

Description Herbert van den Bergh 2010-08-24 19:12:08 UTC
Created attachment 440736 [details]
[nfs] make close(2) asynchronous when closing nfs o_direct files

For NFSv2 and v3:

    O_DIRECT writes are always synchronous, and aren't cached, so nothing
    should be flushed when closing an NFS O_DIRECT file descriptor.  Thus
    there are no write errors to report on close(2).

    In addition, there's no cached data to verify on the next open(2),
    so we don't need clean GETATTR results at close time to compare with.

    Thus, there's no need for the nfs_revalidate_inode() call when closing
    an NFS O_DIRECT file.  This reduces the number of synchronous
    on-the-wire requests for a simple open-write-close of an NFS O_DIRECT
    file by roughly 20%.

For NFSv4:

    Call nfs4_do_close() with wait set to zero when closing an NFS
    O_DIRECT file.  The CLOSE will go on the wire, but the application
    won't wait for it to complete.
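
For reference, the open-write-close pattern discussed above can be sketched from userspace roughly as follows. This is only an illustration and not part of the patch: the helper name direct_write_close() and the 4096-byte alignment (DIO_ALIGN) are assumptions made for the example, and the alignment actually required for O_DIRECT depends on the filesystem. The helper times only the close(2) call, which is where the change should be visible on an NFS mount.

```c
#define _GNU_SOURCE             /* needed for O_DIRECT with glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define DIO_ALIGN 4096          /* assumed buffer/length alignment */

/*
 * Hypothetical helper: open `path` with O_DIRECT, write `len` bytes,
 * close it, and report the wall-clock time spent in close(2).
 * Returns 0 on success or a negative errno value on failure.
 */
int direct_write_close(const char *path, size_t len, double *close_secs)
{
    struct timespec t0, t1;
    void *buf = NULL;
    int fd, err;

    assert(close_secs != NULL);

    /* O_DIRECT transfers generally need an aligned length and buffer. */
    if (len == 0 || len % DIO_ALIGN != 0)
        return -EINVAL;

    err = posix_memalign(&buf, DIO_ALIGN, len);
    if (err != 0)
        return -err;
    memset(buf, 'x', len);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0) {
        err = errno;
        free(buf);
        return -err;
    }

    ssize_t n = write(fd, buf, len);
    if (n != (ssize_t)len) {
        err = (n < 0) ? errno : EIO;    /* treat a short write as an error */
        close(fd);
        free(buf);
        return -err;
    }

    /* Time only the close(2) call. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    err = (close(fd) < 0) ? errno : 0;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    free(buf);
    if (err != 0)
        return -err;

    *close_secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return 0;
}
```

Calling this in a loop against a file on an NFS mount, before and after applying the patches, should give a rough idea of how much time close(2) accounts for. On filesystems that do not support O_DIRECT, the open is expected to fail with EINVAL.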

Comment 2 Jeff Layton 2010-10-06 11:59:27 UTC
Created attachment 451873 [details]
tarball of patches split out into individual files

Comment 3 Jeff Layton 2010-10-06 13:37:38 UTC
Thanks for the patches.

Note that patch #1 has a bug: it replaces a call to rpc_new_task_wq with rpc_run_task. It should probably use rpc_run_task_wq so that the job is queued to nfsiod instead of rpciod.

Comment 4 Jeff Layton 2010-10-07 13:30:35 UTC
I've forward-ported the patches to more recent kernels and put test kernels on my people.redhat.com page. Any testing of them would be appreciated:

http://people.redhat.com/jlayton/

Comment 6 RHEL Program Management 2011-02-01 16:52:48 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 yanfu,wang 2011-02-15 06:30:12 UTC
Hi Jeff,
Could you give me some advice on how to reproduce/verify the bug? Thanks a lot.

Comment 9 Jeff Layton 2011-02-15 12:05:00 UTC
Nothing specific, I'm afraid. It's not really a bug per se, more of a missing optimization that has a significant performance impact on certain workloads.

I suppose you could do some direct I/O and see whether the close call is forcing a GETATTR to happen, but that's not straightforward to determine.
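
One possible way to approach that check, sketched here under some assumptions: sample the client's per-operation NFS counters before and after a batch of direct I/O open-write-close cycles and compare the GETATTR count. The helper below assumes the text comes from /proc/self/mountstats and that per-op lines look like "GETATTR: <ops> <transmissions> ..."; the function name nfs_op_count() is made up for this example. It returns the count from the first matching line only, so with several NFS mounts the caller would need to pass just the section for the mount under test. Other activity on the mount also bumps these counters, so the result is only indicative.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the operation count for `op`
 * (e.g. "GETATTR") from the first line in `stats` of the assumed
 * form "<op>: <ops> ...", or -1 if no such line is found.
 */
long nfs_op_count(const char *stats, const char *op)
{
    size_t oplen = strlen(op);
    const char *p = stats;

    while ((p = strstr(p, op)) != NULL) {
        /* Accept only a whole token: preceded by whitespace (or the
         * start of the text) and immediately followed by ':'. */
        int at_start = (p == stats) || p[-1] == ' ' ||
                       p[-1] == '\t' || p[-1] == '\n';

        if (at_start && p[oplen] == ':') {
            const char *num = p + oplen + 1;
            char *end;
            long n = strtol(num, &end, 10);

            return (end == num) ? -1 : n;   /* -1 if no digits follow */
        }
        p += oplen;
    }
    return -1;
}
```

If the GETATTR count grows by roughly one per close on an unpatched kernel and stays flat on a patched one for the same workload, that would suggest the close-time revalidation is gone.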

Comment 10 yanfu,wang 2011-02-16 07:45:23 UTC
Maybe we can only do a code review.

Comment 12 Jarod Wilson 2011-04-04 21:58:05 UTC
Patch(es) available in kernel-2.6.18-255.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Comment 14 yanfu,wang 2011-05-26 03:32:23 UTC
Did a code review and verified that the patches are applied in kernel-2.6.18-262.el5, and ran some regression tests against O_DIRECT:
https://tcms.engineering.redhat.com/run/21467/

Comment 15 errata-xmlrpc 2011-07-21 09:47:34 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html