Created attachment 440736 [details]
[nfs] make close(2) asynchronous when closing nfs o_direct files

For NFSv2 and v3: O_DIRECT writes are always synchronous, and aren't cached, so nothing should be flushed when closing an NFS O_DIRECT file descriptor. Thus there are no write errors to report on close(2). In addition, there's no cached data to verify on the next open(2), so we don't need clean GETATTR results at close time to compare with. Thus, there's no need for the nfs_revalidate_inode() call when closing an NFS O_DIRECT file. This reduces the number of synchronous on-the-wire requests for a simple open-write-close of an NFS O_DIRECT file by roughly 20%.

For NFSv4: Call nfs4_do_close() with wait set to zero when closing an NFS O_DIRECT file. The CLOSE will go on the wire, but the application won't wait for it to complete.
Created attachment 451873 [details] tarball of patches split out into individual files
Thanks for the patches. Note that patch #1 has a bug... it replaces a call to rpc_new_task_wq with rpc_run_task. It probably should use rpc_run_task_wq to ensure that the job is queued to nfsiod instead of rpciod.
I've forward ported the patches to more recent kernels and have test kernels on my people.redhat.com pages. Any testing of them would be appreciated: http://people.redhat.com/jlayton/
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Hi Jeff, could you give me some advice on how to reproduce/verify the bug? Thanks a lot.
Nothing specific, I'm afraid. It's not really a bug per se, more of an optimization we lack that has a significant performance impact on certain workloads. I suppose you could do some direct I/O and see whether the close call is forcing a GETATTR to happen, but that's not straightforward to determine.
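For anyone who wants to try the direct I/O approach, here is a minimal userspace sketch of one open-write-close cycle with O_DIRECT. It is not part of the patch set. The helper names (round_up, direct_write_cycle) and the 4096-byte alignment are assumptions made for illustration; the alignment a given mount actually requires may differ.

```c
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on Linux */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Round len up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: one open-write-close cycle with O_DIRECT on path.
 * Returns 0 on success or a negative errno value on failure.
 */
static int direct_write_cycle(const char *path, size_t len)
{
    const size_t align = 4096;      /* assumed alignment, not from the patch */
    size_t io_len = round_up(len ? len : 1, align);
    void *buf = NULL;
    ssize_t n;
    int fd;
    int ret = 0;

    /* O_DIRECT generally needs an aligned buffer and transfer size. */
    if (posix_memalign(&buf, align, io_len) != 0)
        return -ENOMEM;
    memset(buf, 'x', io_len);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0) {
        ret = -errno;
        free(buf);
        return ret;
    }

    n = write(fd, buf, io_len);
    if (n < 0)
        ret = -errno;
    else if ((size_t)n != io_len)
        ret = -EIO;                 /* treat a short write as an error */

    /* This close(2) is the call the patches aim to make cheaper. */
    if (close(fd) < 0 && ret == 0)
        ret = -errno;

    free(buf);
    return ret;
}
```

Calling direct_write_cycle() in a loop against a file on an NFS mount, and comparing client-side operation counts (for example from nfsstat -c or a packet capture) before and after, should give a rough idea of whether each close still triggers a GETATTR. Other activity on the mount will skew the counts, so treat the result as indicative only.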
Maybe we can only do a code review.
Patch(es) available in kernel-2.6.18-255.el5. You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5. Detailed testing feedback is always welcome.
Did a code review and verified that the patches are applied in kernel-2.6.18-262.el5, and did some regression testing against O_DIRECT: https://tcms.engineering.redhat.com/run/21467/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-1065.html