Bug 678026

Summary: Bug fixes to the 2.6.37 NFS Client [rhel-6.0.z]
Product: Red Hat Enterprise Linux 6 Reporter: RHEL Program Management <pm-rhel>
Component: kernel Assignee: Frantisek Hrbata <fhrbata>
Status: CLOSED WONTFIX QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.1 CC: arozansk, bugproxy, dhoward, dtian, jjarvis, jpallich, kzhang, pm-eus, qcai, rwheeler, sforsber, steved, tpnoonan, tsmetana, yanwang
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-09-20 12:40:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 662782    
Bug Blocks:    
Attachments:
Description Flags
generate stable writes with O_SYNC none

Description RHEL Program Management 2011-02-16 14:32:33 UTC
This bug has been copied from bug #662782 and has been proposed
to be backported to 6.0 z-stream (EUS).

Comment 4 John Jarvis 2011-02-25 18:06:31 UTC
Copying IBM's comments from BZ 662782.  I will have them reverse mirror this one so they can comment directly on this one.



IBM Bug Proxy 2011-02-25 13:00:45 EST

------- Comment From pbadari.com 2011-02-25 12:52 EDT-------
I would like to suggest that instead of considering all NFS changes that are
targeted for RHEL6.1,
let's focus on the specific fixes we need for the RHEL6.0 z-stream.

I will provide a list of the fixes we need for z-stream:

1) A fix similar to the one described in bug 67632 (for the RHEL5 series):

@@ -709,7 +709,7 @@ static ssize_t nfs_direct_write(struct k
                return -ENOMEM;
        nfs_alloc_commit_data(dreq);

-       if (dreq->commit_data == NULL || count < wsize)
+       if (dreq->commit_data == NULL || count <= wsize)
                sync = FLUSH_STABLE;

        dreq->inode = inode;

2) We may have another one to improve O_SYNC performance. Let me look at that
as well.
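
To make the boundary in item 1 concrete, here is a minimal standalone sketch of the cutoff test before and after the one-character change. The function names and the have_commit_data parameter are hypothetical stand-ins for the dreq->commit_data check; this is not the kernel code itself.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the cutoff test in nfs_direct_write().
 * Each returns true when the write would be sent as a STABLE write
 * (sync = FLUSH_STABLE in the diff above) instead of UNSTABLE + COMMIT.
 */

/* Before the fix: a write of exactly wsize bytes is NOT sent STABLE. */
bool use_stable_old(bool have_commit_data, size_t count, size_t wsize)
{
    return !have_commit_data || count < wsize;
}

/* After the fix: a write of exactly wsize bytes IS sent STABLE. */
bool use_stable_new(bool have_commit_data, size_t count, size_t wsize)
{
    return !have_commit_data || count <= wsize;
}
```

The two versions differ only when count equals wsize, that is, when the whole request fits in a single WRITE. In that case the old test still falls through to the UNSTABLE path and needs a separate COMMIT.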

Comment 5 Timothy Noonan 2011-02-25 20:34:55 UTC
LTC bz 67632 is rhbz 643441/677172 (LTC bz 67632 - RIT00404862 - pure NFS client
performance using O_DIRECT).

Comment 6 IBM Bug Proxy 2011-02-25 21:50:49 UTC
------- Comment From pbadari.com 2011-02-25 16:42 EDT-------
We have 2 use-case scenario fixes here:

1) O_DIRECT: the simple fix above, which adjusts where the STABLE -> UNSTABLE cutoff happens.

2) O_SYNC: There are a lot of changes in how O_SYNC is handled between RHEL5 and RHEL6.
One major change is that O_SYNC no longer generates STABLE writes. It always generates
UNSTABLE writes followed by a COMMIT. This is causing significant performance degradation
with our backend SoNAS/GPFS (since it has a larger block size).

Here is the proposed fix (currently under validation) to generate STABLE writes as
before (RHEL5) for writes up to ->wsize.
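
As a rough illustration of why this matters for small O_SYNC writes, the sketch below counts RPCs per write() under the two behaviors described above. It is a simplified model, not the patch: the function names are hypothetical, wsize is assumed to be non-zero, and treating a write of exactly wsize bytes as STABLE is an assumption made here.

```c
#include <stddef.h>

/* Number of WRITE RPCs needed to send count bytes in wsize-sized pieces.
 * Assumes wsize > 0. */
size_t write_rpcs(size_t count, size_t wsize)
{
    return (count + wsize - 1) / wsize;
}

/* Behavior described for RHEL6: always UNSTABLE writes plus one COMMIT. */
size_t rpcs_unstable_then_commit(size_t count, size_t wsize)
{
    return write_rpcs(count, wsize) + 1;
}

/* Proposed behavior (modeled): a write that fits in wsize goes out as a
 * single STABLE write with no COMMIT; larger writes are unchanged.
 * The inclusive boundary (count <= wsize) is an assumption. */
size_t rpcs_stable_up_to_wsize(size_t count, size_t wsize)
{
    if (count <= wsize)
        return write_rpcs(count, wsize);
    return write_rpcs(count, wsize) + 1;
}
```

In this model a 4 KiB O_SYNC write with a 64 KiB wsize costs two round trips (WRITE plus COMMIT) under the RHEL6 behavior and one under the proposed behavior. Writes larger than wsize cost the same either way.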

Comment 7 IBM Bug Proxy 2011-02-25 21:50:55 UTC
Created attachment 481091 [details]
generate stable writes with O_SYNC


------- Comment (attachment only) From pbadari.com 2011-02-25 16:42 EDT-------

Comment 8 IBM Bug Proxy 2011-02-26 16:30:58 UTC
------- Comment From tpnoonan.com 2011-02-26 11:25 EDT-------
can these 2 fixes go into rhel6.0.z with kernel freeze 3/2/11? thanks

Comment 10 Steve Dickson 2011-02-28 14:55:46 UTC
(In reply to comment #8)
> ------- Comment From tpnoonan.com 2011-02-26 11:25 EDT-------
> can these 2 fixes go into rhel6.0.z with kernel freeze 3/2/11? thanks

Question: has the requested patch been posted upstream?

Also, has there been any performance testing to make sure
this does not cause a regression with non-O_SYNC writes?

Comment 11 IBM Bug Proxy 2011-03-01 21:10:59 UTC
------- Comment From pbadari.com 2011-03-01 16:07 EDT-------
1) The O_DIRECT patch is well tested and is upstream.

2) The O_SYNC problem was found recently. Neil Brown proposed this patch, and it is still being
worked on upstream: https://patchwork.kernel.org/patch/565831/

Comment 12 IBM Bug Proxy 2011-03-01 21:20:38 UTC
------- Comment From ffilz.com 2011-03-01 16:13 EDT-------
Another patch that should be considered is:

Bug 68522  -  RIT1613663- NFS client has troubles with fileid with bit 31 (or bit 63) set

Comment 13 IBM Bug Proxy 2011-03-02 16:01:30 UTC
------- Comment From ffilz.com 2011-03-02 10:59 EDT-------
The impact of the bit 31/63 bug is that 32-bit clients will get unexpected failures if the NFS server has inode numbers with bit 31 or bit 63 set. The compatibility code tries to fix up these inode numbers to fit into 32 bits, but then causes the resulting 32-bit quantity to be sign-extended to 64 bits, which of course then doesn't fit in 32 bits.

If I remember correctly, this issue will arise even with 32-bit inode numbers when bit 31 is set.
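
The sign-extension effect described above can be reproduced in a small standalone sketch. The function names are hypothetical, and folding the upper half into the lower half with XOR is an assumption about how the compatibility code squeezes the number into 32 bits; only the signed-versus-unsigned intermediate matters for the point being made.

```c
#include <stdint.h>

/*
 * Buggy variant: the folded value is held in a signed 32-bit variable.
 * When bit 31 of the folded value is set, widening back to 64 bits
 * sign-extends it and sets the upper 32 bits.
 * (Converting an out-of-range value to int32_t is implementation-defined
 * in C; common compilers wrap in two's complement.)
 */
uint64_t fold_fileid_signed(uint64_t fileid)
{
    int32_t ino = (int32_t)(uint32_t)(fileid ^ (fileid >> 32));
    return (uint64_t)(int64_t)ino;
}

/* Corrected variant: an unsigned intermediate is zero-extended instead. */
uint64_t fold_fileid_unsigned(uint64_t fileid)
{
    uint32_t ino = (uint32_t)(fileid ^ (fileid >> 32));
    return ino;
}
```

With a fileid of 0x80000000 (bit 31 set, upper half zero), the signed variant returns 0xFFFFFFFF80000000, which no longer fits in 32 bits, while the unsigned variant returns 0x80000000. A fileid with only bit 63 set folds to the same 32-bit value and hits the same problem.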