Bug 517162

Summary: cthon test5 failing on nfsv4 with rhel6 client vs. rhel4 server
Product: Red Hat Enterprise Linux 4
Reporter: Mike Gahagan <mgahagan>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED ERRATA
QA Contact: Mike Gahagan <mgahagan>
Severity: medium
Docs Contact:
Priority: urgent
Version: 4.8
CC: dhoward, jburke, jlayton, pbunyan, steved, syeghiay
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 573995
Environment:
Last Closed: 2011-02-16 15:43:13 UTC
Type: ---
Bug Depends On:    
Bug Blocks: 597314    
Attachments:
Description: patch -- 2 patches to fix zero-stateid handling in nfsd setattr
Flags: none

Description Mike Gahagan 2009-08-12 19:21:02 UTC
Description of problem:
The RHTS test /kernel/filesystems/nfs/connectathon/rhel4-nfs/nfsvers=4/base fails
against the RHEL 6 alpha 1 kernel. Apparently it gets into a state where it cannot write one of the required files.

Version-Release number of selected component (if applicable):
RHEL6.0-20090709.0
kernel 2.6.29.4-1.el6

How reproducible:
always

Steps to Reproduce:
1. Queue up /kernel/filesystems/nfs/connectathon/ in RHTS.
2. Observe that /kernel/filesystems/nfs/connectathon/rhel4-nfs/nfsvers=4/base fails.
  
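Actual results:
test5 fails with "can't create 'bigfile' : Input/output error" (see the test log under Additional info).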

Expected results:
all tests pass

Additional info:
For logs, packet captures, etc., see:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=9174502

test log:
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
===== Starting 'nfsvers=4' test 'base' =====
----- Server load rhel4-nfs                up  10 days,  9:10,    load average: 1.90 2.42 1.36 -----
----- start: Tue Jul 21 11:07:30 EDT 2009 -----
./server -b -F nfs4 -p /export/home rhel4-nfs
Start tests on path /mnt/rhel4-nfs/dell-pe800-01.test [y/n]? 
sh ./runtests  -b -t /mnt/rhel4-nfs/dell-pe800-01.test

Starting BASIC tests: test directory /mnt/rhel4-nfs/dell-pe800-01.test (arg: -t)

./test1: File and directory creation test
	created 155 files 62 directories 5 levels deep in 0.88 seconds
	./test1 ok.

./test2: File and directory removal test
	removed 155 files 62 directories 5 levels deep in 0.46 seconds
	./test2 ok.

./test3: lookups across mount point
	500 getcwd and stat calls in 0.0  seconds
	./test3 ok.

./test4: setattr, getattr, and lookup
	1000 chmods and stats on 10 files in 2.5  seconds
	./test4 ok.

./test5: read and write
	./test5: (/mnt/rhel4-nfs/dell-pe800-01.test) can't create 'bigfile' : Input/output error
basic tests failed
Tests failed, leaving /mnt/rhel4-nfs mounted

Comment 1 RHEL Program Management 2009-08-12 19:33:23 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 7 Jeff Layton 2010-03-15 16:15:37 UTC
*** Bug 567333 has been marked as a duplicate of this bug. ***

Comment 8 Jeff Layton 2010-03-15 16:31:37 UTC
On the open(), the RHEL6 client does a SETATTR call to take care of the O_TRUNC open flag. When it does this, however, it sends a zero stateid, even though it has the file open and has gotten a valid stateid back.

I suspect that RHEL4 is handling this case incorrectly, but I'll need to verify it against the spec.
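
For reference, the open pattern described above can be approximated from userspace. This is a minimal sketch, not the cthon test5 source: the helper name open_truncated is hypothetical, and it assumes the failure is triggered by opening an already existing file with O_TRUNC.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure the file exists and has some data,
 * then reopen it with O_TRUNC so the client has to truncate it.
 * Returns 0 on success or a negative errno value on failure.
 */
int open_truncated(const char *path)
{
    static const char data[] = "payload";
    int fd;

    /* First open: create the file and give it a non-zero size. */
    fd = open(path, O_WRONLY | O_CREAT, 0666);
    if (fd < 0)
        return -errno;
    if (write(fd, data, sizeof(data)) < 0) {
        int err = errno;
        close(fd);
        return -err;
    }
    close(fd);

    /* Second open: O_TRUNC on the existing file. */
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -errno;
    close(fd);
    return 0;
}
```

Called with a path on the NFSv4 mount, a return value of -EIO would correspond to the "Input/output error" seen in the test5 log; on a local filesystem it should simply return 0.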

Comment 9 Jeff Layton 2010-03-15 17:10:11 UTC
Created attachment 400268 [details]
patch -- 2 patches to fix zero-stateid handling in nfsd setattr

These two (really old) patches seem to fix the problem for me. They change the code to handle the zero-stateid differently. I need to look over them a bit more carefully but they seem sane.
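
For readers unfamiliar with the term: in NFSv4 (RFC 3530) a stateid is a 32-bit seqid followed by 12 opaque bytes, and the values with all bits zero or all bits one are reserved as special stateids. The sketch below only illustrates how such values can be recognized. The struct and function names are hypothetical, and this is not the code from the attached patches.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Layout follows stateid4 in RFC 3530; the type name is illustrative. */
struct stateid4_sketch {
    uint32_t seqid;
    unsigned char other[12];
};

/* True when every bit of the stateid is zero (the "zero stateid"). */
bool is_zero_stateid(const struct stateid4_sketch *sid)
{
    static const unsigned char zeros[12]; /* zero-initialized */

    return sid->seqid == 0 &&
           memcmp(sid->other, zeros, sizeof(zeros)) == 0;
}

/* True when every bit of the stateid is one (the other special value). */
bool is_one_stateid(const struct stateid4_sketch *sid)
{
    unsigned char ones[12];

    memset(ones, 0xff, sizeof(ones));
    return sid->seqid == UINT32_MAX &&
           memcmp(sid->other, ones, sizeof(ones)) == 0;
}
```

A server-side SETATTR handler would presumably branch on a check like is_zero_stateid() before looking up open state, but how the patches actually do this is not shown here.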

Comment 10 Jeff Layton 2010-03-15 17:10:54 UTC
It's also not clear to me whether rhel6 is doing the right thing. The file has been opened, so it seems like it should be using the stateid we got from that.

Comment 11 Jeff Layton 2010-03-15 19:39:16 UTC
It's tough to tell from the spec whether RHEL6 is doing the right thing. I think that it's technically *not*, but most servers apparently don't really care.

Fixing this in RHEL6 will be more of a VFS patch though. We'd probably have to fix may_open to take an "nd" arg, and then have it check to see whether there's an already instantiated filp in the open intent stuff, and then have it pass that to do_truncate.

It's probably the right thing to do, but really isn't high priority.

Comment 12 Jeff Layton 2010-03-15 22:12:18 UTC
For giggles, I went ahead and sent a patch upstream to fix this behavior in the client too. It turns out to be a pretty small patch there, but will probably be a bit larger for RHEL6.

http://marc.info/?l=linux-kernel&m=126868917104621&w=2

...if this ends up flying upstream, I'll clone this BZ for RHEL6 too and we'll fix it there as well.

Comment 13 Jeff Layton 2010-03-19 19:40:31 UTC
RHEL6 bug 573995.

Comment 18 Jeff Layton 2010-03-29 12:49:20 UTC
*** Bug 486768 has been marked as a duplicate of this bug. ***

Comment 20 RHEL Program Management 2010-03-29 13:15:03 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 22 Vivek Goyal 2010-05-27 15:02:23 UTC
Committed in 89.26.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 30 errata-xmlrpc 2011-02-16 15:43:13 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html