Bug 227610

Summary: READDIR on an NFSv4 directory containing a referral returns -EIO for entire directory
Product: Red Hat Enterprise Linux 4 Reporter: Jeff Layton <jlayton>
Component: kernel Assignee: Jeff Layton <jlayton>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.4 CC: staubach, steved
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHSA-2008-0665 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-07-24 19:12:45 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 430698, 439431    
Attachments:
Description                                                              Flags
patch -- Make READDIR use mounted_on_fileid rather than regular fileid   none
patch 2 - set and handle fattr_rdattr_error attribute                    none

Description Jeff Layton 2007-02-07 01:54:30 UTC
When an NFSv4 client requests a READDIR and the server hits an error while
getting attributes for a directory entry, the server will return the error for
the entire READDIR call. This isn't optimal, so we can and should request the
"fattr4_rdattr_error" attribute in the READDIR call to specify that the server
should continue on and only report the error on the problem directory entries.
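
As a rough illustration of what requesting the attribute means on the wire, the
sketch below packs attribute numbers into the 32-bit bitmap words that a READDIR
request carries. The attribute numbers are the ones defined in the NFSv4
specification (RFC 3530); the helper name attr_bitmap is hypothetical and is not
taken from the kernel source or from either patch.

```python
# Attribute numbers as defined in the NFSv4 specification (RFC 3530).
FATTR4_RDATTR_ERROR = 11
FATTR4_FILEID = 20
FATTR4_MOUNTED_ON_FILEID = 55


def attr_bitmap(attrs):
    """Pack attribute numbers into a list of 32-bit bitmap words.

    Hypothetical helper for illustration only: attribute N lives in
    word N // 32 at bit position N % 32.
    """
    attrs = list(attrs)
    nwords = max(attrs) // 32 + 1 if attrs else 0
    words = [0] * nwords
    for attr in attrs:
        words[attr // 32] |= 1 << (attr % 32)
    return words


# Ask for per-entry errors plus the mounted-on fileid.
words = attr_bitmap([FATTR4_RDATTR_ERROR, FATTR4_MOUNTED_ON_FILEID])
print([hex(w) for w in words])  # ['0x800', '0x800000']
```

With rdattr_error set in the request, a server that cannot fetch attributes for
one entry can report the failure in that entry's attributes instead of failing
the whole call.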

Comment 1 Jeff Layton 2007-02-07 01:58:03 UTC
Created attachment 147529
patch -- Make READDIR use mounted_on_fileid rather than regular fileid

This patch makes the next one apply cleanly (and looks like it also fixes
another potential bug).

Comment 2 Jeff Layton 2007-02-07 02:00:29 UTC
Created attachment 147530
patch 2 - set and handle fattr_rdattr_error attribute 

This patch fixes the problem by allowing the server to return a more granular
mix of valid results and errors.

Comment 3 Steve Dickson 2007-02-21 14:45:49 UTC
Was there a reproducer for this problem?

Comment 4 Jeff Layton 2007-02-21 15:09:26 UTC
Now that I think we have a resolution on bz228893, I think I might be able to
get one.

The way I reproduced it at connectathon was mounting an nfs4 directory that
contained a referral export within it (hopefully I have the terminology correct
here). When trying to do a readdirplus on the directory that contained the
referral, the client would get an error back on the entire directory because
rdattr_error wasn't set in the request.

I was hoping I might be able to set up a similar situation by having a v4 root
dir that exports a mix of directories to krb5 and krb5i exclusively, but haven't
had time to reproduce it as of yet.


Comment 5 Jeff Layton 2007-04-20 20:40:05 UTC
To reproduce, you'll need a Fedora 7 server (the server-side referral bits
aren't yet in RHEL5):

# mkdir /export
# mkdir /export/fsloc
# mount --bind /export/fsloc /export/fsloc
# cat /etc/exports
/export         *(rw,nohide,insecure,fsid=0)
/export/fsloc   *(ro,nohide,insecure,refer=/foo.10.10)
# service nfs start

(server path and address don't really matter here since RHEL4 can't chase the
referral anyway)...

On the client:

# mkdir -p /mnt/referral
# mount -t nfs4 server:/ /mnt/referral
# ls -l /mnt/referral
# ls -l /mnt/referral
ls: reading directory /mnt/referral: Input/output error
total 0

...you'll also get this in the ring buffer:

nfs4_map_errors could not handle NFSv4 error 10019

...the expectation is that you'll be able to list the contents of the directory,
though the READDIRPLUS entry for the referral will come back with an error.
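
For reference, NFSv4 error 10019 is NFS4ERR_MOVED in RFC 3530, which is what a
server returns for a file system that lives elsewhere (a referral). The sketch
below is a simplified model of why that shows up as "Input/output error" on the
client: it assumes, based on the log message above, that any NFSv4-specific
code without a local errno equivalent falls back to -EIO. The function name and
the threshold of 1000 are assumptions for illustration, not a copy of the
kernel's nfs4_map_errors.

```python
import errno

NFS4ERR_MOVED = 10019  # RFC 3530: the file system has moved (referral)


def map_nfs4_error(err):
    """Simplified model of the client's error mapping (assumption).

    Negative values of small magnitude are treated as ordinary errnos
    and passed through; NFSv4-specific codes collapse to -EIO.
    """
    if err < -1000:
        return -errno.EIO
    return err


assert map_nfs4_error(-NFS4ERR_MOVED) == -errno.EIO
assert map_nfs4_error(-errno.ENOENT) == -errno.ENOENT
```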


Comment 6 Jeff Layton 2007-04-23 19:12:56 UTC
Actually, given my reproducer, only the first patch here is needed to fix this.
I'll go ahead and propose that and hold off on the other one since I don't seem
to have a situation that actually requires rdattr_error.


Comment 7 Jeff Layton 2007-04-23 19:34:14 UTC
Expected results (with an empty file 

# ls -al /mnt/referral
total 16
drwxr-xr-x   3 root root 4096 Apr 23 15:32 .
drwxr-xr-x  12 root root 4096 Apr 20 11:58 ..
?---------   ? ?    ?       ?            ? fsloc

...the directory entries are listable, but attempting to stat fsloc gives back a
-EIO. I think this is the best we can do for RHEL4. Backporting the
referral-chasing code is probably too much.
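
A quick way to check for this behaviour from userspace is to list the directory
and stat each entry separately, tolerating per-entry failures the way ls does.
This is only a sketch; the function name list_with_stat is hypothetical and the
mount point in the usage note is the one from the reproducer.

```python
import os


def list_with_stat(path):
    """Return (name, st_mode or None) for each entry in path.

    A mode of None means lstat() failed for that entry (for example
    with EIO), while the directory listing itself still succeeded.
    """
    results = []
    for name in sorted(os.listdir(path)):
        try:
            mode = os.lstat(os.path.join(path, name)).st_mode
        except OSError:
            mode = None
        results.append((name, mode))
    return results
```

On a patched client, calling list_with_stat("/mnt/referral") would be expected
to return normally, with None as the mode for fsloc.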


Comment 9 RHEL Program Management 2007-04-23 19:45:46 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 10 RHEL Program Management 2007-04-23 20:02:07 UTC
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.

Comment 11 Jeff Layton 2007-09-05 21:41:26 UTC
Moving to 4.7. This patch was less critical, and we already had a lot of NFS
patches for 4.6.


Comment 12 RHEL Program Management 2007-09-05 21:43:42 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 14 Vivek Goyal 2008-03-27 23:21:36 UTC
Committed in 68.27.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 18 errata-xmlrpc 2008-07-24 19:12:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0665.html