When an NFSv4 client issues a READDIR and the server hits an error while getting attributes for a directory entry, the server returns that error for the entire READDIR call. This isn't optimal, so we can and should set the "fattr4_rdattr_error" attribute bit in the request to tell the server to carry on and report the error only on the problem directory entries.
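To make the difference concrete, here is a minimal sketch of the two behaviours described above. It is a toy model, not the nfsd code: the `Entry` class, the `readdir` function and the `want_rdattr_error` parameter are all hypothetical names made up for illustration. Only the status value 10019 (NFS4ERR_MOVED) comes from the protocol.

```python
from dataclasses import dataclass
from typing import List, Tuple

NFS4_OK = 0
NFS4ERR_MOVED = 10019  # NFSv4 status for a filesystem that lives elsewhere


@dataclass
class Entry:
    """Hypothetical directory entry; attr_error is the status the server
    would hit while fetching this entry's attributes."""
    name: str
    attr_error: int = NFS4_OK


def readdir(entries: List[Entry],
            want_rdattr_error: bool) -> Tuple[int, List[Tuple[str, int]]]:
    """Toy model of a server answering READDIR.

    Returns (call_status, [(name, per_entry_status), ...]).
    """
    results = []
    for entry in entries:
        if entry.attr_error != NFS4_OK and not want_rdattr_error:
            # Client did not ask for rdattr_error: one bad entry
            # fails the whole call and no entries are returned.
            return entry.attr_error, []
        # Otherwise the error (if any) is reported on this entry only.
        results.append((entry.name, entry.attr_error))
    return NFS4_OK, results


directory = [Entry("file1"), Entry("fsloc", NFS4ERR_MOVED)]
print(readdir(directory, want_rdattr_error=False))
print(readdir(directory, want_rdattr_error=True))
```

With `want_rdattr_error=False` the single bad entry takes down the whole listing. With it set, the call succeeds and only `fsloc` carries the error.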
Created attachment 147529 [details] patch -- Make READDIR use mounted_on_fileid rather than regular fileid This patch makes the next one apply cleanly (and looks like it also fixes another potential bug).
Created attachment 147530 [details] patch 2 - set and handle fattr_rdattr_error attribute This patch fixes the problem by allowing the server to return a more granular mix of valid results and errors.
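For readers not familiar with how these attributes are requested: NFSv4 attribute requests are a bitmap of 32-bit words, where attribute number n is bit (n mod 32) of word (n div 32). The sketch below shows which bits the attributes touched by the two patches would occupy. The attribute numbers (rdattr_error = 11, fileid = 20, mounted_on_fileid = 55) are taken from the NFSv4 spec; the `attr_bitmap` helper is a hypothetical name for illustration and is not code from either attachment.

```python
from typing import List

# Attribute numbers as assigned in the NFSv4 specification.
FATTR4_RDATTR_ERROR = 11
FATTR4_FILEID = 20
FATTR4_MOUNTED_ON_FILEID = 55


def attr_bitmap(*attr_ids: int) -> List[int]:
    """Build a two-word NFSv4 attribute bitmap (hypothetical helper)."""
    words = [0, 0]
    for attr in attr_ids:
        # Attribute n lives in word n // 32, at bit n % 32.
        words[attr // 32] |= 1 << (attr % 32)
    return words


# A READDIR that asks only for the regular fileid.
print([hex(w) for w in attr_bitmap(FATTR4_FILEID)])

# A READDIR that asks for mounted_on_fileid and rdattr_error instead.
print([hex(w) for w in attr_bitmap(FATTR4_RDATTR_ERROR,
                                   FATTR4_MOUNTED_ON_FILEID)])
```

Note that mounted_on_fileid lands in the second bitmap word (bit 23), so a request that includes it needs a two-word bitmap.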
Was there a reproducer for this problem?
Now that I think we have a resolution on bz228893, I think I might be able to get one. The way I reproduced it at connectathon was mounting a nfs4 directory that contained a referral export within it (hopefully I have the terminology correct here). When trying to do a readdirplus on the directory that contained the referral, the client would get an error back on the entire directory because this wasn't set. I was hoping I might be able to set up a similar situation by having a v4 root dir that exports a mix of directories to krb5 and krb5i exclusively, but haven't had time to reproduce it as of yet.
To reproduce, you'll need a Fedora 7 server (the server-side referral bits aren't yet in RHEL5):

# mkdir /export
# mkdir /export/fsloc
# mount --bind /export/fsloc /export/fsloc
# cat /etc/exports
/export *(rw,nohide,insecure,fsid=0)
/export/fsloc *(ro,nohide,insecure,refer=/foo.10.10)
# service nfs start

(server path and address don't really matter here since RHEL4 can't chase the referral anyway)...

On the client:

# mkdir -p /mnt/referral
# mount -t nfs4 server:/ /mnt/referral
# ls -l /mnt/referral
# ls -l /mnt/referral
ls: reading directory /mnt/referral: Input/output error
total 0

...you'll also get this in the ring buffer:

nfs4_map_errors could not handle NFSv4 error 10019

...the expectation is that you'll be able to list the contents of the directory, though the READDIRPLUS entry for the referral will come back with an error.
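For context on the ring buffer message: NFSv4 status 10019 is NFS4ERR_MOVED, which is what a server returns for a filesystem that lives elsewhere (a referral). The sketch below is a rough Python model of how I understand the client's error mapping to behave, where any NFSv4 status it does not recognise is logged and collapsed to -EIO. That would explain why `ls` reports "Input/output error". It is an approximation written for illustration, not the kernel source, and the threshold of 1000 is an assumption based on my reading of that function.

```python
import errno

NFS4ERR_MOVED = 10019  # NFSv4 status: filesystem has moved / is a referral


def nfs4_map_errors(err: int) -> int:
    """Approximate model of the client's error mapping.

    Errors are negative. Anything below -1000 is assumed to be an NFSv4
    status with no errno equivalent, so it is logged and turned into -EIO.
    """
    if err < -1000:
        print(f"nfs4_map_errors could not handle NFSv4 error {-err}")
        return -errno.EIO
    # Ordinary errno values pass through unchanged.
    return err


# The referral entry makes the server answer NFS4ERR_MOVED, which the
# caller ends up seeing as a plain I/O error.
result = nfs4_map_errors(-NFS4ERR_MOVED)
print(result == -errno.EIO)
```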
Actually, given my reproducer, only the first patch here is needed to fix this. I'll go ahead and propose that and hold off on the other one since I don't seem to have a situation that actually requires rdattr_error.
Expected results (with an empty file

# ls -al /mnt/referral
total 16
drwxr-xr-x 3 root root 4096 Apr 23 15:32 .
drwxr-xr-x 12 root root 4096 Apr 20 11:58 ..
?--------- ? ? ? ? ? fsloc

...the directory entries are listable, but attempting to stat fsloc gives back a -EIO. I think this is the best we can do for RHEL4. Backporting the referral-chasing code is probably too much.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
This request was evaluated by Red Hat Kernel Team for inclusion in a Red Hat Enterprise Linux maintenance release, and has moved to bugzilla status POST.
Moving to 4.7. This patch was less critical, and we already had a lot of NFS patches for 4.6.
Committed in 68.27.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-0665.html