Description of problem:
Due to a typo in the nfsd code, NFS ACL support is disabled.

Version-Release number of selected component (if applicable):
kernel-2.6.18-69

How reproducible:
All the time.

Steps to Reproduce:
Using an F9 NFS client, mount a RHEL 5.1 server and notice the "svc: unknown version (3)" log message.

Actual results:
The mount succeeds but without ACL support.

Expected results:
The mount succeeds with ACL support.

Additional info:
Created attachment 291988: proposed rhel5 patch
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
*** Bug 426138 has been marked as a duplicate of this bug. ***
Note that the proposed patch here is pretty much what Neil Brown suggested for plain 2.6.19 (as I included last month) but this one doesn't correct the .pg_name field of the structure for the nfs acl stuff from "nfsd" to "nfsacl". That means that any errors logged by the kernel will still show the "nfsd" string and make it slightly harder to diagnose future problems etc. Is there a reason why that part of https://bugzilla.redhat.com/attachment.cgi?id=289919 isn't appropriate? BTW was there a reason for opening a new bugzilla report and marking mine a duplicate of it rather than just putting the new info in that report?
Oh wait I see that your patch has an extra fix in it to correct the same typo at case NFSD_AVAIL as well - assuming that is actually correct...
Created attachment 292377: Updated RHEL patch

Updated the patch to remove the change to the NFSD_AVAIL case and to add the '+ pg_name = "nfsacl",' change.
in 2.6.18-72.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5
In case it wasn't obvious from reading #426138, this is a _regression_ which happened at some point between kernel-2.6.18-8.1.15 and kernel-2.6.18-53.1.4. I'd guess that it was actually introduced as part of update 1 (i.e. with the update to kernel-2.6.18-53), but I may be wrong; I didn't look very closely at the changes in all versions.

If the plan is not to roll this fix out until update 2, then those of us who use NFS will need to either run experimental (well, not fully QA'd) kernels or apply the (fairly small) patch to each new security update kernel, e.g. kernel-2.6.18-53.1.6 and so on.

I suppose one could argue that the lack of *working* ACLs over NFS is actually a security problem, so it ought to be considered more serious than just a random bugfix.
Re comment #8: you mention "removing the change to the NFSD_AVAIL case", but the patch linked in that comment still has that case in. Of course, this is correct, as per http://www.ussg.iu.edu/hypermail/linux/kernel/0701.1/1480.html so please make sure that the final fix for update 2 has the full version of this patch, and not the patch with the NFSD_AVAIL case removed.

Is the severity set to the correct level? NFS servers that have to interact with anything other than other RHEL5 boxes need to run a kernel with this fix to be able to deal with ACLs, which are required for security, and they also require a new kernel for the latest security update. Both conditions can't be satisfied currently (plus there are horrible throughput issues on the working kernel-2.6.18-8.1.15).
Between me starting typing this and submitting it, the severity changed from 'low' to 'high'. I don't know who did that...

The http://www.ussg.iu.edu/hypermail/linux/kernel/0701.1/1480.html message seems to be the same one I pointed at in https://bugzilla.redhat.com/show_bug.cgi?id=426138, i.e. http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-01/msg03478.html

Neither seems to suggest a change for the NFSD_AVAIL case, though I'm personally unclear whether it is actually needed or not. In case that isn't clear, I mean that the bit of https://bugzilla.redhat.com/attachment.cgi?id=291988 which does:

    ...
     case NFSD_AVAIL:
    -        return nfsd_version[vers] != NULL;
    +        return nfsd_versions[vers] != NULL;
     }
    ...

isn't obviously in the patch that Neil Brown was suggesting at that point. It may well be that later he (or someone else) spotted that the same change needed making in that branch too. I'm afraid that I don't have a clue where patch 291988 actually came from. Since it was missing the fix to the pg_name, it obviously wasn't from the same place that I'd already found and reported...

Re the severity, the issues as I see them are that without this fix:

- talking to older Linux or Solaris (etc.) clients causes huge numbers of errors
- talking to newer Linux clients silently disables ACL support

which indeed could have security implications: suddenly people may gain access to stuff that they shouldn't be allowed to see!

Whether this makes it important enough to add this patch to an interim/security update isn't clear. Currently we are forced to build our own kernels based on -53.1.13 with this patch added and the nfs-silly-rename-fix stuff (4 patches) disabled to have reasonable NFS client performance again (i.e. the ones that were suggested to be commented out in https://bugzilla.redhat.com/show_bug.cgi?id=431092 comments #12, #43 etc.).
I tried one of the test kernels from http://people.redhat.com/dzickus/el5/ and it seemed to work under light load but others reported that it wasn't suitable for production use (crashing) so we can't safely deploy that yet... If the changes to fix these problems do make it into update 2 then good but that is probably still some way off, and we might get a little bored of rebuilding kernels (and having to test them ourselves) until then.
Jonathan, you are quite right. I managed to get myself confused (easily done these days) by this maze of twisty patches I had open on my display, all alike. I just verified against a known good 2.6.24 kernel that the NFSD_AVAIL case uses the unpluralised name:

    case NFSD_AVAIL:
            return nfsd_version[vers] != NULL;
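For anyone else tripped up by the two near-identical array names, here is a minimal user-space sketch of why the singular form can be the right one in the NFSD_AVAIL branch. It is not the kernel source. It assumes (this is not stated anywhere in this report) that `nfsd_version[]` lists the versions that are built in while `nfsd_versions[]` lists the ones currently enabled. The `struct version` type, the `v2`/`v3` objects and the NFSD_SET, NFSD_CLEAR and NFSD_TEST operations are hypothetical stand-ins added for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-version descriptor. */
struct version { const char *name; };

/* Only NFSD_AVAIL appears in this report; the other ops are assumed. */
enum vers_op { NFSD_SET, NFSD_CLEAR, NFSD_TEST, NFSD_AVAIL };

static struct version v2 = { "v2" };
static struct version v3 = { "v3" };

/* Singular name: versions that are built in (assumed meaning). */
static struct version *nfsd_version[] = { NULL, NULL, &v2, &v3 };

#define MINVERS 2
#define NRVERS  ((int)(sizeof(nfsd_version) / sizeof(nfsd_version[0])))

/* Plural name: versions currently enabled (assumed meaning). */
static struct version *nfsd_versions[sizeof(nfsd_version) / sizeof(nfsd_version[0])];

/* Returns -1 for an out-of-range version, otherwise 0 or 1. */
static int nfsd_vers(int vers, enum vers_op op)
{
    if (vers < MINVERS || vers >= NRVERS)
        return -1;

    switch (op) {
    case NFSD_SET:      /* enable: copy the built-in entry across */
        nfsd_versions[vers] = nfsd_version[vers];
        break;
    case NFSD_CLEAR:    /* disable: drop it from the enabled table */
        nfsd_versions[vers] = NULL;
        break;
    case NFSD_TEST:     /* is it enabled right now? plural table */
        return nfsd_versions[vers] != NULL;
    case NFSD_AVAIL:    /* is it built in at all? singular table */
        return nfsd_version[vers] != NULL;
    }
    return 0;
}
```

Under those assumptions, `nfsd_vers(3, NFSD_AVAIL)` is true even before version 3 has been enabled, whereas pluralising the name in that branch would make it answer the same question as NFSD_TEST. That would match the 2.6.24 code quoted above, but the real semantics should be checked against the kernel tree.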
*** Bug 431877 has been marked as a duplicate of this bug. ***
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html