Description of problem:

Version-Release number of selected component (if applicable):
kernel-2.6.18-53.1.4.el5 (tested on both x86_64 and i686).

How reproducible:
100%

Steps to Reproduce:
1. On an EL3 machine running 2.4.21-53.EL, mount an NFS filesystem from the server and do an ls -l
2. e.g. mount zex:/local/scratch/ /mnt/testing/
3. ls -al /mnt/testing/test

Actual results:
$ ls -al /mnt/testing/test
ls: /mnt/testing/test: Input/output error
-rw-r--r-- 1 jp107 other 367 Dec 18 16:24 /mnt/testing/test

Expected results:
$ ls -al /mnt/testing/test
-rw-r--r-- 1 jp107 other 367 Dec 18 16:24 /mnt/testing/test

Additional info:
This seems to be caused by a bug in the NFS ACL support code, and mounting without ACLs makes the error go away, e.g.

umount /mnt/testing
mount -o noacl zex:/local/scratch/ /mnt/testing/
ls -al /mnt/testing/test

doesn't show the problem.

The problem also doesn't seem to happen when using newer (2.6) kernels, so EL4/EL5 clients aren't affected. This is using v3 mounts, btw.

Looking at the logs on the server and tracing the traffic suggested the ACL issue, and a few web searches turned up a discussion on the kernel list of a similar-sounding issue in (plain) 2.6.19.

http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-01/msg03478.html (for example) describes a fix which ought to help, and adding something equivalent to the kernel-2.6.18-53.1.4.el5 srpm does seem to fix it for my simple tests.

Without that patch it seems not to fill in the nfsd_acl_versions[vers] fields, so it probably doesn't handle ACLs at all as far as I can follow, but you probably understand this stuff much better than I do!

I'll attach the patch I used to test. I re-did the patch from his post, but it should contain just the two changes he suggests. I don't know whether that made it into the plain kernel tree or not...
Created attachment 289919
Neil Brown suggested essentially this fix for 2.6.19
I should add that the EIO errors are not produced when the server is running kernel-2.6.18-8.1.15 or earlier. I didn't test the kernels between that and kernel-2.6.18-53.1.4, so I don't know when this regression appeared.

-- Jon
Solaris 7 and 8 clients also show this problem:

# ls -l ~/powercut.prod
NFS getacl failed for server moa: error 9 (RPC: Program/version mismatch)
-rw-r--r-- 1 werdna staff 29 Nov 28 14:53 /homes/werdna/powercut.prod
# getfacl ~/powercut.prod
NFS getacl failed for server moa: error 9 (RPC: Program/version mismatch)
/homes/werdna/powercut.prod: failed to get acl count
#
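
The "Program/version mismatch" reply suggests the server is not accepting version 3 of the NFSACL side protocol (RPC program 100227). One rough way to look at this from a client might be to check which NFSACL versions the server's portmapper lists in the output of rpcinfo -p <server>. The Python sketch below only parses such output. The helper name nfsacl_versions and the sample text are illustrative assumptions, not output captured from zex or moa, and portmapper registrations may not reflect this bug exactly, so treat a result as a hint rather than a diagnosis.

```python
NFSACL_PROGRAM = 100227  # RPC program number of the NFS ACL side protocol


def nfsacl_versions(rpcinfo_output):
    """Return the sorted NFSACL versions listed in `rpcinfo -p` style output.

    Hypothetical helper for illustration; it does not contact any server.
    """
    versions = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port [service]
        if len(fields) < 4 or not fields[0].isdigit() or not fields[1].isdigit():
            continue  # skip the header row and anything malformed
        if int(fields[0]) == NFSACL_PROGRAM:
            versions.add(int(fields[1]))
    return sorted(versions)


# Illustrative sample only (assumed layout of `rpcinfo -p` output).
SAMPLE = """\
   program vers proto   port
    100000    2   tcp    111  portmapper
    100003    3   udp   2049  nfs
    100227    3   udp   2049  nfs_acl
"""

if __name__ == "__main__":
    print(nfsacl_versions(SAMPLE))
```

To use it against a real server, save the output of rpcinfo -p <server> to a file and pass the file's contents to nfsacl_versions. An empty list, or a list without 3, would be consistent with v3 clients failing their GETACL calls.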
*** This bug has been marked as a duplicate of 429109 ***