Description of problem:
When mounting NFS-exported volumes from Solaris 5.x clients using version 2 of the NFS protocol, the mount is aborted on the Solaris client side. Other OS NFS clients do not experience the problem when mounting partitions from the RHEL server over NFS v2. Mounting with NFS v3 avoids the problem, but we have to use NFS version 2 in our environment.

Version-Release number of selected component (if applicable):
2.4.21-9EL kernel

How reproducible:

Steps to Reproduce:
1. Export a directory via NFS from the RHEL server. For example, on my RHEL server the /etc/exports file contains the entry:

    /mn/odin/u1 solarisbox(rw,sync)

2. From solarisbox, attempt to NFS mount the exported partition using version 2 of the NFS protocol:

    mount -o vers=2 linuxrhel:/mn/odin/u1 /mnt
    NFS getattr failed for server linuxrhel: error 9 (RPC: Program/version mismatch)
    nfs mount: mount: /mn/odin/u1: I/O error

The same command completes with no problems on an IRIX 6.5 client. By contrast, from both Solaris and IRIX NFS clients, v3 works:

    mount -o vers=3 linuxrhel:/mn/odin/u1 /mnt

Actual results:

Expected results:

Additional info:
Just a little more info on why I think the NFS response is "incorrect". The problem is probably related to the fact that the Solaris client asks for ACL info on the mount request. The response from RHEL is then along the lines of what you can see below from the Solaris logs:

    solarisbox -> linuxrhel NFS_ACL C GETATTR2 FH=033A
    linuxrhel -> solarisbox RPC R (#23) XID=4242363657 Program number mismatch (low=2, high=3)

Is this the correct response? If I check from the Solaris box via rpcinfo, the RHEL server offers both versions of the NFS protocol:

    solarisbox$ rpcinfo -t linuxrhel 100003
    program 100003 version 2 ready and waiting
    program 100003 version 3 ready and waiting

One of the Solaris admins here tells me that the nfs_acl RPC program code is 100227:

    solarisbox$ rpcinfo -t linuxrhel 100227
    rpcinfo: RPC: Program not registered
    program 100227 is not available

Obviously, I don't have ACL support with version 2, and I also know that this is not standard in NFS. However, the Solaris team here believes that the linuxrhel box should respond along the lines of:

    program 100227 is not available

as opposed to:

    rpcinfo: RPC: Program not registered
    program 100227 is not available

(the earlier part confuses Solaris into thinking that NFS version 2 is not available, and it aborts the mount). They also seem to think that this behavior is peculiar to the RHEL kernel.

So, two questions:

1) Knowing that ACL with version 2 is not standard, do you think that RHEL responds correctly in that case?

2) Would Linux vanilla kernels answer the same thing on the server side of things? (diff indicates that there are various differences amongst the files net/sunrpc/svc.c, net/sunrpc/svcsock.c and include/linux/sunrpc/svcsock.h between 2.4.21-9EL and, for example, the pure 2.4.21.)

I hope that this justifies a bit better the "incorrect behavior" bit.

GM
I have reproduced exactly the same output as above. I was previously using RH 8 to NFS mount to our Sun system and life was good, but following a complete rebuild of my system to RHEL WS 3 the NFS problem mentioned above appeared. I am running 2.4.21-9.0.1EL. This presents a serious problem, as the previous suggestion to change over to pure 2.4.21 necessitates moving to an environment other than RHEL, since pure 2.4.21 is not offered by Red Hat. Is there a patch being considered by RHEL to correct this?

I have logs of the Sun trying to mount to my Linux box, but they seem just to indicate that I am maxing out my number of NFS mounts, which is currently set to 8 by default in the NFS script file. For example:

    nfs mount: mount: /fullup1: I/O error
    mount: /tmp is already mounted, swap is busy, or the allowable number of mount points has been exceeded
    NFS getattr failed for server linuxbox: error 9 (RPC: Program/version mismatch)
This is a Solaris bug. It was identified at this year's Connectathon. The v2 Solaris client probes the server with an NFS_ACL getattr to see if version 2 of the NFS ACL protocol is supported. The RHEL 3 server only supports version 3 of the NFS ACL protocol, so a PROGMISMATCH error is returned, which is the correct thing to do...
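For readers unfamiliar with the distinction, here is a minimal sketch of how an ONC RPC server chooses its reply status. The accept_stat values are those defined in the ONC RPC specification (RFC 5531), where the mismatch error is spelled PROG_MISMATCH. Everything else is an assumption for illustration: `registered` and `accept_status` are hypothetical names, not RHEL kernel code, and the table contents simply follow the description in this comment (NFS v2 and v3, NFS_ACL v3 only).

```python
from enum import IntEnum


class AcceptStat(IntEnum):
    """accept_stat values from the ONC RPC specification (RFC 5531)."""
    SUCCESS = 0
    PROG_UNAVAIL = 1    # program number is not exported at all
    PROG_MISMATCH = 2   # program exists, but not in the requested version
    PROC_UNAVAIL = 3
    GARBAGE_ARGS = 4
    SYSTEM_ERR = 5


# Hypothetical registration table: program number -> supported versions.
# Assumption: follows the comment above (NFS v2/v3, NFS_ACL v3 only).
registered = {
    100003: {2, 3},  # NFS
    100227: {3},     # NFS_ACL
}


def accept_status(prog, vers, table=registered):
    """Return (status, low, high); low/high are set only for PROG_MISMATCH."""
    versions = table.get(prog)
    if versions is None:
        return AcceptStat.PROG_UNAVAIL, None, None
    if vers not in versions:
        # The reply carries the lowest and highest versions the server supports.
        return AcceptStat.PROG_MISMATCH, min(versions), max(versions)
    return AcceptStat.SUCCESS, None, None


# The Solaris v2 client's probe: NFS_ACL (100227), version 2.
print(accept_status(100227, 2))
# The same probe against a server with no NFS_ACL program at all.
print(accept_status(100227, 2, table={100003: {2, 3}}))
```

In this model the first call yields a version mismatch and the second yields "program unavailable", which is the difference between the two server behaviors discussed in this thread.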
> 1) Knowing that ACL with version2 is not standard, do you think that
> RHEL responds correctly in that case?

Yes...

> 2) Would Linux vanilla kernels answer the same thing on the server
> side of things? (diff indicates that there are various differences
> amongst the files

No... the Linux vanilla server would reply with ENOTSUPPORTED, which the Solaris client interprets correctly....

> Is there a patch being considered by RHEL to correct this?

Not at this moment, since it is a Solaris bug and the problem does not happen when using NFS v3...
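To illustrate why the two kinds of reply lead to different outcomes, here is a rough sketch of a client-side handler for the ACL probe. It is purely illustrative: `strict_client` and `tolerant_client` are hypothetical functions, not Solaris source code, and the reply names are stand-ins (the "not supported" style reply is modeled here as PROG_UNAVAIL, which is an assumption). The sketch only models the behavior described in this thread.

```python
# Reply kinds a client may see for its NFS_ACL probe (illustrative names).
SUCCESS = "SUCCESS"                # ACL program offered in the requested version
PROG_UNAVAIL = "PROG_UNAVAIL"      # assumption: stands in for "not supported"
PROG_MISMATCH = "PROG_MISMATCH"    # ACL program offered, but in another version


def strict_client(acl_probe_reply):
    """Models the behavior reported in this thread: a mismatch aborts the mount."""
    if acl_probe_reply == PROG_MISMATCH:
        return "abort mount"
    if acl_probe_reply == SUCCESS:
        return "mount with ACLs"
    return "mount without ACLs"


def tolerant_client(acl_probe_reply):
    """Treats any failed ACL probe as 'no ACL support' and mounts anyway."""
    if acl_probe_reply == SUCCESS:
        return "mount with ACLs"
    return "mount without ACLs"


for reply in (SUCCESS, PROG_UNAVAIL, PROG_MISMATCH):
    print(reply, "->", strict_client(reply), "|", tolerant_client(reply))
```

The only case where the two functions differ is the mismatch reply, which is the case the v2 mount from Solaris runs into against the RHEL 3 server.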
There's an alternative workaround for this. It comes down to how the Sun boxes connect to the NFS share and how they rely on ACLs. It seems that in Red Hat Linux 9 ACLs were enabled by default; in Enterprise 3 they are not. Thus, a generic mount does not provide the kind of info that a Sun box likes to see.

On your Linux box, when mounting the lump to be shared, add the -o acl switch on the command line to turn on ACLs for that mount:

    # mount -t ext3 -o acl /dev/sdx1 /mnt/point

or, inside /etc/fstab, instead of defaults, define read/write with ACLs enabled:

    /dev/sdx1 /mnt/point ext3 rw,acl 1 2

Now set up the NFS shares in the normal way, and Sun boxes will love you forever.

-sf
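As a quick sanity check on the fstab change, the following sketch tests whether an /etc/fstab entry carries a given mount option. The helper name `fstab_has_option` is made up for this example; it only inspects the text of a single entry (the fourth whitespace-separated field holds the comma-separated options) and does not query the running system.

```python
def fstab_has_option(line, option):
    """Return True if an /etc/fstab entry lists `option` in its options field."""
    fields = line.split()
    # Skip comments and malformed lines; the options are the fourth field.
    if len(fields) < 4 or fields[0].startswith("#"):
        return False
    return option in fields[3].split(",")


entry = "/dev/sdx1 /mnt/point ext3 rw,acl 1 2"
print(fstab_has_option(entry, "acl"))                                     # True
print(fstab_has_option("/dev/sdx1 /mnt/point ext3 defaults 1 2", "acl"))  # False
```

Splitting on commas rather than doing a substring search avoids false matches against longer option names that merely contain "acl", such as noacl.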