Bug 114904
Summary: | Incorrect NFS server response of RHEL kernel makes the solaris client abort NFS version 2 mounts | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | George B. Magklaras <georgios> |
Component: | kernel | Assignee: | Steve Dickson <steved> |
Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | anders.odberg, georgios, pere, petrides, riel, tao, t.h.amundsen, vcaruso |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-06-14 15:43:43 UTC | Type: | --- |
Description
George B. Magklaras
2004-02-04 10:18:22 UTC
Just a little more info on why I think the NFS response is "incorrect". The problem is probably related to the fact that the Solaris client asks for ACL info on the mount request. The response from RHEL is then something along the lines of what you can see below from the Solaris logs:

    solarisbox -> linuxrhel NFS_ACL C GETATTR2 FH=033A
    linuxrhel-> solarisbox RPC R (#23) XID=4242363657 Program number mismatch (low=2, high=3)

Is this the correct response? If I check from the Solaris box via rpcinfo, the RHEL server offers both versions of the NFS protocol:

    solarisbox$ rpcinfo -t linuxrhel 100003
    program 100003 version 2 ready and waiting
    program 100003 version 3 ready and waiting

One of the Solaris admins here tells me that the nfs_acl RPC program code is 100227:

    solarisbox$ rpcinfo -t linuxrhel 100227
    rpcinfo: RPC: Program not registered
    program 100227 is not available

Obviously, I don't have ACL support with version 2, and I also know that this is not standard in NFS. However, the Solaris team here believes that the linuxrhel box should respond along the lines of:

    program 100227 is not available

as opposed to:

    rpcinfo: RPC: Program not registered
    program 100227 is not available

(the earlier part confuses Solaris into thinking that NFS version 2 is not available, and it aborts the mount). They also seem to think that this behavior is peculiar to the RHEL kernel. So, two questions:

1) Knowing that ACL with version 2 is not standard, do you think that RHEL responds correctly in that case?

2) Would Linux vanilla kernels answer the same thing on the server side of things? (diff indicates that there are various differences amongst the files net/sunrpc/svc.c, net/sunrpc/svcsock.c and include/linux/sunrpc/svcsock.h between 2.4.21-9EL and, for example, the pure 2.4.21.)

I hope that this justifies a bit better the "incorrect behavior" bit.

GM

I have recreated the exact same output as above. I was previously using RH 8 to NFS mount to our Sun system and life was good.
But following a complete rebuild of my system to RHEL WS 3, the NFS problem mentioned above appeared. I am running 2.4.21-9.0.1EL. This presents a serious problem, as the previous suggestion to change over to pure 2.4.21 necessitates moving to an environment other than RHEL, since pure 2.4.21 is not offered by Red Hat. Is there a patch being considered by RHEL to correct this? I have logs of the Sun trying to mount my Linux box, but they just seem to indicate that I am maxing out my number of NFS mounts, which is currently set to 8 by default in the NFS script file. For example:

    nfs mount: mount: /fullup1: I/O error
    mount: /tmp is already mounted, swap is busy, or the allowable number of mount points has been exceeded
    NFS getattr failed for server linuxbox: error 9 (RPC: Program/version mismatch)

This is a Solaris bug. It was identified at this year's Connectathon. The v2 Solaris client probes the server with an NFS_ACL getattr to see if version 2 of the NFS ACL protocol is supported. The RHEL 3 server only supports version 3 of the NFS ACL protocol, so a PROG_MISMATCH error is returned, which is the correct thing to do.

> 1) Knowing that ACL with version2 is not standard, do you think that
> RHEL responds correctly in that case?

Yes.

> 2) Would Linux vanilla kernels answer the same thing on the server
> side of things? (diff indicates that there are various differences
> amongst the files

No. The vanilla Linux server would reply with ENOTSUPPORTED, which the Solaris client interprets correctly.

> Is there a patch being considered by RHEL to correct this?

Not at this moment, since it is a Solaris bug and the problem does not happen when using NFS v3.

There's an alternative workaround for this. It's down to how the Sun boxes connect to the NFS share and how they rely on ACLs. It seems that in Red Hat Linux 9 ACLs were enabled by default; in Enterprise 3 they are not. Thus, a generic mount does not provide the kind of info that a Sun box likes to see.
On your Linux box, when mounting the filesystem to be shared, add the -o acl switch on the command line to turn on ACLs for that mount:

    # mount -t ext3 -o acl /dev/sdx1 /mnt/point

or, inside /etc/fstab, instead of defaults, define read/write with ACLs enabled:

    /dev/sdx1 /mnt/point ext3 rw,acl 1 2

Now set up the NFS shares in the normal way, and Sun boxes will love you forever.

-sf
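
For readers trying to follow the disagreement in this thread, here is a minimal Python sketch of how a tolerant client might interpret the reply to its NFS_ACL probe. The accept_stat values are the ones defined for accepted ONC RPC replies (RFC 5531), and 100227 is the program number quoted in the report. The function name `acl_probe_outcome`, its parameters, and the returned messages are hypothetical and only illustrate the logic; this is not the actual Solaris or Linux client code.

```python
from enum import IntEnum
from typing import Optional, Tuple

NFS_ACL_PROGRAM = 100227  # program number quoted in the report above


class AcceptStat(IntEnum):
    """accept_stat values of an accepted ONC RPC reply (RFC 5531)."""
    SUCCESS = 0
    PROG_UNAVAIL = 1    # program is not exported by the server at all
    PROG_MISMATCH = 2   # program exists, but not in the requested version
    PROC_UNAVAIL = 3
    GARBAGE_ARGS = 4
    SYSTEM_ERR = 5


def acl_probe_outcome(stat: AcceptStat, wanted: int,
                      offered: Optional[Tuple[int, int]] = None) -> Tuple[bool, str]:
    """Hypothetical helper: decide whether to use NFS_ACL after a probe.

    `offered` is the (low, high) version range carried by a PROG_MISMATCH
    reply. In every failure case the mount itself goes ahead; only the
    optional ACL side protocol is skipped.
    """
    if stat == AcceptStat.SUCCESS:
        return True, f"NFS_ACL v{wanted} available"
    if stat == AcceptStat.PROG_UNAVAIL:
        return False, "NFS_ACL not registered; mount without ACLs"
    if stat == AcceptStat.PROG_MISMATCH:
        low, high = offered if offered else ("?", "?")
        return False, (f"NFS_ACL v{wanted} not offered "
                       f"(server has v{low}-v{high}); mount without ACLs")
    return False, f"probe failed ({stat.name}); mount without ACLs"


if __name__ == "__main__":
    # An NFS v2 client probing a server that only offers NFS_ACL v3.
    print(acl_probe_outcome(AcceptStat.PROG_MISMATCH, 2, (3, 3)))
```

The point of the sketch is that a version mismatch on the ACL side protocol says nothing about whether NFS version 2 itself (program 100003) is available, so it need not abort the mount.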