Bug 63162
Summary: | client's first nfs mount goes stale after manual relocation | |
---|---|---|---
Product: | Red Hat Enterprise Linux 2.1 | Reporter: | Mike McLean <mikem>
Component: | clumanager | Assignee: | Lon Hohberger <lhh>
Status: | CLOSED ERRATA | QA Contact: |
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 2.1 | CC: | tao
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | i386 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2002-10-08 15:33:12 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 70115, 70132 | |
Bug Blocks: | 63033 | |
Description
Mike McLean
2002-04-10 19:15:30 UTC
This problem seems to be (anti-)related to the persistent nfs daemons (bug #63178). The permission-denied error appears when relocating or failing over the "first" time that the client mounts the nfs share. It seems that there is some negotiation between client and nfs server that is lost when shifting from one server to another. Once the client has mounted from both servers, everything is fine, but until then the client's mount only functions when the original node is active. In particular, if you relocate twice (back to the same machine) the mountpoint works again.

When the error occurs, running 'showmount' on the server that controls the nfs service will not show the client in question. Once the client unmounts and remounts, it will appear in showmount on both servers at all times and failovers will go smoothly. Client entries being present in /var/lib/nfs/rmtab seem to be key for correct NFS fail-over, and for proper server recovery on reboot (in a non-cluster). For instance (assume cluc and clud are the cluster nodes, with the service starting on cluc):

```
linda: mount -t nfs clubsvc5:/mnt/nfs0/dir1 /mnt/nfs
linda: ls /mnt/nfs
cluc:  cluadmin -- service relocate nfs0
linda: ls /mnt/nfs                         - ESTALE
cluc:  cluadmin -- service relocate nfs0
linda: ls /mnt/nfs
cluc:  cluadmin -- service relocate nfs0
linda: ls /mnt/nfs                         - ESTALE
cluc:  cp /dev/null /var/lib/nfs/rmtab
cluc:  cluadmin -- service relocate nfs0
linda: ls /mnt/nfs                         - ESTALE
```

Ok, so now the service is on cluc and we've received ESTALE. If we copy the old contents of /var/lib/nfs/rmtab from cluc to clud and relocate the service (to clud), our 'ls' will succeed. Similarly, if we restore /var/lib/nfs/rmtab on cluc and relocate the service back to cluc, 'ls' will again succeed. Needless to say, this presents an interesting problem: either we keep the files in sync (somehow), or we fix it so that the client retries in some manner.

Further testing reveals that this problem is only exhibited for wildcard (and probably netgroup) exports. For example, suppose the nfs client name is client1. If client1 is explicitly itemized when the service is created, then the relocation will be transparent. But if the service is wildcard exported (e.g. *), then in response to the relocation the client will encounter either ESTALE or EPERM (depending on the version of nfs-utils running on the client). In this case, the only way for the client to resume operation is to remount the nfs filesystem.

This boils down to state maintained on the nfs server side; in the case of the relocate, this state isn't there. To comprehensively address this problem would require that the cluster infrastructure get involved with keeping the rmtab state consistent across cluster members. This change could entail modifications to nfs-utils. It's sufficiently broad that it's really not appropriate for the small window we have for the Pensacola release. I propose that we release note along these lines:

- For transparent relocation/failover you need to explicitly itemize the set of authorized clients.
- If you use netgroups or wildcards, in certain circumstances it will be necessary to manually remount the directory on the nfs client systems in response to a relocation/failover.

I've tested with netgroups and the behavior is the same. Summary:

- Exports to explicit hosts/IPs don't need entries in /var/lib/nfs/rmtab.
- Exports to host/IP wildcards and netgroups require entries in /var/lib/nfs/rmtab (see the illustrative exports below).
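To make the distinction concrete, here is a hypothetical pair of entries in generic /etc/exports notation. The path and client name are taken from this report, but the option list is illustrative only; the actual cluster service itemizes its authorized clients when the service is created via cluadmin, not through a hand-edited exports file.

```
# Explicit client: relocation/failover is transparent; the takeover
# node does not need an entry for client1 in /var/lib/nfs/rmtab.
/mnt/nfs0/dir1  client1(rw)

# Wildcard export: the takeover node must already list the client in
# /var/lib/nfs/rmtab, or the client sees ESTALE/EPERM until it remounts.
/mnt/nfs0/dir1  *(rw)
```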
This is how it works: exportfs reads rmtab when run, sending entries in rmtab which match the current export pathname/wildcard up to the kernel. It also sends the export pathname/wildcard to rpc.mountd for authentication. rpc.mountd places entries in rmtab when a mount request is authenticated successfully and removes entries when umount requests are received. This little bit of state is used in the event of a server reboot so that clients can transparently continue working (after a delay, of course). Unfortunately, if this list isn't present on the other node (or if it has different entries), the nfs clients will receive ESTALE.

Fix in pool. Awaiting more testing from different developers before closing.
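For illustration only, here is a minimal sketch of the kind of rmtab synchronization described above, assuming two nodes with root ssh trust between them and the old one-line-per-mount rmtab format (roughly `client1:/mnt/nfs0/dir1` in nfs-utils of this vintage). This is not the fix that went into the pool, just the shape of the problem:

```
#!/bin/sh
# Hypothetical rmtab merge between cluster members (hostnames taken
# from this report); NOT the actual clumanager/nfs-utils fix.
PEER=clud
RMTAB=/var/lib/nfs/rmtab

# Pull the peer's rmtab, merge it with ours (dropping duplicate
# lines), and push the merged list back so both nodes agree.
scp -q root@$PEER:$RMTAB /tmp/rmtab.peer &&
sort -u "$RMTAB" /tmp/rmtab.peer > /tmp/rmtab.merged &&
cp /tmp/rmtab.merged "$RMTAB" &&
scp -q /tmp/rmtab.merged root@$PEER:$RMTAB
rm -f /tmp/rmtab.peer /tmp/rmtab.merged
```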