If a mount request comes in from an IP address whose subnet's DNS server is down or not running named, rpc.mountd freezes or slows down so much that all requests time out (even requests from IP addresses that resolve properly).

This problem does NOT occur if the DNS server is running, even if the IP address does not resolve to a name. In other words, the problem seems to occur only if the (delegated) name server for the NFS client's subnet is not running.

I have placed a gzipped tar file with an strace of rpc.mountd and the output of a couple of test scripts at: http://intranet.redhat.com/~kambiz/

The tar file contains the following:

NFS-DNS-Issue/
NFS-DNS-Issue/slowmount.sh
NFS-DNS-Issue/NFS_Server_mountd_strace
NFS-DNS-Issue/bouncemount.sh
NFS-DNS-Issue/named.conf
NFS-DNS-Issue/notes
NFS-DNS-Issue/NFS_Client_mount_unmount_via_good_interface
NFS-DNS-Issue/NFS_Client_mount_unmount_via_bad_interface
NFS-DNS-Issue/168.192.in-addr.arpa

The .sh files are my test scripts. The NFS_* files are text files with lots of information to help demonstrate this problem.

Kambiz
OK. I just ran my test on a 7.0 beta system (6.9.2 w/ 2.2.16 kernel and nfs-utils) in the test lab, and the problem reproduces identically. I've added an additional tgz file to my intranet page (http://intranet.redhat.com/~kambiz/) called stuff.tgz, which contains the named configuration to go along with the named.conf file in the original tgz file.
assigned to johnsonm
rpc.mountd uses the well-defined gethostbyXXX() routines to resolve addresses and hostnames. Since these routines are subject to long resolver timeouts, there is not much rpc.mountd can do to eliminate this problem. To avoid it, run multiple nameds so that a name server for the client's subnet is always available.
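
For illustration only, here is a minimal Python sketch of why a blocking reverse lookup stalls a single-threaded server and how a caller could bound the wait. This is not rpc.mountd's code (which is C); the names reverse_lookup, resolver, and timeout are hypothetical, and the thread-pool approach is just one possible way to cap the delay, not something nfs-utils is known to do.

```python
import socket
from concurrent.futures import ThreadPoolExecutor, TimeoutError as LookupTimeout

# Shared worker pool (hypothetical design): a stuck lookup ties up one
# worker thread, but the caller stops waiting once the timeout expires.
_pool = ThreadPoolExecutor(max_workers=4)


def reverse_lookup(ip, timeout=2.0, resolver=socket.gethostbyaddr):
    """Return the primary hostname for ip, or None on failure or timeout.

    resolver is injectable so the behaviour can be exercised without
    touching real DNS; it defaults to the blocking socket.gethostbyaddr.
    """
    future = _pool.submit(resolver, ip)
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        return future.result(timeout=timeout)[0]
    except LookupTimeout:
        # The name server did not answer in time; give up on this client
        # instead of making every other request wait behind it.
        return None
    except (socket.herror, socket.gaierror):
        # The resolver answered, but the address has no name.
        return None
```

Calling reverse_lookup("192.168.1.10") either returns a hostname or returns None after at most about two seconds. A plain socket.gethostbyaddr call, by contrast, blocks for as long as the system resolver keeps retrying, which matches the stall described above.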