Description of problem:

Hi all,

While running a series of NFSv3 mount operations, I hit the error "nfs bindresvport: Address already in use". I was mounting NFSv3 over TCP, and after many consecutive mounts the error occurred every time the number of mounted NFS file systems reached about 500, even after I added the "insecure" option to the exports entry on the NFS server.

My system is RHEL5, and the NFSv3 mount operation is implemented in nfs-utils-1.0.9-10.

I have investigated the problem and found that the cause is a limitation of bindresvport() as used in nfsmount. nfsmount() creates a socket for NFS and binds it only to a reserved port. Because bindresvport() can only use port numbers in the range 512-1023, after many consecutive mount operations all the reserved ports end up in TIME_WAIT during the mount storm and are not released in time, so the next mount operation usually fails in bindresvport().

In the NFS server exports configuration, the 'insecure' option allows clients whose NFS implementations do not use a reserved port. So I think the port range can be expanded so that the entire port space can be tried, because a reserved port is not always needed when the 'insecure' option is set.

To address this, I have made the patch below: it lets the socket be bound to a non-reserved port when bindresvport() fails. I have tested it with "insecure" set; with the patch applied, the "nfs bindresvport:" limitation is avoided.
Signed-off-by: ShiChao <shic>

--- nfs-utils-1.0.9-orig/utils/mount/nfsmount.c	2007-01-31 17:12:26.000000000 -0500
+++ nfs-utils-1.0.9/utils/mount/nfsmount.c	2007-02-01 05:13:48.000000000 -0500
@@ -851,6 +851,7 @@
 	time_t t;
 	time_t prevt;
 	time_t timeout;
+	struct sockaddr_in laddr;
 
 	/* The version to try is either specified or 0
 	   In case it is 0 we tell the caller what we tried */
@@ -1139,10 +1140,18 @@
 		perror(_("nfs socket"));
 		goto fail;
 	}
+
 	if (bindresvport(fsock, 0) < 0) {
-		perror(_("nfs bindresvport"));
-		goto fail;
-	}
+		perror(_("nfs bindresvport fail, try a non-reserver port"));
+		laddr.sin_family = AF_INET;
+		laddr.sin_port = 0;
+		laddr.sin_addr.s_addr = htonl(INADDR_ANY);
+		if ( bind(fsock, (struct sockaddr *)&laddr, sizeof(laddr)) < 0 ){
+			perror(_("nfs bind"));
+			goto fail;
+		}
+	}
+
 #ifdef NFS_MOUNT_DEBUG
 	printf(_("using port %d for nfs deamon\n"), nfs_pmap->pm_port);
 #endif

Thanks
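For anyone who wants to try the fallback outside of nfs-utils, the idea in the patch can be sketched as a small standalone helper. This is only an illustration, not the nfs-utils code: the name bind_resv_or_any() is hypothetical, it assumes glibc (which declares bindresvport() in <netinet/in.h>) and an IPv4 socket, and unlike the patch it zeroes the address structure before use.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* glibc declares bindresvport() here */
#include <arpa/inet.h>

/*
 * Hypothetical helper (not part of nfs-utils): try a reserved port first,
 * then fall back to any free port chosen by the kernel.
 * Returns 0 on success, -1 on failure with errno set by bind().
 */
int bind_resv_or_any(int sock)
{
	struct sockaddr_in laddr;

	/* Passing NULL lets bindresvport() pick the local address itself. */
	if (bindresvport(sock, NULL) == 0)
		return 0;

	fprintf(stderr, "bindresvport: %s; trying a non-reserved port\n",
		strerror(errno));

	/* Port 0 asks the kernel for any free port. */
	memset(&laddr, 0, sizeof(laddr));
	laddr.sin_family = AF_INET;
	laddr.sin_port = htons(0);
	laddr.sin_addr.s_addr = htonl(INADDR_ANY);

	return bind(sock, (struct sockaddr *)&laddr, sizeof(laddr));
}
```

A caller would create the socket with socket(AF_INET, SOCK_STREAM, 0) and pass it to the helper before connect(). The fallback is only useful when the server export has 'insecure' set; otherwise the server is expected to reject requests coming from a non-reserved source port.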
I just added an upstream patch to the latest FC6 nfs-utils that should take care of this problem. I know this is a RHEL5 bug as well, but if you could try one of the RPMs in http://people.redhat.com/steved/bz230969 to see if it helps with this problem, it would be much appreciated.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
I have seen this same bug in RHEL4 with the latest updates and patches applied.

uname -a
Linux linuxhost6 2.6.9-55.EL #1 Fri Apr 20 16:35:59 EDT 2007 i686 i686 i386 GNU/Linux

cat /etc/redhat-release
Red Hat Enterprise Linux WS release 4 (Nahant Update 5)

nfs-utils-lib-1.0.6-8
nfs-utils-1.0.6-80.EL4
autofs-4.1.3-199.3

Excerpt from log:
May 22 18:14:30 linuxhost6 automount[9804]: >> nfs bindresvport: Address already in use
May 22 18:14:30 linuxhost6 automount[9804]: mount(nfs): nfs: mount failure nfs_server:/export/array1/lsf on /svr/lsf
May 22 18:14:30 mildsn6 automount[9804]: failed to mount /svr/lsf

Has this been reported elsewhere as a bug in RHEL4 and is there an updated version of nfs-utils which could be applied to this release to test?
This request was previously evaluated by Red Hat Product Management for inclusion in the current Red Hat Enterprise Linux release, but Red Hat was unable to resolve it in time. This request will be reviewed for a future Red Hat Enterprise Linux release.
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days