Bug 155470
Summary: LTC12834-"nfs bindresvport: Address already in use" messages for mounting
Product: [Fedora] Fedora
Component: nfs-utils
Version: 3
Hardware: All
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: medium
Reporter: Jeff Moyer <jmoyer>
Assignee: Steve Dickson <steved>
QA Contact: Ben Levenson <benl>
CC: anderson, bjohnson, cel, davej, jakub, jturner, kanderso, kzak, stevec, tao
Doc Type: Bug Fix
Last Closed: 2005-05-18 18:19:07 UTC
Bug Depends On: 141773

Description
Jeff Moyer
2005-04-20 17:31:21 UTC
Pasting in comment from bz #146629:

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020

Description of problem:
On a dual-Xeon 4GB server running FC3 I am unable to successfully issue more than 52 automount requests to unique Linux (also FC3) NFS servers in less than a minute. After 52 mounts I get the following error. This is reproducible.

Jan 30 10:40:22 beowulf automount[22882]: >> nfs bindresvport: Address already in use
Jan 30 10:40:22 beowulf automount[22882]: mount(nfs): nfs: mount failure node54:/usr1 on /data/node54
Jan 30 10:40:22 beowulf automount[22882]: failed to mount /data/node54

After waiting ~1 minute I am able to mount another 52 filesystems before getting the same error again. This problem does not occur on a RedHat 9 client machine nor on a Solaris 9 client.

This limitation is breaking an application on a large Beowulf cluster that cross-mounts data between 290 GigE attached nodes.

I have also reproduced this problem with a script that explicitly calls /bin/mount rather than relying on automount.

Version-Release number of selected component (if applicable):
autofs-4.1.3-28

How reproducible:
Always

Steps to Reproduce:
1. Set up /etc/auto.data to have /data/node* point to 290 Linux NFS servers.
2. umount /data/node*
3. ls -l /data/node*/known_file

Actual Results:
After 52 successful mounts and ls results, the following shows up in /var/log/messages:

Jan 30 19:42:01 beowulf automount[23509]: >> nfs bindresvport: Address already in use
Jan 30 19:42:01 beowulf automount[23509]: mount(nfs): nfs: mount failure node53:/usr1 on /data/node53
Jan 30 19:42:01 beowulf automount[23509]: failed to mount /data/node53
... (for the remaining nodes)

Expected Results:
The ls output for 290 files.
Additional info:

Another cut-n-paste from 146629:

Comment #17 From Stuart Anderson (anderson.edu) on 2005-04-12 15:39 EST

I upgraded to util-linux-2.12a-23 on the client side only (as was the case for autofs; if I need to update anything on the NFS servers please let me know), and I now typically get 205-260 mounts before the bindresvport error. Note, in my case, each mount request is to a distinct server.

And more cut-n-paste from 146629:

Comment #24 From Stuart Anderson (anderson.edu) on 2005-04-20 12:23 EST

Many thanks for your help. I have also been tracking the corresponding static mount problem, https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=141773 and I just reproduced it again today. In particular, after adding the 290 mount points to /etc/fstab and then running

grep | xargs mount

I am able to get 237 successful mounts before finding,

nfs bindresvport: Address already in use
nfs bindresvport: Address already in use
nfs bindresvport: Address already in use
...

for the remainder of the 290 requests.

$ rpm -qf /bin/mount
util-linux-2.12a-23

Could you please post the output of "netstat -a | grep ^tcp"? I think there is a reserved port leak in the pmap_getport() routine which causes things like NIS to unnecessarily use reserved ports to talk to the portmapper.

Where exactly do you think there is a leak? The pmap_getport code is pretty straightforward, and I can't find any leaks in it. Do you have some empirical evidence (i.e. a unit test) to prove this? Now, pmap_getport definitely calls clnttcp_create with a socket of RPC_ANYSOCK, which causes a new socket to be created and bound to a reserved port. The socket is closed before returning from pmap_getport, though, by way of calling CLNT_DESTROY.

Created attachment 113579
output of netstat -a | grep ^tcp after failing to mount
I have removed the unimportant entries associated with ssh connections
and a few other non-NFS related services.
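
For readers who want to tally output of this kind themselves, here is a minimal sketch in Python. It assumes the usual six-column layout of "netstat -a" TCP lines (proto, recv-q, send-q, local address, foreign address, state) and counts TIME_WAIT entries whose local port is below 1024, and how many of those point at the portmapper. The function name tally_time_wait and the sample lines are hypothetical; the sample is made up for illustration and is not taken from the attachment.

```python
def tally_time_wait(netstat_lines, portmap_names=("sunrpc", "111")):
    """Count TIME_WAIT TCP entries that hold a reserved local port.

    Returns (reserved_total, reserved_to_portmap).
    """
    reserved_total = 0
    reserved_to_portmap = 0
    for line in netstat_lines:
        fields = line.split()
        # Assumed columns: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "TIME_WAIT":
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        remote_port = fields[4].rsplit(":", 1)[-1]
        # Assumption: local ports that netstat printed as service names
        # are skipped; only numeric ports below 1024 are counted.
        if not local_port.isdigit() or int(local_port) >= 1024:
            continue
        reserved_total += 1
        if remote_port in portmap_names:
            reserved_to_portmap += 1
    return reserved_total, reserved_to_portmap


# Made-up sample lines in netstat's column layout (not from the attachment).
sample = [
    "tcp        0      0 beowulf:798     node54:sunrpc   TIME_WAIT",
    "tcp        0      0 beowulf:801     node54:nfs      ESTABLISHED",
    "tcp        0      0 beowulf:802     node55:sunrpc   TIME_WAIT",
    "tcp        0      0 beowulf:803     node55:32771    TIME_WAIT",
    "tcp        0      0 beowulf:33012   node56:sunrpc   TIME_WAIT",
]
print(tally_time_wait(sample))
```

Dividing the second number by the first gives the share of reserved-port TIME_WAIT entries that are portmapper queries. Running netstat with -n as well would avoid the service-name ambiguity on the local side.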
Well, as the netstat trace clearly shows (in Comment #6), about 40% of the TCP connections that are in TIME_WAIT are from portmap requests. Since a reserved port is *not* needed to make portmap requests, those ports are definitely a waste of reserved port space. Any ideas on how to get rid of the reserved port TIME_WAIT portmap requests? Or, for that matter, is there a reason any portmap connections should be left in TIME_WAIT?
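
On the TIME_WAIT question, some general TCP background may help: the side that closes a connection first keeps its local address and port in TIME_WAIT for a while, so a client that binds a local port, makes one short query and closes the socket leaves that port unavailable even though the descriptor is gone. The sketch below illustrates this over loopback in Python. It is not the pmap_getport code; it uses an ordinary kernel-chosen port in place of a reserved one so it can run without root, and the function name port_busy_after_close is hypothetical.

```python
import errno
import socket


def port_busy_after_close(host="127.0.0.1"):
    """Return True if the client's local port cannot be re-bound after close()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))
    server.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.bind((host, 0))  # stands in for the bind-to-reserved-port step
    local_port = client.getsockname()[1]
    client.connect(server.getsockname())
    accepted, _ = server.accept()

    client.close()    # active close: the client side keeps the port
    accepted.recv(1)  # returns b"" once the client's FIN has arrived
    accepted.close()
    server.close()

    # Try to reuse the same local port without SO_REUSEADDR.
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, local_port))
        return False
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise
    finally:
        probe.close()


if __name__ == "__main__":
    print(port_busy_after_close())
```

On a Linux client the re-bind is expected to fail with "Address already in use" until the old connection has left TIME_WAIT. That is consistent with the roughly one-minute recovery described in this report, though the sketch does not prove that this is the mechanism at work there. Letting the kernel pick an unprivileged source port for portmapper queries would keep those short-lived connections out of the reserved range.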