Bug 154678
Field | Value
---|---
Summary | [Texas Instruments] nfs bindresvport: Address already in use
Product | Red Hat Enterprise Linux 3
Component | kernel
Version | 3.0
Status | CLOSED ERRATA
Severity | medium
Priority | medium
Hardware | All
OS | Linux
Reporter | Issue Tracker <tao>
Assignee | Steve Dickson <steved>
QA Contact | Ben Levenson <benl>
CC | andrew_l_martin, george.liu, jakub, nhorman, petrides, rajeev, tao, wtkeeler
Fixed In Version | RHSA-2005-663
Doc Type | Bug Fix
Last Closed | 2005-09-28 14:54:27 UTC

Description
Issue Tracker
2005-04-13 14:18:20 UTC
Could you please post the output of "netstat -a | grep ^tcp"?

I think there is a reserved port leak in the pmap_getport() routine, which causes things like NIS to unnecessarily use reserved ports to talk to the portmapper.

It appears both glibc and the kernel are misusing the reserved port space. The pmap_getport() and pmap_getmaps() glibc routines and the kernel use reserved ports to communicate with the local or remote portmapper. A reserved port is not needed for these types of queries. As a result of this misuse, the majority of reserved ports are in TIME_WAIT during the mount storm. Also, the port ranges that both glibc and the kernel try can be expanded so that the entire reserved port space can be tried.

Finally, I found that if the mount command retries every 5 seconds, up to 10 times, I was able to get substantially more file systems mounted.

Created attachment 114516 [details]
Kernel Patch
This patch stops reserved ports from being used
on portmap queries and expands the range of reserved ports
that will be tried.
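
The retry behaviour mentioned in the description (retry every 5 seconds, up to 10 times) could be approximated from a wrapper script. This is only a minimal sketch: the retry() helper and its variable names are hypothetical and are not part of the attached patches or of the mount command itself.

```shell
#!/bin/bash
# Hypothetical helper, not taken from the patches in this report.
# Usage: retry MAX_TRIES DELAY_SECONDS command [args...]
# Runs the command until it succeeds or MAX_TRIES attempts have failed.
# Variables are prefixed with retry_ because they are not function-local.
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_attempt=1
    while true; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        # Give up once the attempt budget is used.
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            return 1
        fi
        retry_attempt=$((retry_attempt + 1))
        sleep "$retry_delay"
    done
}

# Example, commented out because it needs root and a real NFS export
# (server:/export and /mnt/point are placeholders):
# retry 10 5 mount -t nfs server:/export /mnt/point
```

Retrying gives sockets stuck in TIME_WAIT a chance to expire, so it only works around the port exhaustion and does not remove the cause.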
Created attachment 114519 [details]
glibc patch
This patch makes pmap_getport() and pmap_getmaps()
use non-reserved ports to do their queries.
This patch also increases the number of reserved ports that will
be tried and causes the entire pool of reserved
ports to be tried on every call.
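
To see whether reserved ports are being consumed the way the description suggests, the netstat output can be summarised with a small filter. The sketch below assumes the usual `netstat -ant` column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state); both function names are made up for illustration.

```shell
#!/bin/bash
# Both functions read `netstat -ant`-style lines on stdin.
# Assumed columns: proto, recv-q, send-q, local address, foreign address, state.

# Count TCP sockets whose local port is reserved (< 1024) and in TIME_WAIT.
count_reserved_time_wait() {
    awk '$1 ~ /^tcp/ && $6 == "TIME_WAIT" {
        n = split($4, local_addr, ":")
        port = local_addr[n] + 0
        if (port > 0 && port < 1024) count++
    }
    END { print count + 0 }'
}

# Count TCP connections to a portmapper (remote port 111) that
# originate from a reserved local port.
count_reserved_portmap_queries() {
    awk '$1 ~ /^tcp/ {
        n = split($4, local_addr, ":")
        m = split($5, remote_addr, ":")
        lport = local_addr[n] + 0
        if (remote_addr[m] == "111" && lport > 0 && lport < 1024) count++
    }
    END { print count + 0 }'
}
```

For example, `netstat -ant | count_reserved_time_wait` prints the number of reserved ports currently tied up in TIME_WAIT. With the glibc patch in place, `netstat -ant | count_reserved_portmap_queries` would be expected to stay at or near zero for userspace portmap queries.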

Changing to kernel component.

A fix for this problem has just been committed to the RHEL3 U6 patch pool this evening (in kernel version 2.4.21-32.8.EL).

A revision to the fix for this problem has just been committed to the RHEL3 U6 patch pool this evening (in kernel version 2.4.21-33.EL).

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2005-663.html

*** Bug 173495 has been marked as a duplicate of this bug. ***

*** Bug 186310 has been marked as a duplicate of this bug. ***

It appears this problem is back in kernel 2.4.21-40.EL. When I updated the kernel, I started seeing a major slowdown when mounting over 800 mounts. /var/log/messages shows the error "nfs bindresvport: Address already in use" over and over. I opened a ticket with Red Hat support, and after they looked into it, I was told to update this ticket with the information. Thanks.

Did you update glibc and utils-linux as well?

I ran up2date and updated everything. My current version of glibc is glibc-2.3.2-95.39. How do I find what version of utils-linux I'm running?

rpm -q utils-linux

I tried that and it said it wasn't installed, so I figured I was doing something wrong. I figured I had to have it, so I started looking around, found util-linux without the "s", and found I have util-linux-2.11y-31.11. Thanks.

Sorry about that... so util-linux-2.11y-31.11 does indeed fix this problem?

No, it isn't fixed. I was just answering your question about what version I was running. Sorry for the confusion.

Looking back at the RPM changelog of util-linux, it appears the fix for this bug went into version 2.11y-31.13.
So either upgrade to 2.11y-31.13 or to the latest version, util-linux-2.11y-31.18.

I just updated to the latest util-linux. I'm at util-linux-2.11y-31.18 and glibc-2.3.2-95.44. The problem still exists. I'm wondering if my problem isn't the same as the one in this bug. I was able to replicate it. The following assumes /scratch is an exported drive on machine ec2090, and it is run on ec2090.

    mkdir /scratch/testdir
    mkdir /scratch/testdir/mountdir
    cd /scratch/testdir
    I=1
    J=1
    while [ $I -lt 1000 ]; do
        mkdir dir$I
        mkdir mountdir/mount$I
        mount ec2090:/scratch/testdir/dir$I /scratch/testdir/mountdir/mount$I
        let I=I+1
    done
    while [ $J -lt 1000 ]; do
        umount /scratch/testdir/mountdir/mount$J
        rm -rf /scratch/testdir/mountdir/mount$J
        rm -rf dir$J
        let J=J+1
    done

When I run this, I'm able to mount almost 500 directories before it starts giving me the error "nfs bindresvport: Address already in use" at the prompt. The exact number it fails on changes each time I run it.

Question: After the script dies, does 'netstat -an | grep 111' show those connections being made from ports > 1024? There was also a fix to the portmap routines in glibc that stops them from using reserved ports (i.e. ports < 1024). I just want to make sure you have that fix as well...

When I do 'netstat -an | grep 111' I get 999 connections with ports from 54434 to 55476. Thanks.

Ok... it appears you have the correct glibc, since none of those connections are on ports < 1024...

Here is the test script I used to get over 100 mounts.
Note: this script assumes there is a directory tree that already exists on the server.

    #!/bin/bash
    MOUNT=mount
    HOST=ppro5
    for i in `seq 1 1020`
    do
        [ ! -d /mnt/$i ] && mkdir /mnt/$i
        $MOUNT -v -t nfs -o tcp $HOST:/home/tree/$i /mnt/$i || exit 1
        ls /mnt/$i;
    done

Please run this script to see how many mounts you get...

Here is the umount script that can be run to clean up the mounts:

    #!/bin/bash
    for i in `seq 1 1020`
    do
        umount /mnt/$i || exit 1;
    done

It gave the error after mount 503.

Any news on this?
It's killing us. We did discover that a similar problem exists in RHEL 4. Thanks.

Andrew, please open a new bug report so that the problem will receive appropriate attention. This bug report is marked CLOSED/ERRATA, since the problem as originally reported here was resolved in U6. If one of the Issue Tracker IDs linked to this BZ is yours, you should also relink it to your new bug report. Thanks in advance. -ernie