Red Hat Bugzilla – Bug 103401
RPC (on boot) port selection collisions between various applications
Last modified: 2016-08-08 23:29:34 EDT
Description of problem:
On boot, ypbind occasionally grabs port 631/udp, blocking CUPS from binding to
the port. This is a glibc problem because ypbind is an RPC service that has its
port assigned dynamically through bindresvport().
The code in libc/sunrpc/bindrsvprt.c shows the port number is assigned purely
based on the PID of the ypbind process, something like
port = (PID % 424) + 600
The PID seems to vary slightly from reboot to reboot, but generally is in the
870s on the machine in question, resulting in ports assigned in the vicinity of
630. CUPS (actually, IPP) has ports 631/tcp and 631/udp reserved. NIS starts
first, so it wins, and since CUPS has a reserved Well Known Port, it can't
relocate and loses.
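The arithmetic above can be checked with a short sketch. It assumes the formula exactly as quoted in this report, (PID % 424) + 600; the function name initial_port is made up for illustration, and this is not the glibc source.

```python
# Sketch of the starting-port formula as quoted in the report.
NPORTS = 424      # number of ports in the 600-1023 range
STARTPORT = 600   # lowest port the formula can produce


def initial_port(pid: int) -> int:
    """Return the port the quoted formula yields for a given PID."""
    return (pid % NPORTS) + STARTPORT


if __name__ == "__main__":
    # PIDs "in the 870s", as observed on the reporter's machine.
    for pid in range(870, 880):
        print(pid, initial_port(pid))
```

Under that assumption, a ypbind PID of 879 lands exactly on 631, and the rest of the 870s land on 622 through 630, which matches the "vicinity of 630" observation.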
Version-Release number of selected component (if applicable):
Depends entirely on the PID handed to ypbind on boot, and the exact set of
services configured affects this.
The glibc algorithm already blacklists all reserved ports below 600, presumably
to avoid this exact problem. Consider altering the code to blacklist 5 to 8
additional ports in the 600-1023 range that are or may be in common use:
631 (IPP == CUPS)
749 (Kerberos V kadmin)
992-995 (SSL-enabled telnet, IMAP, IRC, and POP3)
The ports lost could be recovered, if desired, by allowing ports in the 590-600
range to be assigned by bindrsvprt().
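A minimal sketch of what the proposed blacklist could look like, purely as an illustration of the idea. The names BLACKLIST and candidate_ports are hypothetical, the port list is the one given above, and the wrap-around search is an assumption about how a retry loop might walk the range.

```python
# Hypothetical sketch of the proposed blacklist; not glibc code.
STARTPORT, ENDPORT = 600, 1023
NPORTS = ENDPORT - STARTPORT + 1  # 424

# Ports named in the report as worth skipping.
BLACKLIST = frozenset({631, 749, 992, 993, 994, 995})


def candidate_ports(pid, blacklist=BLACKLIST):
    """Yield every non-blacklisted port in the range exactly once,
    starting from the PID-derived port and wrapping around."""
    first_offset = pid % NPORTS
    for step in range(NPORTS):
        port = STARTPORT + (first_offset + step) % NPORTS
        if port not in blacklist:
            yield port
```

With this sketch, a process whose PID would have produced 631 gets 632 as its first candidate instead, and 631 is never offered at all.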
According to Ulrich Drepper, daemons requiring specific ports in the 600-1023
range need to be started before any daemons using bindresvport.
I think that portmap ought to be taught how to avoid a list of ports.
Otherwise, once I've stopped cupsd (for whatever reason) and restarted ypbind, I
can no longer be sure that cupsd will ever start again.
(see above comment)
If you accept Ulrich's argument that this is a service bug, then this is NOT
just a CUPS bug. It's also a bug against xinetd-2.3.12-0.3E, openldap-2.0.27-9,
and krb5-server-1.2.7-15. ALL of the services in the distribution that I know
about that use Well Known Ports in the 600-1023 range start up AFTER ypbind, not
This problem should be solved in portmap. Here is a proof-of-concept, which I
think should be merged into portmap: http://cyberelk.net/tim/portreserve. In
summary: portmap should read a directory of configuration files, one per daemon,
each containing the name of the service that is out of bounds.
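To make the configuration-directory idea concrete, here is a rough sketch under stated assumptions: one file per daemon, each holding a single service name, resolved through the system services database. The function name reserved_ports and the injectable resolve parameter are illustrative only and are not taken from the portreserve proof-of-concept.

```python
import os
import socket


def reserved_ports(config_dir, resolve=socket.getservbyname):
    """Map each service name found in config_dir to its port number.

    Assumes one service name per file. Names the resolver cannot find
    are skipped, since there is then no port number to reserve.
    """
    ports = {}
    for entry in sorted(os.listdir(config_dir)):
        path = os.path.join(config_dir, entry)
        if not os.path.isfile(path):
            continue
        with open(path) as fh:
            name = fh.read().strip()
        if not name:
            continue
        try:
            ports[name] = resolve(name)
        except OSError:
            continue  # unknown service name
    return ports
```

The resolver is a parameter so the lookup source can be swapped out. The default, socket.getservbyname, consults whatever services database the system is configured to use.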
(I know that bindresvport is the culprit, but what other users of that are there
I'm moving this to RHEL 3 because we're still seeing this in operation
on servers with that version installed.
I don't know what else uses bindresvport, if anything, I just know
that if you have ypserv and CUPS running, there's a pretty decent
chance of a collision on boot which breaks CUPS.
spamassassin's spamd is another possible collision (though not in
RHEL), it uses port 783 by default.
*** Bug 83985 has been marked as a duplicate of this bug. ***
*** Bug 113586 has been marked as a duplicate of this bug. ***
NFS also causes port collisions, more specifically rpc.rquotad.
I just had a collision between rpc.mountd and cups.
Regarding comment #1 (daemons requiring specific ports in the 600-1023
range need to be started before any daemons using bindresvport): if
you consider that ypbind (that uses bindresvport) is often necessary
to obtain user information, host information, service port numbers,
etc, and that starting services that require such information before
it's available won't work, we've got a catch-22 situation. We really
need a better way to reserve ports such that portmap doesn't take them
portreserve looks like a perfect solution for the problem, except that
it can't rely on the ypbind services map, so there's a slight risk that
it might reserve a port based on /etc/services that turns out to be
different in the NIS map, or even that the port number can't be
identified because it's only defined in the services map. I suppose
it's a reasonable requirement to have any ports that are to be reserved
by portreserve defined in /etc/services or some other database that's
available early enough in the boot.
*** Bug 125962 has been marked as a duplicate of this bug. ***
Reassigning to Jakub, the glibc maintainer.
This is no glibc issue. Lacking a better idea, I'll assign it to
initscripts since a solution à la portreserve would be part of
initscripts. Changing bindresvport() is no option since there are no
universally available ports (look at the IANA list). So an external
solution like portreserve is needed. I think it can work nicely and
should not be hard to integrate.
I am seeing a similar issue now on Fedora Core 3 where rpc.mountd grabs port 783
before spamassassin has a chance to grab it.
Need to move nfs init sequence from 60 to 81 to place it after spamassassin to
clear the problem.
Should the port be reserved, or is rearrangement of init order more appropriate?
As an update, and not that it should be a surprise since nothing has been
changed, but we are seeing the exact issue I originally reported (on Red Hat
Linux 9, as I recall) on Red Hat Enterprise Linux 4 as well.
Just want to add that I have seen this on several RHEL4 clients too.
Also, having ypbind start AFTER other services is not an option as
ypbind is needed to see all accounts and many services need to
see those accounts when starting (e.g. imapd for port 993).
Is this ever going to be fixed? This is a widespread problem that will affect almost everyone at some
*** Bug 154800 has been marked as a duplicate of this bug. ***
I would propose to make this issue a release blocker for FC5.
This problem is being considered for a future major release of Red Hat
Enterprise Linux. Red Hat does not currently plan to provide a resolution for
this in a Red Hat Enterprise Linux update for currently deployed systems.
With the goal of minimizing risk of change for deployed systems, and in response
to customer and partner requirements, Red Hat takes a conservative approach when
evaluating changes for inclusion in maintenance updates for currently deployed
products. The primary objectives of update releases are to enable new hardware
platform support and to resolve critical defects.
*** Bug 51904 has been marked as a duplicate of this bug. ***
I reported this some time ago in
glibc seems the obvious place to fix this problem.
And you were also told there it is a bad idea and glibc is not going to change
in this regard.
This will be my only comment to prevent bugspam, but it says nowhere in the bug
that this is a bad idea, or why it is a bad idea. Please can someone elaborate
on this? All these issues are caused by a single point of failure - glibc. It
seems the obvious place to solve the issue. It seems pretty harebrained to have
a special daemon to handle this. All the apps, and any 3rd party ones too, need
to be rewritten to use the daemon. Why couldn't there be a configuration file
stating the ports for bindresvport() to avoid?
(In reply to comment #34)
> I would propose to make this issue a release blocker for FC5.
FYI, I see this problem in FC5 test 1, as reported in 154800 (which was closed
as a duplicate of this bug).
Perhaps we can reopen this and actually work on a fix? Target FC6?
*** Bug 191950 has been marked as a duplicate of this bug. ***
*** Bug 189144 has been marked as a duplicate of this bug. ***
*** Bug 218216 has been marked as a duplicate of this bug. ***
Also see https://bugzilla.novell.com/show_bug.cgi?id=262341
*** Bug 318461 has been marked as a duplicate of this bug. ***
Seems like leaving this in "Closed Deferred" means it will never get worked on.
Is there ever a hope of a fix?
Any bug in RHEL-3 is dead as far as I can tell unless it is a security problem.
This bug is still present in RHEL4U5 as I had a system hit this just a few weeks
ago. It may well be present in rhel5 also, though I haven't verified.
It is present in 5 and 5.1; at least some of the bugs marked as duplicates above
reflect that. #218216 is the same bug in Fedora Core 6 filed against spamd in
spamassassin (port 783), for example, which is effectively RHEL 5 for this purpose.
Also present in Fedora 7, which is what got me digging this old thing up again.
Now the service is "rpcbind" which acquires the ports.
I still haven't seen any technical reasons why portmap couldn't check in a
special directory for files containing reserved ports and not use these. See
comment #5. These files could be installed by RPM.
On RHEL-4 we had practically the same problem. The program that listened on the
port TCP 631 was rpc.statd. As a result cups died on startup with the message:
startListening: Unable to bind socket for address 7f000001:631 - Address
already in use.
While Red Hat decides what is the best way to fix this permanently can Red Hat
offer a workaround?
I just got hit with this on rhel4, rpc.statd took port 631, preventing cups from
I got same problem on Fedora 9. krb5kdc port 750. Please fix this.
The workaround is: use portreserve (see comment #5).
This is already included in Fedora 10, and CUPS in Fedora 10 makes use of it in its initscript.
spamassassin has been mentioned as another service that could make use of portreserve. Changing component and reassigning.
Can portreserve work with RHEL-4/5? If it can, I would be happy to get it into EPEL as a help.
Yes, I don't see any reason it wouldn't work. However, services that want to use it need to modify their initscript (to call 'portrelease').
Packaging it for EPEL might help for third-party applications though, and other EPEL packages.
Can someone enlighten me how portreserve is not a race condition waiting to happen? As far as I can tell, you get portreserve to release the port before starting the service. How do you make sure nothing else gets in there first before the portreserve command and the service-starting command? I suspect it's unlikely that something could slip in between these commands, but is this good programming?
(In reply to comment #75)
> Can someone enlighten me how portreserve is not a race condition waiting to
> happen? As far as I can tell, you get portreserve to release the port before
> starting the service. How do you make sure nothing else gets in there first
> before the portreserve command and the service-starting command? I suspect it's
> unlikely that something could slip in between these commands, but is this good
Without support in glibc/portmap there is no generic way around this.
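The window being discussed can be illustrated with a small sketch. It assumes a Linux-like socket API and uses an unprivileged UDP port on localhost chosen by the kernel rather than a reserved port, so it runs without root. The helper names are made up, and the "intruder" socket merely stands in for any other process that calls bind at the wrong moment.

```python
import socket


def bind_udp(port=0):
    """Bind a UDP socket on localhost; port 0 lets the kernel choose."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind(("127.0.0.1", port))
    except OSError:
        sock.close()
        raise
    return sock


def can_bind(port):
    """Return True if the port could be bound right now."""
    try:
        bind_udp(port).close()
        return True
    except OSError:
        return False


def demo():
    # "portreserve": hold the port so nobody else can bind it.
    holder = bind_udp()
    port = holder.getsockname()[1]
    blocked_while_held = not can_bind(port)

    # "portrelease": the holder lets go just before the daemon starts.
    holder.close()

    # Window: the port is now free for any process. Here an unrelated
    # socket gets there before the real daemon does.
    intruder = bind_udp(port)
    daemon_could_start = can_bind(port)
    intruder.close()
    return blocked_while_held, daemon_could_start
```

While the holder has the port, other binds fail. Once it releases, whoever binds first wins, which is exactly the gap between the portrelease call and the daemon's own bind.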
Have any of the portreserve changes in Fedora 10 made it into the RHEL5.3 release? Or would this wait until RHEL6?
It's not in RHEL-5.3.
Confirming 5.3 has issues. After updating and rebooting 105 clients with a recent kernel patch + others, one had this issue logged in /var/log/cups/error_log:
E [25/Jun/2009:21:11:11 -0400] Unable to bind broadcast socket - Address already in use.
Cups didn't get broadcasted print queues until after restarting cups.
The curious thing is that I hard-bind rpc services; status, nlockmgr, ypbind are listening on *my* chosen ports, none of them conflicting with cups's 631.
We've also had past (but none this time) random issues with rsyncd not being able to bind to its listening port, I'm extrapolating the root cause is the same.
I'm adding the portreserve hack to spamassassin-3.3.0 for RHEL-6. But this is not a complete solution. Arbitrary other services can still cause this failure.
Warren: how so?
The only way I know of that it can fail is after the protected service is stopped. There's no way to close that hole without kernel support.
Some ethernet NICs (broadcom, intel gigabit) have something called
ASF enabled. This causes the NIC to gobble stuff sent to port 623
and port 664! So there are some other ports that we need to exclude.
RPC mounts avoid these ports by using sunrpc.min_resvport which is
set to 665 these days. So the code for mounts won't use ports below
Maybe bindrsvprt.c could be modified to also use sunrpc.min_resvport
In the 623/664 cases there is no collision with another daemon. The
affected NICs lose the packets.
ypmatch and ypcat will hang with do_ypcall: clnt_call: RPC: Timed out
when they pick 623 or 664 on the affected hardware.
sshd (particularly when AllowGroups is used) will also hang on the
affected hardware when it selects 623 or 664 to do an NIS lookup.
A kludge is to use xinetd to grab the 623/664 ports for a workaround.
Just been hit with this again:
# netstat -anp | grep 631
tcp 0 0 0.0.0.0:631 0.0.0.0:* LISTEN 25591/cupsd
udp 0 0 0.0.0.0:631 0.0.0.0:* 24623/rpc.statd
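For anyone checking many machines, the same lookup can be scripted. This is a small sketch that parses netstat -anp style text in the column layout shown above; the function name port_owners is made up, and the sample text is simply the two lines from this comment. It matches the port exactly, whereas grep 631 would also match ports such as 6310.

```python
SAMPLE = """\
tcp 0 0 0.0.0.0:631 0.0.0.0:* LISTEN 25591/cupsd
udp 0 0 0.0.0.0:631 0.0.0.0:* 24623/rpc.statd
"""


def port_owners(netstat_output, port):
    """Return (protocol, program) pairs bound to the given local port."""
    owners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith(("tcp", "udp")):
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port != str(port):
            continue
        # The PID/program column is last; UDP rows have no state column.
        pid_prog = fields[-1]
        program = pid_prog.split("/", 1)[1] if "/" in pid_prog else pid_prog
        owners.append((fields[0], program))
    return owners


if __name__ == "__main__":
    for proto, program in port_owners(SAMPLE, 631):
        print(proto, program)
```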
Whilst portreserve looks like a fix, as I understand it, it is a bit of a hack around the problem. Shouldn't portmapper just be given a list of ports never to use, read from a file that gets appended to as packages add services? Or am I missing something?
See comment #16.
comment #16 doesn't answer why we need portreserve, it just says that there are no free unallocated low ports in the official allocation list. That's true, but shouldn't portmapper have a notion of services THIS system has or needs reserved low and not allocate them.
Or is there just no easy way of allocating a low port in a system call that can exclude ports we want to hold back? Sounds like a new call is called for eventually?
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
(In reply to comment #91)
> Red Hat Enterprise Linux 6.0 is now available and should resolve
> the problem described in this bug report.
It won't since there is no portrelease support in the 6.x ypbind scripts.
It's not ypbind that needs to use portreserve/release, it's everything else that uses ports in the rpc port range.
Dovecot is missing support for portreserve/release (at least in Scientific Linux 6.1).
(In reply to comment #103)
> It's not ypbind that needs to use portreserve/release, it's everything else
> that uses ports in the rpc port range.
What I discovered via portreserve(1) and testing is that when portreserve starts, it listens on all the reserved ports that have been configured. Just before the real daemon starts, it needs to run portrelease so the *real* daemon can listen on it. Look at:
grep portrelease /etc/init.d/*
In the case of ypbind, which can listen either on a random port (unfixable) or a fixed port (via setting OTHER_YPBIND_OPTS). I hard-code to port 900 by setting
To fix ypbind, I see two potential ways - (1) its init script needs to parse out the -p parameter and if you get something, call portrelease on that port. Something like:
ypport=`echo "$OTHER_YPBIND_OPTS" | sed -n 's/.*-p[[:space:]]*\([0-9][0-9]*\).*/\1/p'`
[ -n "$ypport" -a -x /sbin/portrelease ] && /sbin/portrelease $ypport &>/dev/null || :
Or (2), since the user needs to reserve a port by dropping a file into /etc/portreserve/ anyway, force them to use the service named 'ypbind' and just portrelease that:
[ -x /sbin/portrelease ] && /sbin/portrelease ypbind &>/dev/null || :
Hopefully I'll get time to test this today.
Should I open a new bug on RHEL6 for this issue? I searched and could not find one...
I have changed the component to "distribution". Please file a new bug for each affected component and use this one as a tracker.
rsync bug filed as Bug 786076.
This bug exists in RHELv5.x, too. I noticed it when trying to use heartbeat (port 694).
Can we get portreserve backported to RHELv5.x?
Just having portreserve in RHEL-5.X will not solve the issue, you need to update all the packages which already use it in RHEL-6 to achieve same results. And this is IMHO very unlikely.
I wasn't suggesting updating all the RHEL5 packages that would need portreserve, but having the package itself as part of the distro (instead of having to go to some third party repo, which is what I ended up doing) would go a long way to allow a sysadmin solve the problem, IMO.
Even if it made it into EPEL for RHEL5 I think I'd be happy...
I attached this bug to case 00721174, as the new RHS product can (and will) assign a port for every storage brick. In huge systems we have seen this consume all available ports under 1024.
Having just been caught out by this bug in CentOS6 with heartbeat (UDP port 694), and looking at the ASF/IPMI port 623/664 bug mentioned in previous comments, it doesn't seem unreasonable to me to make portmap/rpcbind honour sunrpc.min_resvport/sunrpc.max_resvport as per comment 86.
I presume rpc.mountd honours these proc settings, and the original setting of 600 minimum was modified specifically so as to avoid IPMI hardware clashes. If that's the case, then it would seem rpcbind is STILL vulnerable to a potential clash with IPMI hardware as long as a suitably configured portreserve is not running, even if no actual "real" daemons use those ports. Since there's willingness to avoid NFS mounts clashing, why not "protect" rpcbind with the same tweak?
If this were done, then sysadmins who only have a few clashing services in the 6xx range, but only a modicum of simultaneous RPC services/ports, could avoid the whole mess by simply raising sunrpc.min_resvport to 700, for instance.
In more complex cases, I admit that it might well be necessary to employ portreserve, but the /proc setting might well be sufficient in many cases.
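The effect of such a tweak can be sketched by generalising the PID formula quoted in the original report to a configurable range. This is hypothetical: nothing in this thread says bindresvport reads these settings today, and the function and parameter names are illustrative only.

```python
def initial_port(pid, min_port=600, max_port=1023):
    """PID-derived starting port within [min_port, max_port]."""
    nports = max_port - min_port + 1
    return (pid % nports) + min_port


def reachable_ports(min_port=600, max_port=1023, pids=range(1, 32768)):
    """Set of starting ports the formula can produce over a PID range."""
    return {initial_port(pid, min_port, max_port) for pid in pids}


if __name__ == "__main__":
    for floor in (600, 665, 700):
        ports = reachable_ports(min_port=floor)
        hits = sorted(p for p in (623, 631, 664, 694) if p in ports)
        print("floor", floor, "can still hit", hits)
```

Under this assumption, a floor of 665 (the value mentioned for RPC mounts) keeps 623, 631 and 664 out of reach, and raising it to 700 also clears heartbeat's 694. Ports above the floor, such as 749 or 783, would still need something like portreserve.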
I would just like to add that I have this problem on CentOS6 with ypserv *every* time I reboot a particular server - 6.0 through 6.6. And occasionally on CentOS 5.11 with ypbind/cups.
I haven't seen this solution mentioned in this thread:
Supposedly port numbers added to this file will be skipped over by the
glibc functions that auto-assign reserved ports as used by tools like
Some interesting details here:
Any chance we'll see a fix? Having this issue on RHEL 6.7 with rpc.statd. No port collisions, but in our environment we are required to have specific ports/port ranges defined for a particular service. There doesn't seem to be any way to force rpc.statd to use a particular port rather than follow the pid % 424 + 600 formula.