Description of problem:
Resolving via NIS as root uses a disproportionate number of TCP connections from ports in the reserved range (below 1024). This can break the server (NFS) and any other machine in several ways (running out of privileged ports, conflicts with ports already in use). In some cases the issue can be mitigated through the use of nscd.

When exporting a number (20) of mounts over NFS (version 3, UDP only) to a number (60) of clients via a NIS netgroup, the server startup broke because the many lookups involved consumed way too many privileged ports:

Aug 8 16:46:13 nfs5 kernel: svc: failed to register lockdv1 RPC service (errno 5).
Aug 8 16:46:13 nfs5 kernel: lockd_up: makesock failed, error=-5
Aug 8 18:28:53 nfs5 kernel: svc: failed to register nfsdv2 RPC service (errno 5).
Aug 8 18:28:53 nfs5 kernel: svc: failed to register nfsaclv2 RPC service (errno 5).
Aug 8 18:28:53 nfs5 kernel: nfsd: last server has exited, flushing export cache
Aug 8 18:28:53 nfs5 xinetd[3298]: bind failed (Address already in use (errno = 98)). service = login
Aug 8 18:28:53 nfs5 xinetd[3298]: bind failed (Address already in use (errno = 98)). service = shell
Aug 8 18:28:53 nfs5 xinetd[3298]: bind failed (Address already in use (errno = 98)). service = rsync

This is all due to an excessive number of privileged TCP ports being tied up in connections in TIME_WAIT state. This issue can be mitigated by an early
echo 1 >/proc/sys/net/ipv4/tcp_fin_timeout
during startup (maybe '0' works even better, not sure).

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Install and configure ypbind, do not run nscd. Match "netstat -ant" output against "rpcinfo -p" output after each of the following (do this as "root"):
- ls -l causing many user/group lookups
- service nfslock start
- service nfs start
- rsh as non-root

Rsh is an interesting case: it requires a privileged port itself, but it also uses many of them for NIS lookups for no reason. It should use unprivileged ports for those.
Actual results:
Lots of TCP connections in TIME_WAIT from privileged TCP ports to the ypbind TCP port registered at rpcbind.

Expected results:

Additional info:
This could be considered an RPC (glibc/sunrpc) issue. Maybe it will resurface in libtirpc (which has an /etc/netconfig, however).

Things to consider:
- ypbind UDP-only option.
- local RPC should always use UDP.
- RPC by root: Don't use privileged ports unless explicitly asked for.
- separate TCP FIN timeout (=zero) for local TCP connects.
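For anyone wanting to repeat the "netstat -ant" check, the counting can be sketched as below. This is only a rough helper, not something taken from the report: it assumes the usual six-column "netstat -ant" layout, and the sample text (including port 834 standing in for the ypbind TCP port) is made up for illustration. The remote ports it reports still have to be matched against "rpcinfo -p" by hand.

```python
from collections import Counter

PRIVILEGED_LIMIT = 1024  # ports below this are in the reserved range


def port_of(address):
    """Return the port of a netstat 'host:port' column, or None for '*'."""
    port = address.rsplit(":", 1)[-1]
    return int(port) if port.isdigit() else None


def privileged_time_wait(netstat_text):
    """Count TIME_WAIT sockets with a privileged local port, per remote port."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_port = port_of(fields[3])
        remote_port = port_of(fields[4])
        state = fields[5]
        if (state == "TIME_WAIT" and local_port is not None
                and local_port < PRIVILEGED_LIMIT):
            counts[remote_port] += 1
    return counts


# Made-up sample; 834 is a hypothetical ypbind TCP port.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:1021          127.0.0.1:834           TIME_WAIT
tcp        0      0 127.0.0.1:1022          127.0.0.1:834           TIME_WAIT
tcp        0      0 127.0.0.1:40112         127.0.0.1:834           TIME_WAIT
"""

if __name__ == "__main__":
    for remote_port, count in privileged_time_wait(SAMPLE).most_common():
        print(remote_port, count)
```

On a real machine the text would come from running "netstat -ant" as root right after each of the steps listed above.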
- A UDP-only ypbind is not an option (too many dependencies on TCP already).
- The tcp_fin_timeout trick is not very effective and is the wrong approach anyway.
- It is an F14 glibc/sunrpc scalability issue: privileged (TCP) ports are a precious resource, especially when they linger in TIME_WAIT for 60s.
- There is also an exportfs (nfs-utils) scalability issue w.r.t. netgroups. Truncating /var/lib/nfs/rmtab before starting nfs is a server workaround.
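To put rough numbers on the scalability point, here is a back-of-the-envelope sketch. It rests on two assumptions that are not established in this report: that every port from 1 to 1023 is available for outgoing connections (in practice fewer are, since services listen on some of them), and that each (mount, client) pair costs one TCP connection during NFS startup.

```python
def connections_needed(mounts, clients):
    """Assumption: one TCP connection per (mount, client) lookup."""
    return mounts * clients


def max_sustained_rate(ports, time_wait_s):
    """Connections per second that can be sustained when each closed
    connection keeps its local port busy for time_wait_s seconds."""
    return ports / time_wait_s


if __name__ == "__main__":
    ports = 1023        # upper bound: every port below 1024
    time_wait_s = 60    # TIME_WAIT duration mentioned above
    needed = connections_needed(20, 60)
    print("connections needed at startup:", needed)
    print("privileged ports available (at most):", ports)
    print("sustainable rate (conn/s): %.2f" % max_sustained_rate(ports, time_wait_s))
```

Under these assumptions the 20 mounts and 60 clients from the original report already need more connections than there are privileged ports, so a burst shorter than 60s cannot complete without running out.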
Created attachment 527403
proposed patch to use reserved port only for secure maps

There is a similar request for RHEL-6 (bug #689424). A solution proposed there is based on the HP approach: they use reserved ports only for secure maps (see http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c02037757/c02037757.pdf). This patch, which tries to use a reserved port only when querying passwd maps, can be used as a proof of concept. We'll probably need to define which maps are secure on the client side, in the same way as it is done on the server side (where it is defined in /etc/ypserv.conf).
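The idea of using a reserved port only for secure maps can be sketched as a simple decision function. This is not the attached patch (which is against glibc and is not reproduced here), just an illustration of the policy. The SECURE_MAPS set and the port_policy name are hypothetical; a real client-side list of secure maps would still have to be defined somewhere, as noted above.

```python
# Hypothetical client-side list of secure maps. The proof-of-concept
# only covers passwd maps; a real list would need to be configurable.
SECURE_MAPS = frozenset({"passwd.byname", "passwd.byuid"})


def port_policy(map_name, is_root, secure_maps=SECURE_MAPS):
    """Decide which kind of local port a NIS lookup should bind to.

    Returns "reserved" only when the caller is root and the map is
    listed as secure; every other lookup gets an ordinary port.
    """
    if is_root and map_name in secure_maps:
        return "reserved"
    return "ephemeral"


if __name__ == "__main__":
    for name in ("passwd.byname", "group.byname", "netgroup"):
        print(name, port_policy(name, is_root=True))
```

With such a policy, the group and netgroup lookups from the original report would no longer consume privileged ports, while passwd lookups by root would behave as before.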
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
We have a similar problem open as support case 00546448. By running strace on ls, we have noticed that RHEL 5 looks in /var/yp/binding/ to find a NIS master and that RHEL 6 does not. We've looked at some library sources, and the routine that reads the file, yp_bind_file in nis/ypclnt.c, is conditionally compiled in. We can use "nm" to find that routine in /lib/libnsl-2.5.so on RHEL 5, but we can't find it in any file in /lib64 on RHEL 6. This change causes many extra accesses to ypbind. While this may not be the whole problem, we'd like to eliminate this issue; it is at least a clear difference between RHEL 5, which does not have the problem for us, and RHEL 6.
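The "nm" comparison can be scripted along these lines. This is a rough sketch that only parses text in the default "address type name" layout of nm; both sample listings are made up for illustration and are not real output from either release.

```python
def defined_symbols(nm_text):
    """Return the names nm lists with an address and a type other than 'U'."""
    names = set()
    for line in nm_text.splitlines():
        fields = line.split()
        # Defined symbols have three columns: address, type, name.
        # Undefined ones ('U') have no address and only two columns.
        if len(fields) == 3 and fields[1] != "U":
            names.add(fields[2])
    return names


# Made-up listings standing in for the two library builds.
SAMPLE_WITH = """\
0000000000004a10 t yp_bind_file
0000000000004b80 T yp_bind
                 U clnt_create
"""

SAMPLE_WITHOUT = """\
0000000000004b80 T yp_bind
                 U clnt_create
"""

if __name__ == "__main__":
    for label, text in (("with", SAMPLE_WITH), ("without", SAMPLE_WITHOUT)):
        print(label, "yp_bind_file" in defined_symbols(text))
```

In practice the text would be the saved output of "nm" run against the library on each release.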
*** This bug has been marked as a duplicate of bug 689424 ***