From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113

Description of problem:
Rather than submit two bug reports for portmap, this description covers two possibly inter-related portmap bugs.

Part 1)
On several occasions portmap has appeared to cause an extreme slowdown on a server running ypserv and nfsd. When this happens the yp clients lose contact; refer to part 2 for a detailed explanation of the client side.

Logins at the server console, as root or non-root, fail because they time out after several minutes. ssh connections as root succeed after several minutes. Running ps takes several minutes to respond, and processes appear to be running normally.

Stopping portmap brings the machine back to normal response times, and console logins are possible. Restarting portmap returns the machine to the slow response times.

If portmap is stopped, the server can be taken down to run-level 2 or lower and back up to level 3, which usually clears the problem; a reboot may be required if it does not. If the machine is shut down without first stopping portmap, it hangs when trying to shut down the "nfs services", and a power-off restart is then required.

Further investigation has not been possible, as I have been unable to replicate the fault conditions, and the server provides authentication to several labs, so restoration of service takes priority over testing.

Part 2)
We have had several network outages and ypserv failures recently where the yp clients (running in nsswitch compat mode) have lost connection to the server running ypserv. Once ypserv is available again, each client must be individually visited.

If the fedora client was not logged in prior to the loss of ypserv, it must be rebooted, as remote and local attempts to log in fail. Remote ssh connections fail with the message

Connection closed by xxx.xxx.xxx.xxx
lost connection

If a session is open, the client can be rescued by locally restarting portmap (if ypserv is available).
We still have several redhat 7.2 clients authenticating in the same manner to the same server. They do not require any action, as they do not exhibit portmap time-outs.

On the assumption that the problem did not exist in RH7.2, several packages were taken back to those versions. The time-outs were still present with
portmap-4.0-38, ypbind-mt-1.12, yp-tools-2.5

The packages were also replaced with newer versions, but the time-outs were still present with
portmap 4.0-59, ypbind-mt-1.17.2, yp-tools-2.8

If ypbind is restarted when ypserv is stopped, the following is seen.

/etc/init.d/ypbind restart
Shutting down NIS services: [ OK ]
Binding to the NIS domain: [ OK ]
Listening for an NIS domain server.....rpcinfo: can't contact portmapper: rpcinfo: RPC: Timed out

In this state rpcinfo calls also fail, but portmap is still running.

rpcinfo -p
rpcinfo: can't contact portmapper: rpcinfo: RPC: Timed out

/etc/init.d/portmap status
portmap (pid 18796) is running...

With ypserv still unavailable, portmap and ypbind were restarted.

/etc/init.d/portmap stop
Stopping portmapper: [ OK ]

/etc/init.d/portmap start
Starting portmapper: [ OK ]

rpcinfo -p
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper

/etc/init.d/ypbind restart
Shutting down NIS services: [ OK ]
Binding to the NIS domain: [ OK ]
Listening for an NIS domain server.rpcinfo: can't contact portmapper: rpcinfo: RPC: Timed out

In the above example, where portmap is restarted and then ypbind is restarted, the following is seen in /var/log/messages. portmap is running with the -v flag.

<snip>
09:55:55 ypbind: ypbind shutdown succeeded
09:56:55 ypbind: ypbind startup succeeded
09:57:27 portmap[19039]: connect from 127.0.0.1 to getport(ypbind)
09:59:27 portmap[19047]: connect from 127.0.0.1 to getport(ypbind)
09:59:55 ypbind[19031]: Unable to register (YPBINDPROG, YPBINDVERS, udp).
10:01:27 portmap[19059]: connect from 127.0.0.1 to getport(ypbind)
<end snip>

If just portmap is restarted, there are no RPC timeouts until ypbind attempts to contact ypserv.

The fedora clients are running the following:
glibc-2.3.2-101.4
portmap-4.0-57
yp-tools-2.8-2
kernel-2.4.22-1.2149.nptl
ypbind-1.12-3

This bug will give the same symptoms as bug 112770 if the ypbind client is rebooted when ypserv is not available.

The Reproduce info refers to the part 2 bug.

Version-Release number of selected component (if applicable):
portmap-4.0-57

How reproducible:
Always

Steps to Reproduce:
1. Configure a fedora client to use yp authentication with nsswitch in compat mode.
2. Stop ypserv on a separate server.
3. Wait for ypbind to poll ypserv, or force the bug by restarting ypbind.

Actual Results:
Client is unusable due to portmap being unavailable and exhibiting "RPC: Timed out" errors.

Expected Results:
portmap will not fail with "RPC: Timed out" errors, and ypbind will gracefully wait until ypserv reappears. It will then continue to allow authentication without intervention.

If ypbind is manually restarted with no ypserv, it will eventually fail gracefully with a time out {FAILED} message.

Additional info:
Looking at the code, it appears that the only time NIS (or DNS, for that matter) gets involved is when the caller is not on the local machine (meaning the source IP address is not one of the local machine's IP addresses). So it's not clear to me what can be done to verify non-local callers without involving NIS, since portmapper uses the gethostbyXXX() API (which is controlled by /etc/nsswitch.conf).

So I'm going to close this as NOTABUG...
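
The check described above can be pictured with a minimal Python sketch. This is not portmap's actual code (portmap is written in C); the function names `is_local_caller` and `resolve_caller` and the `local_ips` parameter are hypothetical and exist only for illustration. It assumes the logic is simply "skip the name lookup when the source address is one of this host's own addresses, otherwise go through the resolver", where the resolver follows the "hosts:" line in /etc/nsswitch.conf.

```python
import ipaddress
import socket


def is_local_caller(src_ip, local_ips):
    """Return True when src_ip is one of this host's own addresses.

    local_ips is a hypothetical, caller-supplied list of the machine's
    addresses (for example ["127.0.0.1", "192.0.2.10"]).
    """
    addr = ipaddress.ip_address(src_ip)
    return addr in {ipaddress.ip_address(ip) for ip in local_ips}


def resolve_caller(src_ip, local_ips, lookup=socket.gethostbyaddr):
    """Resolve a caller's host name only when the caller is remote.

    Local callers skip the lookup, so NIS/DNS is never consulted for them.
    Remote callers go through the resolver, which may block for a long
    time if the configured name service (for example NIS) is unreachable.
    """
    if is_local_caller(src_ip, local_ips):
        return None  # no name-service traffic for local callers
    try:
        hostname, _aliases, _addresses = lookup(src_ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError
        return None
```

The `lookup` parameter defaults to `socket.gethostbyaddr` and is injectable only so the sketch can be exercised without a live name service.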