Bug 119281 - When using NIS if ypserv is lost, portmap fails with "RPC: Timed out" messages.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: portmap
Version: 1
Hardware: i386
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Steve Dickson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-03-28 22:01 UTC by Barry Wright
Modified: 2007-11-30 22:10 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-08-12 18:36:42 UTC
Type: ---
Embargoed:



Description Barry Wright 2004-03-28 22:02:00 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113

Description of problem:
Rather than submit 2 bug reports for portmap this description covers
two possibly inter-related portmap bugs.

Part 1)
On several occasions portmap has appeared to cause an extreme
slowdown on a server running ypserv and nfsd. When this happens the
yp clients lose contact; see part 2 for a detailed explanation of the
client-side behaviour.
Logins to the server console, as root or non-root, time out and fail
after several minutes; ssh connections as root succeed after several
minutes. Running ps takes several minutes to respond, although
processes appear to be running normally. Stopping portmap brings the
machine back to normal response times and console logins become
possible. Restarting portmap returns the machine to the slow response
times.
If portmap is stopped, the server can be taken down to run-level 2 or
lower and back up to level 3, which usually clears the problem; a
reboot may be required if it does not. If the machine is shut down
without first stopping portmap, it hangs when trying to shut down the
"nfs services", and a power-off restart is then required.
Further investigation has not been possible, as I have been unable to
replicate the fault conditions, and the server provides authentication
to several labs, so restoration of service takes priority over testing.

Part 2)
We have had several network outages and ypserv failures recently where
the yp clients (running in nsswitch compat mode) have lost connection
to the server running ypserv. Once ypserv is available again, each
client must be visited individually. If the Fedora client was not
logged in prior to the loss of ypserv it must be rebooted, as remote
and local login attempts fail; remote ssh connections fail with the
message

  Connection closed by xxx.xxx.xxx.xxx
  lost connection

If a session is open, the client can be rescued by locally restarting
portmap (if ypserv is available). We still have several Red Hat 7.2
clients authenticating in the same manner to the same server; they do
not require any action, as they do not exhibit portmap time-outs.

On the assumption that the problem did not exist in RH7.2, several
packages were taken back to those versions. The time-outs were still
present with portmap-4.0-38, ypbind-mt-1.12 and yp-tools-2.5.

The packages were also replaced with newer versions, but the time-outs
were still present with portmap-4.0-59, ypbind-mt-1.17.2 and yp-tools-2.8.

If ypbind is restarted when ypserv is stopped the following is seen.

  /etc/init.d/ypbind restart
  Shutting down NIS services:                                [  OK  ]
  Binding to the NIS domain:                                 [  OK  ]
  Listening for an NIS domain server.....rpcinfo: can't contact
  portmapper: rpcinfo: RPC: Timed out

In this state rpcinfo calls also fail, but portmap is still running.

  rpcinfo -p
  rpcinfo: can't contact portmapper: rpcinfo: RPC: Timed out

  /etc/init.d/portmap status
  portmap (pid 18796) is running...
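
The state shown above (rpcinfo timing out while the init script still reports portmap as running) can be told apart from a healthy portmapper by looking at what rpcinfo prints. A minimal sketch in Python, assuming rpcinfo is on the PATH; the names classify_rpcinfo and probe_portmapper and the 30-second limit are illustrative choices and not part of any package named in this report:

```python
import subprocess


def classify_rpcinfo(output: str) -> str:
    """Classify the text printed by `rpcinfo -p`."""
    # The failure seen in this report always contains this phrase.
    if "RPC: Timed out" in output:
        return "timed-out"
    # A healthy reply is a table with a header row and a portmapper entry.
    if "program" in output and "portmapper" in output:
        return "ok"
    return "unknown"


def probe_portmapper(timeout_s: float = 30.0) -> str:
    """Run `rpcinfo -p` locally and classify the result (assumed helper)."""
    try:
        result = subprocess.run(
            ["rpcinfo", "-p"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # rpcinfo itself did not return within our own limit.
        return "timed-out"
    except FileNotFoundError:
        # rpcinfo is not installed on this machine.
        return "unknown"
    return classify_rpcinfo(result.stdout + result.stderr)
```

Only the process status says "running" in the failing case, so a check like this would have to be combined with the init script status to reproduce the distinction made above.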

With ypserv still unavailable, portmap and ypbind were restarted. 

  /etc/init.d/portmap stop
  Stopping portmapper:                                       [  OK  ]
  /etc/init.d/portmap start
  Starting portmapper:                                       [  OK  ]
  rpcinfo -p
     program vers proto   port
      100000    2   tcp    111  portmapper
      100000    2   udp    111  portmapper

  /etc/init.d/ypbind restart
  Shutting down NIS services:                                [  OK  ]
  Binding to the NIS domain:                                 [  OK  ]
  Listening for an NIS domain server.rpcinfo: can't contact
  portmapper: rpcinfo: RPC: Timed out

In the above example, where portmap is restarted and then ypbind is
restarted, the following is seen in /var/log/messages (portmap is
running with the -v flag):

  <snip>
  09:55:55 ypbind: ypbind shutdown succeeded
  09:56:55 ypbind: ypbind startup succeeded
  09:57:27 portmap[19039]: connect from 127.0.0.1 to getport(ypbind)
  09:59:27 portmap[19047]: connect from 127.0.0.1 to getport(ypbind)
  09:59:55 ypbind[19031]: Unable to register (YPBINDPROG, YPBINDVERS,
  udp).
  10:01:27 portmap[19059]: connect from 127.0.0.1 to getport(ypbind)
  <end snip>
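
The spacing of the getport(ypbind) requests in the excerpt above can be measured with a short script. This is a rough sketch in Python that assumes log lines start with an HH:MM:SS timestamp exactly as snipped here (a full syslog line also carries a date and host name, so the pattern would need adjusting); getport_intervals is a hypothetical helper name:

```python
import re
from datetime import datetime

# Matches lines such as:
#   09:57:27 portmap[19039]: connect from 127.0.0.1 to getport(ypbind)
GETPORT_RE = re.compile(
    r"^\s*(\d{2}:\d{2}:\d{2}) portmap\[\d+\]: connect from \S+ to getport\((\w+)\)"
)


def getport_intervals(log_text: str, service: str = "ypbind") -> list:
    """Return the gaps, in seconds, between successive getport requests."""
    times = []
    for line in log_text.splitlines():
        match = GETPORT_RE.match(line)
        if match and match.group(2) == service:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    # Note: timestamps carry no date, so a run crossing midnight is not handled.
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
```

For the three portmap lines in the excerpt this returns [120, 120], i.e. the lookups repeat every two minutes.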

If just portmap is restarted there are no RPC timeouts until ypbind
attempts to contact ypserv.

The Fedora clients are running the following:
glibc-2.3.2-101.4
portmap-4.0-57
yp-tools-2.8-2
kernel-2.4.22-1.2149.nptl
ypbind-1.12-3

This bug will give the same symptoms as bug 112770 if the ypbind
client is rebooted when ypserv is not available.

The Reproduce info refers to the part 2 bug.

Version-Release number of selected component (if applicable):
portmap-4.0-57

How reproducible:
Always

Steps to Reproduce:
1. Configure a Fedora client to use yp authentication with nsswitch in
compat mode.
2. Stop ypserv on the separate server.
3. Wait for ypbind to poll ypserv, or force the bug by restarting ypbind.
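
Step 1 depends on the client really being in compat mode. A small sketch in Python for checking which databases in an nsswitch.conf list the compat source; compat_databases is a hypothetical helper, and the file contents used with it would be whatever the client actually has in /etc/nsswitch.conf:

```python
def compat_databases(nsswitch_text: str) -> list:
    """Return the nsswitch databases whose source list includes 'compat'."""
    found = []
    for raw in nsswitch_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        database, sources = line.split(":", 1)
        if "compat" in sources.split():
            found.append(database.strip())
    return found
```

A possible use is `compat_databases(open("/etc/nsswitch.conf").read())`, which on a client set up as described would be expected to include at least passwd.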
    

Actual Results:  Client is unusable due to portmap being unavailable
and exhibiting "RPC: Timed out" errors.

Expected Results:  portmap will not fail with "RPC: Timed out" errors
and ypbind will gracefully wait until ypserv reappears; it will then
continue to allow authentication without intervention.

If ypbind is manually restarted with no ypserv it will eventually
gracefully fail with a time out {FAILED} message.

Additional info:

Comment 1 Steve Dickson 2004-08-12 18:36:42 UTC
Looking at the code, it appears that the only time
NIS (or DNS for that matter) will get involved is
when the caller is not from the local machine (meaning
the source IP address is not one of the local
machine's IP addresses).

So it's not clear to me what can be done to verify
non-local callers and not involve NIS, since portmapper
uses the gethostbyXXX() API (which is controlled by
/etc/nsswitch.conf).
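
The decision described in this comment can be sketched as follows. This is not portmap's source code; it is a Python stand-in, and the names is_local_caller and verify_caller are made up for illustration. Python's socket.gethostbyaddr goes through the C library resolver, so on a glibc system it is expected to follow the hosts line of /etc/nsswitch.conf in the same way the gethostbyXXX() calls do:

```python
import socket


def is_local_caller(src_ip: str, local_ips: set) -> bool:
    """True when the request's source address belongs to this machine."""
    return src_ip in local_ips


def verify_caller(src_ip: str, local_ips: set, resolver=socket.gethostbyaddr) -> str:
    """Illustrative only: local callers skip the name-service lookup."""
    if is_local_caller(src_ip, local_ips):
        return "local: no name-service lookup"
    try:
        # For non-local callers the lookup may reach NIS or DNS,
        # depending on the hosts entry in /etc/nsswitch.conf.
        hostname, _aliases, _addresses = resolver(src_ip)
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        return "non-local: lookup failed"
    return f"non-local: resolved to {hostname}"
```

The resolver argument exists only so the sketch can be exercised without a network; passing a stub in place of socket.gethostbyaddr shows the two paths without touching NIS.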

So I'm going to close this as NOTABUG...

