Bug 164192

Summary: During boot time, portmapper crashes when remote clients contact the portmapper at a rate faster than 4 times per second.
Product: Red Hat Enterprise Linux 3
Component: portmap
Version: 3.0
Hardware: i386
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Reporter: Charles Whalen <whalen>
Assignee: Steve Dickson <steved>
QA Contact: Jay Turner <jturner>
CC: srevivo
Doc Type: Bug Fix
Last Closed: 2007-10-19 18:57:22 UTC

Description Charles Whalen 2005-07-25 21:00:27 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.3) Gecko/20050104 Red Hat/1.4.3-3.0.7

Description of problem:
During boot, the portmapper crashes when remote clients contact it at a rate faster than 4 times per second.  When many rpcinfo -p host requests are issued, the portmapper appears to answer the first few requests, then hangs.

In the netstat output, I see many half-open connections. (See below.)  I can
reproduce this by running a script many times in parallel. (See below.)

The problem was noticed when the ypbind service was trying to start.

I worked around this problem by using iptables to block all remote client connections to port 111.
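The exact rules used for the workaround are not given. A minimal sketch of what such rules might look like follows, assuming the default INPUT chain, that both TCP and UDP on port 111 should be blocked, and that loopback traffic must stay open so local services such as ypbind can still register. The function name portmap_block_rules is hypothetical; it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print (do not apply) iptables commands that keep port 111 reachable over
# loopback and drop it for everything else. Assumptions: default INPUT chain,
# both TCP and UDP blocked. Review the output, then run it as root.
portmap_block_rules() {
    n=1
    # Rules are inserted at positions 1..4 so they precede existing rules.
    for proto in tcp udp; do
        echo "iptables -I INPUT $n -i lo -p $proto --dport 111 -j ACCEPT"
        n=$((n + 1))
    done
    for proto in tcp udp; do
        echo "iptables -I INPUT $n -p $proto --dport 111 -j DROP"
        n=$((n + 1))
    done
}

portmap_block_rules
```

Blocking port 111 for all remote hosts also stops legitimate remote RPC lookups against this machine, so rules like these only suit hosts that do not need to serve RPC to remote clients, or hosts where the rules are removed once boot has completed.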

Thank You



Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run many copies of the script in the Additional info field
2. Reboot the computer
3. Watch the system hang at ypbind service startup.  It cannot contact the portmapper.
  

Additional info:


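#!/bin/csh
# Fire 1000 background portmapper queries, one every 0.1 s.
# Replace HOST with the machine being rebooted.

set x = 0
while ( $x < 1000 )
    rpcinfo -p HOST &
    @ x++
    usleep 100000    # 100000 microseconds = 0.1 s
end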

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State      
tcp        0      0 0.0.0.0:32768               0.0.0.0:*                   LISTEN      
tcp        0      0 0.0.0.0:32769               0.0.0.0:*                   LISTEN      
tcp        0      0 0.0.0.0:513                 0.0.0.0:*                   LISTEN      
tcp        0      0 0.0.0.0:514                 0.0.0.0:*                   LISTEN      
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      
tcp        0      0 127.0.0.1:111               127.0.0.1:32775             SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:905               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:901               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:903               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:902               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:32776             SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:934               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:938               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:951               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:914               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:909               SYN_RECV    
tcp        0      0 127.0.0.1:111               127.0.0.1:32777             SYN_RECV
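To track how the number of half-open connections changes over time, the SYN_RECV entries can be counted rather than read by eye. A minimal sketch, assuming netstat -tan output with the same column layout as the listing above; the function name count_syn_recv is hypothetical.

```shell
#!/bin/sh
# Count half-open (SYN_RECV) TCP connections to a given local port.
# Reads netstat -tan style output on stdin; $1 is the local port number.
count_syn_recv() {
    awk -v port="$1" '
        # Field 4 is the local address, field 6 the connection state.
        $1 == "tcp" && $6 == "SYN_RECV" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }'
}

# Usage on the affected server (not executed here):
#   netstat -tan | count_syn_recv 111
```

Fed the listing above, this would print 12, one for each SYN_RECV line on local port 111.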

Comment 1 Charles Whalen 2005-07-25 21:06:39 UTC
The script is run on a remote computer and tries to contact the machine while
it is rebooting.

Comment 2 Suzanne Hillman 2005-07-26 14:29:57 UTC
*** Bug 164193 has been marked as a duplicate of this bug. ***

Comment 3 Suzanne Hillman 2005-07-26 14:30:06 UTC
*** Bug 164194 has been marked as a duplicate of this bug. ***

Comment 4 Suzanne Hillman 2005-07-26 14:30:34 UTC
*** Bug 164195 has been marked as a duplicate of this bug. ***

Comment 5 Steve Dickson 2005-07-27 12:31:25 UTC
Question: are you running the script as root or as a normal user? If it's root,
try running the script as a normal user, since you may be running out
of privileged ports when it is run as root...
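One way to check the privileged-port hypothesis on the client is to count TCP connections to the server's port 111 whose local source port is below 1024. A minimal sketch, assuming netstat -tan output on the client with the usual column layout; the function name count_reserved_src is hypothetical.

```shell
#!/bin/sh
# Count TCP connections whose remote port is $1 and whose local (source)
# port is a privileged one (below 1024). Reads netstat -tan style output
# on stdin. Connections in any state are counted, including TIME_WAIT.
count_reserved_src() {
    awk -v port="$1" '
        $1 == "tcp" && $5 ~ (":" port "$") {
            # The local port is the last colon-separated part of field 4.
            k = split($4, parts, ":")
            if (parts[k] + 0 < 1024) c++
        }
        END { print c + 0 }'
}

# Usage on the client while the script is running (not executed here):
#   netstat -tan | count_reserved_src 111
```

A count that stays at zero while the script runs would suggest the client is not drawing on privileged ports at all; a count that climbs toward several hundred would be consistent with exhaustion.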

Comment 6 Charles Whalen 2005-07-27 14:15:59 UTC
Hello,
I was not running the script as root.  I was running it as an NIS user.  The
user did not have a local account.  I started several dozen copies of the
script, perhaps more than 100, on a remote client, rebooted the computer that
they were trying to communicate with, and then watched that machine hang when
the ypbind service tried to start. The portmapper appears to answer RPC
requests until the ypbind service tries to start.  Then the portmapper appears
to be hung.  That is when we noticed a lot of half-open TCP connections.

The original scenario was a server machine running processes as an NIS user
and receiving information from many (>500) remote processes. The server
crashed, and the remote processes were still trying to communicate with it as
it was rebooting.  The server hung at the step where ypbind was trying to
start.  The portmapper would not respond.  While sniffing the network, we
estimated that it took more than 4 RPC requests per second to hang the
system.

Hope this helps.  Thank you for your assistance.

Comment 7 Steve Dickson 2005-07-27 14:49:28 UTC
Hmm... Adding NIS into the picture makes things a bit more interesting... ;-)
But I'm still thinking this might be a privileged-port exhaustion issue.

A couple of things... First, were there any error messages in either the
server's or the client's /var/log/messages file? Second, would it be possible
to get a system trace (i.e. echo t > /proc/sysrq-trigger; the output shows up
in /var/log/messages) from both the client and server machines?

Comment 8 RHEL Program Management 2007-10-19 18:57:22 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.