Bug 97710

Summary: local users (ie root) can't log in when nis is set to broadcast and default firewall configured
Product: Red Hat Enterprise Linux 2.1
Reporter: erikj
Component: ypbind
Assignee: Steve Dickson <steved>
Status: CLOSED NOTABUG
QA Contact: Ben Levenson <benl>
Severity: medium
Priority: medium
Version: 2.1
CC: gbeshers, jh, martinez
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Last Closed: 2007-02-15 18:35:37 UTC

Description erikj 2003-06-19 16:19:41 UTC

Description of problem:
If you configure a system with the default installer firewall rules and set it
up as an NIS client with broadcast binding, local users, including root, cannot
log in.

This can be done from the installer by selecting NIS with broadcast and the
default firewall level.  The user could also configure the system into this
state by hand.

Assume a default nsswitch.conf file where the order for passwd and group is
"files nis" or "files nisplus nis".

The login attempt tries to contact the NIS server.  The requests are blocked by
the firewall rules, and login eventually times out (after 60 seconds) and
returns you to the login prompt.  You are left with a system that not even root
can log in to unless you boot into single-user mode.

I have done some research on this issue including running strace to figure out
what is going on.  I've discovered some interesting things.

 - If you change nsswitch.conf for passwd and group from something like
   "files nis" to "compat", you will not have a problem.

 - If you are configured as above and add a "+" to the end of the passwd file,
   you are still OK.

 - If you are configured as above and add a "+" to the bottom of /etc/group,
   logins block.
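
For reference, the two nsswitch.conf variants compared above might look like
the fragment below. This is only a sketch: just the passwd and group lines are
shown, the rest of the file is assumed unchanged, and the comments are
illustrative rather than copied from the affected system.

```
# Variant that blocks local logins when the firewall drops NIS traffic
passwd:     files nis
group:      files nis

# Variant that did not show the problem
passwd:     compat
group:      compat
```

As far as glibc's compat mode is understood here, NIS entries are only pulled
in where the file itself contains a "+" line, which would be consistent with a
"+" in /etc/group bringing the blocking back.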

I was able to get some traces.  However, running strace directly on /bin/login
made /bin/login fail.  Instead, I would start another shell as root in one
window.  From another window, I would run strace, telling it to attach to the
bash process and use "-f" to follow child processes.  For example:

strace -p 5737 -f -o /tmp/login.strace.notworking

Then I would run /bin/login like this to emulate what a login attempt from the
console or telnet might do.  That is, from the bash shell that has PID 5737, I
would run:
/bin/login -- localuser

Where localuser is a local account in /etc/passwd and not in NIS.  

The strace output confirmed that, with group and passwd set to "files nisplus
nis" or "files nis" in /etc/nsswitch.conf, the system would start trying to
consult NIS for some reason even though the account is a local account.
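
One way to narrow a large trace down to the network activity is to filter it
for socket-related system calls. The helper below is a hypothetical sketch, not
part of the original investigation: the name net_calls and the list of system
calls are assumptions, and the real trace may show the NIS lookups through
other calls as well.

```shell
# Hypothetical helper: print the lines of an strace output file that look like
# network-related system calls (socket, connect, sendto, recvfrom, poll).
# Usage: net_calls FILE
net_calls() {
    # Match the call name at the start of the line or after the PID column
    # that "strace -f -o" writes in front of each call.
    grep -E '(^|[[:space:]])(socket|connect|sendto|recvfrom|poll)\(' "$1"
}
```

It would be run against the trace file written above, for example:
net_calls /tmp/login.strace.notworking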

If you were to configure NIS to use a specific NIS server instead of trying to
broadcast and keep the default firewall rules, ypbind would fail with an error
and you would not be in this "root can't log in" situation.

In my opinion (and I haven't dug into much documentation), if nsswitch.conf is
configured to check local files and then NIS, local users should always be able
to log in even if NIS is having trouble.  Hence this bug.
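
To confirm which lookup order a system is actually using, the passwd and group
lines can be read straight out of nsswitch.conf. The function below is a
minimal sketch under stated assumptions: the name nss_sources is made up for
illustration, and it expects the usual "database: source source" layout with
whitespace after the colon.

```shell
# Hypothetical helper: print the sources configured for one nsswitch database.
# Usage: nss_sources DATABASE [FILE]   (FILE defaults to /etc/nsswitch.conf)
nss_sources() {
    db=$1
    file=${2:-/etc/nsswitch.conf}
    # Strip comments, select the line whose first field is "DATABASE:",
    # then print the remaining fields separated by single spaces.
    sed -e 's/#.*//' "$file" |
        awk -v db="$db:" '$1 == db { $1 = ""; sub(/^ +/, ""); print }'
}
```

For the setup described in this report, "nss_sources passwd" and
"nss_sources group" would be the two lookups to check.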

Version-Release number of selected component (if applicable):
ypbind 1.8-1, pam-0.75-29, glibc-2.2.4-29.2

How reproducible:
Always

Steps to Reproduce:
All of the reproduction steps are laid out in the description.  Set up a local
NIS server that responds to broadcasts, put the AS 2.1 system on that local
network, and configure it with the default firewall and NIS with broadcast
binding; you will hit the problem each time.
    

Actual Results:  You cannot log in as a local user.
Also, ypwhich will take a while to respond but will finally show this:

[root@rappel sysconfig]# ypwhich
 do_ypcall: clnt_call: RPC: Timed out
 clinker.americas.sgi.com

A login attempt might look like this (I show it with telnet here, same with
console login, etc):

[root@rappel sysconfig]# telnet localhost
 Trying 127.0.0.1...
 Connected to localhost.localdomain (127.0.0.1).
 Escape character is '^]'.
 SGI ProPack v2.2 for Linux
 Kernel 2.4.20-sgi220r3 on an ia64
 login: root
 Password: 
 Login timed out after 60 seconds
 Connection closed by foreign host.

Expected Results:  Local users should be able to log in even if NIS is in a bad
state, so root from the console or a local test user via telnet should be able
to log in, since nsswitch is told to look at files first and then NIS.  However,
these login attempts block and login times out.

Additional info:

Comment 1 John Hesterberg 2007-02-15 16:36:36 UTC
Erik, is this still an issue on RHEL5?

Comment 2 erikj 2007-02-15 17:21:14 UTC
I tried a couple quick tests and I don't believe this RHEL 2.1 (from 2003 :) 
issue is still a problem in RHEL5.

I didn't check RHEL4U4 or anything.

Since it hasn't been looked at in 4 years and doesn't seem to be an issue in 
RHEL5, maybe we can just close it then? 

Comment 3 Marizol Martinez 2007-02-15 18:35:37 UTC
Thanks, Erik, agree.