Description of problem:
During a recent network failure, local logins were not possible. The LDAP server was unreachable, which caused the message "Login timed out after 60 seconds" to appear when attempting to log in on the console as root. nsswitch.conf is configured correctly, as are system-auth and ldap.conf.

During testing, I found that turning off the LDAP server results in local logins working as expected, because the connection is refused instead of being dropped into a black hole. Commenting out the LDAP-related entries in system-auth did NOT fix this problem. During the outage, I had to reboot into single-user mode, remove ldap from nsswitch.conf, and then run init 3 in order to log in as root. That is not expected behavior for nsswitch.conf, AFAIK.

Version-Release number of selected component (if applicable):
openldap-2.2.13-8
glibc-2.3.4-2.39

How reproducible:
Configure an LDAP client, then simulate the network unreachable failure using iptables.

Steps to Reproduce:
1. Configure a RHES 4 client to use an LDAP server. We use fedora-ds.
2. Verify that LDAP logins and local logins via the console are working as expected.
3. Use iptables to "-j DROP" all outgoing packets to the LDAP server.
4. Attempt to log in as root on the console.

Actual results:
"Login timed out after 60 seconds"

Expected results:
root login should succeed.
Additional info:

/etc/ldap.conf (hard linked to /etc/openldap/ldap.conf):
    uri ldap://our.internal.ldap.server.com
    base dc=our,dc=ldap,dc=server,dc=com
    TLS_CACERT /etc/openldap/cacerts/cacert.asc
    TLS_REQCERT allow
    bind_policy soft
    ssl start_tls
    pam_password md5
    nss_reconnect_tries 2
    nss_reconnect_sleeptime 1
    nss_reconnect_maxsleeptime 10
    nss_reconnect_maxconntries 1

/etc/pam.d/system-auth:
    auth required /lib/security/$ISA/pam_env.so
    auth sufficient /lib/security/$ISA/pam_localuser
    auth sufficient /lib/security/$ISA/pam_unix.so likeauth nullok
    auth sufficient /lib/security/$ISA/pam_ldap.so
    auth required /lib/security/$ISA/pam_deny.so
    account sufficient /lib/security/$ISA/pam_unix.so
    account sufficient /lib/security/$ISA/pam_local_user.so
    account sufficient /lib/security/$ISA/pam_succeed_if.so uid < 100 quiet
    account sufficient /lib/security/$ISA/pam_ldap.so
    account required /lib/security/$ISA/pam_permit.so
    password requisite /lib/security/$ISA/pam_cracklib.so retry=3
    password sufficient /lib/security/$ISA/pam_unix.so nullok use_authtok md5 shadow
    password sufficient /lib/security/$ISA/pam_ldap.so
    password required /lib/security/$ISA/pam_deny.so
    session required /lib/security/$ISA/pam_limits.so
    session required /lib/security/$ISA/pam_unix.so

PLEASE NOTE: As stated above, commenting out all the pam_ldap-related entries in system-auth does NOT fix the problem, which is counter-intuitive. Changing the order of pam_ldap and pam_unix does not fix the problem either. Adding pam_localuser was the last thing I tried, and that also did not fix the problem.

/etc/nsswitch.conf:
    passwd: files ldap
    shadow: files ldap
    group: files ldap
    automount: files ldap
    hosts: files dns
    bootparams: files
    ethers: files
    netmasks: files
    networks: files
    protocols: files
    rpc: files
    services: files
    netgroup: files
    publickey: files
    aliases: files

I've also tried using these settings in nsswitch.conf, which did not work.
(And they should be the default behavior, from the docs I've read)
    passwd: files [success=return notfound=continue unavail=continue tryagain=continue] ldap
    shadow: files [success=return notfound=continue unavail=continue tryagain=continue] ldap

The only way I've been able to log in as root on the console during a simulated network failure has been to remove ldap from the nsswitch.conf settings.
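The report distinguishes between a refused connection (LDAP server turned off) and packets dropped into a black hole (iptables DROP). A minimal Python sketch of a probe that tells the two apart is below. The function name `probe` is hypothetical, and port 389 in the usage note is an assumption based on the `ldap://` URI with `ssl start_tls` in the reported ldap.conf.

```python
import socket

def probe(host, port, timeout=5.0):
    """Classify one TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"        # something is listening
    except ConnectionRefusedError:
        return "refused"         # host answered with a reset: fails fast
    except socket.timeout:
        return "timeout"         # no answer at all: the black-hole case
    except OSError:
        return "error"           # e.g. name resolution or routing failure
```

Calling `probe("our.internal.ldap.server.com", 389)` while the DROP rule is active should return "timeout" only after the full timeout has elapsed, whereas a stopped server would normally return "refused" almost immediately. That difference is consistent with logins working in one case and timing out in the other.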
I forgot to mention that I noticed some other odd behavior during my simulated network failure testing. I use iptables to block only the outgoing connections to the LDAP server, so I decided to see what would happen if I attempted to log in as root from another location on the network. I enabled root logins in sshd_config, restarted sshd, and tried logging in.

After typing the ssh command, it pauses for about 60 seconds, then prompts for the password. After entering root's password, I am immediately greeted with the "Last login" information, after which it pauses for another 60 seconds. Then I am finally presented with the root shell.

/etc/pam.d/sshd is configured as default:
    auth required pam_stack.so service=system-auth
    auth required pam_nologin.so
    account required pam_stack.so service=system-auth
    password required pam_stack.so service=system-auth
    session required pam_stack.so service=system-auth
    session required pam_loginuid.so
This does not seem to be an openldap problem; glibc should not try to contact the LDAP server if it can find all of root's account information locally. As a workaround, you can tweak the LDAP timeouts in /etc/ldap.conf (the timelimit and bind_timelimit options).
Except that glibc doesn't try to contact the LDAP server at all; it is the nss_ldap plugin that does that.
During login and similar operations, initgroups or getgrouplist is called. These functions really do have to look through all groups to see which groups the user (in your case root) belongs to.
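A minimal sketch, in Python, of how one might time that group lookup from a shell on the affected machine. `os.getgrouplist` wraps getgrouplist(3); the helper name `time_group_lookup` is hypothetical.

```python
import os
import pwd
import time

def time_group_lookup(username):
    """Return (group_ids, elapsed_seconds) for one getgrouplist call."""
    # The passwd lookup can be answered by the first NSS source that
    # knows the user.
    primary_gid = pwd.getpwnam(username).pw_gid
    start = time.monotonic()
    # The supplementary-group lookup goes through the sources on the
    # "group:" line of nsswitch.conf.
    groups = os.getgrouplist(username, primary_gid)
    return groups, time.monotonic() - start
```

With "group: files ldap" and the LDAP server black-holed, one would expect `time_group_lookup("root")` to take roughly as long as the pauses described in this report, and to return quickly once ldap is removed from the group line. That expectation is an inference from the comments here, not a measured result.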
Wait, so not being able to log in on the console as root during a network outage is not a bug? How can that be considered expected behavior? If nsswitch.conf is configured to go to files and then ldap, why is it attempting to look at ldap for groups? The default behavior is supposed to be success=return. Are you suggesting that it is expecting to find that local users are also part of LDAP groups?

Regardless, this issue should remain open and be considered a bug. Default, expected behavior should NOT lock you completely out of the system during an LDAP or network failure. That's akin to programming my car to refuse to unlock when it's hailing outside: it won't happen very often, but it will happen. Also take into consideration the serious ramifications if a malicious person were to deliberately target someone's LDAP servers, knowing the default behavior will lock them out of ALL of their LDAP-connected servers.
Adding the following to ldap.conf did indeed fix this problem:
    bind_timelimit 15
    timelimit 15

It actually took about 30 seconds to time out because of the nss_reconnect values I stated earlier. Perhaps the default login timeout should be increased, or the default values for bind_timelimit, timelimit and nss_reconnect should be changed to prevent a console login from timing out before the LDAP query does?

Granted, using nscd also fixes this problem (short term), but I shouldn't have to rely on it. I stand by my assertion that it's ludicrous to have a default design that locks root completely out of the system because of a little network issue. A determined hacker could use this little bug to their advantage. They could have their way with your most critical server while you were busy troubleshooting LDAP issues. Or a malicious user could DDoS your LDAP servers, locking out everyone on every LDAP-connected server in your network.
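As a rough sanity check on those numbers, the Python sketch below parses ldap.conf-style text and estimates how long a lookup might block before giving up, for comparison with the 60-second login timeout. The names `parse_ldap_conf` and `rough_worst_case` are hypothetical, and the model (each attempt blocks for up to bind_timelimit seconds, with `attempts` passed in by hand) is only an assumption chosen to match the roughly 30 seconds observed above, not nss_ldap's documented behavior.

```python
LOGIN_TIMEOUT = 60  # seconds, from "Login timed out after 60 seconds"

def parse_ldap_conf(text):
    """Return {lowercased key: value}, keeping the first value per key."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        settings.setdefault(parts[0].lower(), value)
    return settings

def rough_worst_case(settings, attempts=2):
    """Estimated seconds spent blocking, or None if no bind_timelimit is set.

    ASSUMPTION: each of `attempts` tries blocks for up to bind_timelimit
    seconds; sleeps between retries are ignored.
    """
    limit = settings.get("bind_timelimit")
    if limit is None:
        return None
    return int(limit) * attempts
```

For the two lines added above, `rough_worst_case(parse_ldap_conf("bind_timelimit 15\ntimelimit 15"))` gives 30, which is below `LOGIN_TIMEOUT`. A result of `None` means no limit was found in the text at all.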
Another hint: try adding the following to your /etc/ldap.conf:
    nss_initgroups_ignoreusers root

As a result, nss_ldap will not ask the LDAP server for the list of root's groups.
That would be the valid solution for this. One concern, however, is Bug 429101, where a lock is not cleared if the option is used, causing dbus to lock up during the boot sequence. This should be fixed in 4.7's nss_ldap package, however.

Jose
I thought the nss_reconnect options were only implemented in nss_ldap v2.41 and newer. Were they backported to RHEL 4?
http://www.liquidx.net/blog/2006/04/03/nss_ldap-undocumented-nss_reconnect_tries/
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to reconsider your feature request for an active release, please re-open the request via the appropriate support channels and provide additional supporting details about the importance of this issue.