Description of problem:
With sshd 4.3p2-24.el5, if nscd is off and user information comes from LDAP, sshd fails to find account information. This may be a glibc nss bug that sshd is tickling, not sure. In any case downgrading openssh fixes it, as does starting nscd. I'm running into this on both i386 and x86_64 if you're curious. A plain ldapsearch, getent passwd, etc. works.

It's also been separately reported here:
http://bugs.centos.org/view.php?id=2532

Version-Release number of selected component (if applicable):
openssh-server-4.3p2-24.el5

How reproducible:
Every time.

Steps to Reproduce:
1. /etc/init.d/nscd stop
2. ssh host.tld # from a remote host

Actual results:
Connection to myhost.domain.tld closed by remote host.

In /var/log/secure:
Dec 17 15:31:35 myhost sshd[19537]: Postponed publickey for joshuadf from w.x.y.z port 34973 ssh2
Dec 17 15:31:36 myhost sshd[19536]: pam_ldap: could not open secret file /etc/ldap.secret (No such file or directory)
Dec 17 15:31:37 myhost sshd[19536]: Accepted publickey for joshuadf from w.x.y.z port 34973 ssh2
Dec 17 15:31:37 myhost sshd[19536]: pam_unix(sshd:session): session opened for user joshuadf by (uid=0)
nss_ldap: could not search LDAP server - Server is unavailable
Dec 17 15:31:05 myhost sshd[19507]: fatal: login_get_lastlog: Cannot find account for uid 1234

Expected results:
successful ssh login

Additional info:
I'm sorry, I cannot reproduce the problem here. I'm running this openssh version with authentication against an LDAP server through pam_ldap, and everything works fine even without nscd. Can you rebuild the openssh src.rpm with %define nss 0 and see whether that helps?
Sure, I'll try that. By the way, do you have "bind_policy soft" in /etc/ldap.conf?
I've tried both with bind_policy soft and hard and still cannot reproduce it.
OK, it works with '%define nss 0' in the spec file. I rebuilt from this srpm: ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/openssh-4.3p2-24.el5.src.rpm When I switch back to the official el5 package, the problem comes back.
Created attachment 289929: My /etc/ldap.conf
So the NSS support is the culprit. Although sshd doesn't call any function from the NSS library, it still has to link to it because some .c files which call NSS are shared with the ssh client. It seems that merely linking to NSS causes some conflict with nss_ldap or the openldap library itself. I'm CCing the maintainers of NSS, nss_ldap and openldap in case they have ideas on how to debug or fix this.
From spamgl 2007-12-19 20:15 at CentOS bugzilla: "I have seen the same problem when trying to authenticate against our main openldap server, v2.2.13-6.4e (Centos 4.4). However when I modify /etc/ldap.conf to point at our backup ldap server, openldap v2.2.13-4 (Centos 4.3), users can authenticate. Turning on nscd, on the affected client, allows us to authenticate against our main openldap (v2.2.13-6.4e) server."
Could you strace both the old and the new openssh servers while trying to authenticate against the LDAP server? Please make sure the attached logs do not contain any security-relevant information (such as the server's private keys, passwords or password hashes, or the LDAP secret), as this is a public bugzilla.
I'm sorry, I had other priorities today and I'm leaving on holiday tomorrow, so I will probably not be able to get to this for a couple of weeks. The CentOS bugzilla has this very interesting note from spamgl: "For another work-around, we've found that turning off SSL in ldap.conf on the client and contacting the LDAP server unencrypted also works on our main LDAP server (v2.2.13-6.4e)." You are of course welcome to make this high priority, but enabling nscd is a pretty easy workaround. In fact I normally have nscd enabled and had only disabled it temporarily to debug something else.
I've finally reproduced the problem here. It is caused by the TCP connection to the LDAP server being shared between the parent and child processes, combined with NSS linking to the pthread library. The question is how to fix it.
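
For readers unfamiliar with why a connection shared across fork() is fragile, here is a minimal sketch in Python. It is not the sshd or nss_ldap code and does not reproduce this bug; it only illustrates the general mechanism. The function name and the request strings are made up for illustration, and a socketpair stands in for the TCP connection to the LDAP server.

```python
import os
import socket


def shared_connection_demo():
    """Return the bytes the 'server' end receives when a parent and its
    forked child both write on a connection opened before fork()."""
    # Hypothetical stand-in for the TCP connection to the LDAP server.
    client, server = socket.socketpair()

    pid = os.fork()
    if pid == 0:
        # Child: it inherits the very same connection, not a fresh one.
        server.close()
        client.sendall(b"child-request;")
        client.close()
        os._exit(0)

    # Parent: wait for the child first so the ordering is deterministic.
    os.waitpid(pid, 0)
    client.sendall(b"parent-request;")
    client.close()

    # The server end sees one stream carrying traffic from both processes.
    data = b""
    while True:
        chunk = server.recv(1024)
        if not chunk:
            break
        data += chunk
    server.close()
    return data


if __name__ == "__main__":
    print(shared_connection_demo())
```

Both processes end up writing into a single stream, while each keeps its own private copy of whatever client-side state the library held at fork time. For a protocol that tracks per-connection state, the two copies can drift out of step with what the server has actually seen. That would be consistent with the report above that disabling SSL works around the problem, though that link is an inference and not something confirmed in this thread.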
This is another incarnation of bug 154314; it is fixed by the patch to nss_ldap.