Bug 448883
| Field | Value |
|---|---|
| Summary | getent -s 'ldap' passwd -- Segmentation fault |
| Product | Red Hat Enterprise Linux 5 |
| Component | nss_ldap |
| Version | 5.2 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | low |
| Reporter | Michal Nowak <mnowak> |
| Assignee | Nalin Dahyabhai <nalin> |
| QA Contact | Brian Brock <bbrock> |
| CC | albert.fluegel, bojan, dpal, herrold, jbastian, jnansi, ohudlick, omoris, rnunn, tao |
| Target Milestone | rc |
| Hardware | All |
| OS | Linux |
| Doc Type | Bug Fix |
| Last Closed | 2010-03-30 08:34:46 UTC |
| Bug Blocks | 448884 |

Description
Michal Nowak
2008-05-29 09:09:41 UTC
> Can you run LD_PRELOAD=libSegFault.so getent -s 'ldap' passwd ?
> If the segfault is in nss_ldap (quite likely), then it most likely isn't a glibc bug, but nss_ldap.
Well, running it this way does nothing on RHEL4 and RHEL5 (just ...segfault...). But on Fedora 9, where the command segfaults too, it points me to nss_ldap as you suggested. Thanks for the help, changing component.

nss_ldap-253-12.el5
Starting program: /usr/bin/getent -s ldap passwd
Program received signal SIGSEGV, Segmentation fault.
do_init () at ldap-nss.c:1085
1085 int sd=-1;
(gdb)

How is your server's address being resolved (both forward and reverse)? It needs to be resolved in some way other than ldap in order for this to work at all. Put another way, does "getent -s 'dns ldap' passwd" work?

dhcp-lab-198 newman # getent -s 'dns ldap' passwd
dhcp-lab-198 newman # echo $?
0
After a minute or so it finished, I guess some DNS-related timeout passed. At
least no segfault happens.
> How is your server's address being resolved (both forward and reverse)?
No idea. Hint welcome.
(In reply to comment #6)
> > How is your server's address being resolved (both forward and reverse)?
> No idea. Hint welcome.
In /etc/ldap.conf, what's the 'host' or 'uri' setting (whichever one you're using)? Are the name and address of your LDAP server listed in /etc/hosts? For each source listed on the "hosts" line in /etc/nsswitch.conf, does running 'getent -s $source $server' (where source is the value from nsswitch.conf, and server is the server's name as listed in /etc/ldap.conf) produce a result? If the server's name can't be resolved without LDAP, then nss_ldap can't determine how to contact the server, which appears to be the problem you're encountering with your test command.

(In reply to comment #7)
> In /etc/ldap.conf, what's the 'host' or 'uri' setting (whichever one you're
> using)?
host 127.0.0.1
> Are the name and address of your LDAP server listed in /etc/hosts?
sure :)
> For each source listed on the "hosts" line in /etc/nsswitch.conf, does running
> 'getent -s $source $server' (where source is the value from nsswitch.conf, and
> server is the server's name as listed in /etc/ldap.conf) produce a result? If
> the server's name can't be resolved without LDAP, then nss_ldap can't determine
> how to contact the server, which appears to be the problem you're encountering
> with your test command.
I am confused here. The `getent` syntax is:
getent [OPTION...] database [key ...]
so there is no such "server" argument as you mentioned. I ran these commands:
dhcp-lab-198 newman # getent -s dns passwd
< no result >
and
dhcp-lab-198 newman # getent -s files passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
< ... >
Just to mention, LDAP is not even installed (I know it's stupid to call on LDAP to get these credentials). Feel free to ask for more info. Attaching the core, in the hope that it will be useful.

Created attachment 308433
Corefile
I mean 'getent -s files 127.0.0.1', 'getent -s dns 127.0.0.1' and so on. Some nsswitch source other than ldap has to be able to provide the hostname and network address of the directory server given in /etc/ldap.conf in order for the LDAP client bits to attempt to contact one, otherwise you hit the chicken-egg problem you're seeing here.

I mentioned it in comment #8.
dhcp-lab-198 bz230567 # getent -s files 127.0.0.1
Unknown database: 127.0.0.1
Try `getent --help' or `getent --usage' for more information.
dhcp-lab-198 bz230567 # getent -s dns 127.0.0.1
Unknown database: 127.0.0.1
Try `getent --help' or `getent --usage' for more information.
dhcp-lab-198 bz230567 # getent -s ldap 127.0.0.1
Unknown database: 127.0.0.1
Try `getent --help' or `getent --usage' for more information.
The syntax is different in here.
--
dhcp-lab-198 bz230567 # getent -s ldap group
Segmentation fault
dhcp-lab-198 bz230567 # getent -s ldap shadow
Segmentation fault
--
This is not much useful, I guess...

My mistake. Try this instead:
getent -s files hosts 127.0.0.1
getent -s dns hosts 127.0.0.1
getent -s 'files dns ldap' hosts 127.0.0.1
If one of the sources listed for use in host resolution can't provide this answer when nss_ldap needs it, then nss_ldap can't work.

newman@dhcp-lab-198 ~ $ getent -s files hosts 127.0.0.1
127.0.0.1 localhost.localdomain localhost
newman@dhcp-lab-198 ~ $ getent -s dns hosts 127.0.0.1
127.0.0.1 localhost
newman@dhcp-lab-198 ~ $ getent -s 'files dns ldap' hosts 127.0.0.1
127.0.0.1 localhost.localdomain localhost
newman@dhcp-lab-198 ~ $ getent hosts ldap.boston.redhat.com
10.13.240.1 ldap.boston.redhat.com
newman@dhcp-lab-198 ~ $ /usr/bin/getent -s ldap passwd
Segmentation fault

So, any way this infinite recursion can be avoided from within the code? It seems like a bad idea to let this segfault like this...

PS. getent -s 'dns ldap' passwd works on F-9.
From F9:
newman@dhcp-lab-192 ~ $ rpmquery nss_ldap
nss_ldap-259-3.fc9.i386
newman@dhcp-lab-192 ~ $ getent -s 'dns ldap' passwd
Segmentation fault

I found an old bug -- bug 110563 -- describing this same problem. It was eventually closed for lack of activity, and the "solution" was to run nscd, but nscd should not be necessary for this setup to work.

According to bug 110563 comment 17 and bug 110563 comment 18, adding the LDAP server to /etc/hosts did not solve the problem, which matches what I am seeing with my testing on RHEL 4.7 and RHEL 5.4.

I've also tested this on Fedora 11, and getent -s 'files ldap' hosts foo hangs forever (where forever >= 135 mins). According to strace, it's stuck on a futex, which also matches bug 110563.

I enabled nscd on RHEL 4.7, RHEL 5.4, and Fedora 11 and found that getent no longer segfaulted or hung on a futex, and it could find passwd entries from the LDAP server, but it could not find hosts entries.

We can probably fix the crash or lockup (in both cases, I expect it's recursion, either infinitely or until the calling application tries to lock something it's already locked), but if nss_ldap (or rather, the underlying ldap client library) can't look up information it needs, then running "getent -s ldap" won't produce the desired results. That's just the nature of the implementation.

(In reply to comment #22)
> We can probably fix the crash or lockup (in both cases, I expect it's
> recursion, either infinitely or until the calling application tries to lock
> something it's already locked), but if nss_ldap (or rather, the underlying ldap
> client library) can't look up information it needs, then running "getent -s
> ldap" won't produce the desired results. That's just the nature of the
> implementation.
Further info: similar behaviour on an ldap enabled Red Hat Enterprise Linux 5.3 system running nss_ldap-253-22 with tls enabled.
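For readers who want to see the shape of the recursion discussed in the comments above, here is a toy Python model. It is only a sketch under stated assumptions: every name in it (`lookup`, `files_source`, `make_ldap_source`, `Unavailable`) is hypothetical, nothing here is nss_ldap or glibc code, and the thread-local "busy" flag is just one common way to stop re-entry, not necessarily what the eventual fix does.

```python
import threading

# Hypothetical toy model; none of these names exist in nss_ldap or glibc.
_state = threading.local()


class Unavailable(Exception):
    """Stands in for a source reporting that it cannot answer."""


def lookup(key, sources):
    """Try each source in order, like a very small name service switch."""
    for source in sources:
        try:
            return source(key)
        except Unavailable:
            continue
    raise Unavailable(key)


def files_source(key):
    """Toy local file lookup with a single hard-coded host entry."""
    if key == "localhost":
        return "127.0.0.1"
    raise Unavailable(key)


def make_ldap_source(sources, directory):
    """Build a toy 'ldap' source whose setup needs a host lookup itself."""
    def ldap_source(key):
        if getattr(_state, "busy", False):
            # Re-entered from our own initialisation: refuse, do not recurse.
            raise Unavailable(key)
        _state.busy = True
        try:
            # Setting up the client resolves a host name through the same
            # source list. Without the guard above, an ldap-only list would
            # call back into ldap_source without end.
            lookup("localhost", sources)
            if key in directory:
                return directory[key]
            raise Unavailable(key)
        finally:
            _state.busy = False
    return ldap_source
```

With an ldap-only source list the guarded model fails cleanly with `Unavailable` instead of recursing, which mirrors the point that "getent -s ldap" cannot produce results if the host lookup it depends on has no other source. With `files_source` listed first, the setup lookup succeeds and the directory entry is returned.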
As you pointed out Nalin, The key is that the '-s ldap' option overrides the nsswitch hosts order causing recursion of gethostbyname_r and finally:
buffer=0xffffffffffffffb0 <Address 0xffffffffffffffb0 out of bounds>,
for which there is no graceful exit.

effectively the same as having an /etc/nsswitch.conf search order for hosts (NB not best practice with tls enabled ldap):
hosts: ldap files dns

semi workaround for getent and nscd off, that does not address the point (same as previous revised -s order):
getent passwd
this will work on an ldap enabled host whilst honouring a rational nsswitch hosts entry:
hosts: files dns

N.B. This issue was not seen with nscd running as would be expected (/var/run/nscd/socket is already present).

Here is a backtrace illustrating the recursion on the client machine running getent -s ldap passwd:
N.B. <HOSTNAMES> removed.
#0 _nss_ldap_gethostbyname_r (name=0x7fff98549e90 "<HOSTNAME>", result=0x7fff98549e50, buffer=0xa08afa0 "P\037µè3", buflen=992, errnop=0x2b2d125631d0, h_errnop=0x7fff98549e8c) at ldap-hosts.c:299
#1 0x00000033e88e95e4 in gethostbyname_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#2 0x00002b2d125cc420 in ldap_pvt_gethostbyname_a () from /lib64/libnss_ldap.so.2
#3 0x00002b2d125cc4ce in ldap_pvt_get_fqdn () from /lib64/libnss_ldap.so.2
#4 0x00002b2d125cb157 in ldap_int_initialize () from /lib64/libnss_ldap.so.2
#5 0x00002b2d125b9826 in ldap_create () from /lib64/libnss_ldap.so.2
#6 0x00002b2d125b9baa in ldap_initialize () from /lib64/libnss_ldap.so.2
#7 0x00002b2d125abeac in do_init () at ldap-nss.c:1047
#8 0x00002b2d125aeb90 in _nss_ldap_search_s (args=0x7fff9854b9b0, filterprot=0x2b2d12a722c0 "(&(objectClass=ipHost)(cn=%s))", sel=LM_HOSTS, user_attrs=0x0, sizelimit=1, res=0x7fff9854b938) at ldap-nss.c:3040
#9 0x00002b2d125afd83 in _nss_ldap_getbyname (args=0x7fff9854b9b0, result=0x7fff9854bae0, buffer=0xa08abb0 "\020 µè3", buflen=992, errnop=0x2b2d125631d0, filterprot=0x2b2d12a722c0 "(&(objectClass=ipHost)(cn=%s))", sel=LM_HOSTS, parser=0x2b2d125b2860 <_nss_ldap_parse_hostv4>) at ldap-nss.c:3443
#10 0x00002b2d125b2c8c in _nss_ldap_gethostbyname2_r (name=<value optimized out>, af=<value optimized out>, result=0x7fff98549e50, buffer=0xffffffffffffffb0 <Address 0xffffffffffffffb0 out of bounds>, buflen=992, errnop=0x2b2d125631d0, h_errnop=0x7fff9854bb1c) at ldap-hosts.c:277
#11 0x00002b2d125b2cee in _nss_ldap_gethostbyname_r (name=0x7fff98549e90 "<HOSTNAME>", result=0xa08afa0, buffer=0x3e0 <Address 0x3e0 out of bounds>, buflen=47472581161424, errnop=0x7fff98549e8c, h_errnop=<value optimized out>) at ldap-hosts.c:300
#12 0x00000033e88e95e4 in gethostbyname_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#13 0x00002b2d125cc420 in ldap_pvt_gethostbyname_a () from /lib64/libnss_ldap.so.2
#14 0x00002b2d125cc4ce in ldap_pvt_get_fqdn () from /lib64/libnss_ldap.so.2
#15 0x00002b2d125cb157 in ldap_int_initialize () from /lib64/libnss_ldap.so.2
#16 0x00002b2d125b9826 in ldap_create () from /lib64/libnss_ldap.so.2
#17 0x00002b2d125b9baa in ldap_initialize () from /lib64/libnss_ldap.so.2
#18 0x00002b2d125abeac in do_init () at ldap-nss.c:1047
#19 0x00002b2d125aeb90 in _nss_ldap_search_s (args=0x7fff9854d640, filterprot=0x2b2d12a722c0 "(&(objectClass=ipHost)(cn=%s))", sel=LM_HOSTS, user_attrs=0x0, sizelimit=1, res=0x7fff9854d5c8) at ldap-nss.c:3040
#20 0x00002b2d125afd83 in _nss_ldap_getbyname (args=0x7fff9854d640, result=0x7fff9854d770, buffer=0xa08a770 "0 µè3", buflen=992, errnop=0x2b2d125631d0, filterprot=0x2b2d12a722c0 "(&(objectClass=ipHost)(cn=%s))", sel=LM_HOSTS, parser=0x2b2d125b2860 <_nss_ldap_parse_hostv4>) at ldap-nss.c:3443
#21 0x00002b2d125b2c8c in _nss_ldap_gethostbyname2_r (name=<value optimized out>, af=<value optimized out>, result=0x7fff98549e50, buffer=0xffffffffffffffb0 <Address 0xffffffffffffffb0 out of bounds>, buflen=992, errnop=0x2b2d125631d0, h_errnop=0x7fff9854d7ac) at ldap-hosts.c:277
#22 0x00002b2d125b2cee in _nss_ldap_gethostbyname_r (name=0x7fff98549e90 "HOSTNAME", result=0xa08afa0, buffer=0x3e0 <Address 0x3e0 out of bounds>, buflen=47472581161424, errnop=0x7fff98549e8c, h_errnop=<value optimized out>) at ldap-hosts.c:300

(In reply to comment #26)
> The key is that the '-s ldap' option overrides the nsswitch hosts order causing
> recursion of gethostbyname_r and finally:
> buffer=0xffffffffffffffb0 <Address 0xffffffffffffffb0 out of bounds>,
> for which there is no graceful exit.
>
> effectively the same as having an /etc/nsswitch.conf search order for hosts (NB
> not best practice with tls enabled ldap):
> hosts: ldap files dns
>
> semi workaround for getent and nscd off, that does not address the point (same
> as previous revised -s order):
> getent passwd
> this will work on an ldap enabled host whilst honouring a rational nsswitch
> hosts entry:
> hosts: files dns
Alternately, I've recently learned that this is now available with getent, and can be expected to provide the desired results:
getent -s passwd:ldap passwd

RHTS test created. See QA whiteboard.

Successfully verified.
RHEL5.5-Server-20100117.0 - i386, ia64, s390x, ppc, x86_64.
RHEL5.5-Client-20100114.nightly - i386, x86_64.
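As a quick way to spot the risky "hosts: ldap files dns" ordering mentioned in the comments above, the following Python sketch reads nsswitch.conf-style text and reports whether ldap is consulted first for hosts. The function names are hypothetical and the parsing is deliberately minimal (it strips comments and bracketed action items, nothing more), so treat it as an illustration rather than a complete parser for the file format.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the source names on the 'hosts:' line, in the order listed."""
    for line in nsswitch_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove bracketed action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def ldap_consulted_first(sources):
    """True when 'ldap' is the first (or only) source for hosts."""
    return bool(sources) and sources[0] == "ldap"
```

For example, feeding it the text of /etc/nsswitch.conf and checking `ldap_consulted_first(hosts_sources(text))` flags the ordering that the comment describes as equivalent to running getent with '-s ldap'.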
:: [ LOG ] :: Installed: : nss_ldap-253-22.el5_4.x86_64
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ LOG ] :: Test
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ PASS ] :: Running 'cp -f ldap.conf /etc/ldap.conf'
:: [ PASS ] :: Running 'echo "bind_policy soft" >> /etc/ldap.conf'
:: [ PASS ] :: Running 'restorecon -v /etc/ldap.conf'
:: [ FAIL ] :: Running 'getent -s 'ldap' passwd' (Expected 0, got 139)
:: [ FAIL ] :: No segfault expected (Assert: "139" should not equal "139")
:: [ PASS ] :: Running 'cp -f ldap.conf /etc/ldap.conf'
:: [ PASS ] :: Running 'echo "bind_policy hard" >> /etc/ldap.conf'
:: [ PASS ] :: Running 'restorecon -v /etc/ldap.conf'
:: [ FAIL ] :: Running 'getent -s 'ldap' passwd' (Expected 0, got 139)
:: [ FAIL ] :: No segfault expected (Assert: "139" should not equal "139")
:: [ LOG ] :: Duration: 2s
:: [ LOG ] :: Assertions: 6 good, 4 bad
:: [ FAIL ] :: RESULT: Test
--------------------------------------------------------------------------------
:: [ LOG ] :: Installed: : nss_ldap-253-25.el5.x86_64
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ LOG ] :: Test
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ PASS ] :: Running 'cp -f ldap.conf /etc/ldap.conf'
:: [ PASS ] :: Running 'echo "bind_policy soft" >> /etc/ldap.conf'
:: [ PASS ] :: Running 'restorecon -v /etc/ldap.conf'
:: [ PASS ] :: Running 'getent -s 'ldap' passwd'
:: [ PASS ] :: No segfault expected
:: [ PASS ] :: Running 'cp -f ldap.conf /etc/ldap.conf'
:: [ PASS ] :: Running 'echo "bind_policy hard" >> /etc/ldap.conf'
:: [ PASS ] :: Running 'restorecon -v /etc/ldap.conf'
:: [ PASS ] :: Running 'getent -s 'ldap' passwd'
:: [ PASS ] :: No segfault expected
:: [ LOG ] :: Duration: 2m 7s
:: [ LOG ] :: Assertions: 10 good, 0 bad
:: [ PASS ] :: RESULT: Test

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2010-0260.html

*** Bug 532892 has been marked as a duplicate of this bug. ***
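A note on the "got 139" lines in the failing test run: shells conventionally report a process killed by signal N as exit status 128 + N, so 139 corresponds to signal 11, which is SIGSEGV on Linux. The small Python helper below (a hypothetical name, written only to illustrate the arithmetic) decodes such a status.

```python
import signal


def describe_exit_status(status):
    """Describe a shell-style exit status; values above 128 mean a signal."""
    if status > 128:
        signum = status - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            return "killed by " + signal.Signals(signum).name
        except ValueError:
            return "killed by signal %d" % signum
    return "exited with status %d" % status
```

Calling `describe_exit_status(139)` on a Linux system names SIGSEGV, matching the segmentation fault the test was written to catch.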