Bug 238682
Summary: nss_ldap lookup hang in _nss_ldap_readconfigfromdns
Product: Red Hat Enterprise Linux 4
Component: nss_ldap
Version: 4.4
Hardware: i586
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: medium
Reporter: Georg Moritz <georg.moritz>
Assignee: Nalin Dahyabhai <nalin>
CC: jorton, jplans
Doc Type: Bug Fix
Last Closed: 2012-06-20 16:09:44 UTC
Description
Georg Moritz
2007-05-02 09:18:51 UTC
Created attachment 153925 [details]
typescript of 'ps fauxw' and killing child processes
Thanks for the report. Can you attach a sysreport for this server?

Can you determine what the hung children are doing:
1) "echo CoreDumpDirectory /tmp > /etc/httpd/conf.d/coredump.conf" (or similar)
2) "kill -SEGV <pid>" a hung root process
3) bzip and attach resultant coredump from /tmp/core.<pid>

> Can you determine what the hung children are doing:
I did configure the CoreDumpDirectory, then:
citylap0002 [root] 11:59 PM /root # service httpd graceful
citylap0002 [root] 12:00 PM /root # ps fauxw | grep httpd
root 20703 0.0 0.0 5632 560 pts/4 S+ 12:01 0:00 \_ grep httpd
root 5181 0.0 0.1 12132 4972 ? Ss Apr27 0:00 /usr/sbin/httpd
apache 20347 0.0 0.1 12264 5124 ? S 11:59 0:00 \_ /usr/sbin/httpd
apache 20348 0.0 0.1 12264 5124 ? S 11:59 0:00 \_ /usr/sbin/httpd
apache 20349 0.0 0.1 12264 5120 ? S 11:59 0:00 \_ /usr/sbin/httpd
apache 20350 0.0 0.1 12264 5120 ? S 11:59 0:00 \_ /usr/sbin/httpd
apache 20351 0.0 0.1 12264 5120 ? S 11:59 0:00 \_ /usr/sbin/httpd
root 20352 0.0 0.1 12132 4988 ? S 11:59 0:00 \_ /usr/sbin/httpd
apache 20353 0.0 0.1 12132 5004 ? S 11:59 0:00 \_ /usr/sbin/httpd
root 20356 0.0 0.1 12132 4988 ? S 11:59 0:00 \_ /usr/sbin/httpd
citylap0002 [root] 12:02 PM /root # strace -p 20352
Process 20352 attached - interrupt to quit
select(1024, [6], [], NULL, NULL <unfinished ...>
Process 20352 detached
citylap0002 [root] 12:02 PM /root # strace -p 20353
Process 20353 attached - interrupt to quit
semop(11304968, 0x8f8740, 1 <unfinished ...>
Process 20353 detached
citylap0002 [root] 12:03 PM /root # strace -p 20351
Process 20351 attached - interrupt to quit
semop(11304968, 0x8f8740, 1 <unfinished ...>
Process 20351 detached
citylap0002 [root] 12:03 PM /root # strace -p 20356
Process 20356 attached - interrupt to quit
read(6, <unfinished ...>
Process 20356 detached
citylap0002 [root] 12:09 PM /root # ls -l /proc/2035{1,2,6}/fd/6
lrwx------ 1 apache apache 64 May 2 12:09 /proc/20351/fd/6 -> socket:[24194848]
lrwx------ 1 root root 64 May 2 12:01 /proc/20352/fd/6 -> socket:[24194848]
lrwx------ 1 root root 64 May 2 12:08 /proc/20356/fd/6 -> socket:[24194848]
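The listing above shows fd 6 in three different httpd processes resolving to the same socket inode, which suggests the descriptor is shared between them rather than opened separately by each child. A minimal Python sketch of that check follows; the helper names `read_fd_links` and `group_by_target` are hypothetical, and the sample mapping is simply copied from the `ls -l` output.

```python
import os
from collections import defaultdict


def read_fd_links(pids, fd):
    """Read /proc/<pid>/fd/<fd> for each PID, skipping PIDs that are gone or unreadable."""
    links = {}
    for pid in pids:
        try:
            links[pid] = os.readlink("/proc/%d/fd/%d" % (pid, fd))
        except OSError:
            continue
    return links


def group_by_target(fd_links):
    """Map each link target (e.g. 'socket:[24194848]') to the sorted PIDs that hold it."""
    groups = defaultdict(list)
    for pid, target in fd_links.items():
        groups[target].append(pid)
    return {target: sorted(pids) for target, pids in groups.items()}


# Values copied from the ls -l output in this report (not read live).
observed = {
    20351: "socket:[24194848]",
    20352: "socket:[24194848]",
    20356: "socket:[24194848]",
}
shared = group_by_target(observed)
print(shared)
```

On a live system, `group_by_target(read_fd_links([20351, 20352, 20356], 6))` would do the same lookup from /proc; any target listed with more than one PID is a descriptor those processes have in common.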
If I get apache to dump core, I'll attach that.
Can you also attach the sysreport so the httpd configuration is apparent. Simply attaching gdb to the process and getting a backtrace will also help.

It is likely this is some third-party module; httpd will call poll() rather than select() in almost all cases. It looks like you have a FastCGI running; can you reproduce the issue without the FastCGI module loaded?

I don't seem to get httpd to coredump - not with SEGV,ABRT,BUS etc. Got a hint?
A backtrace from a running child with UID 0:
gdb /usr/sbin/httpd 21653
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Attaching to program: /usr/sbin/httpd, process 21653
(no debugging symbols found)
Loaded symbols for /usr/sbin/httpd
Reading symbols from /lib/libpcre.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /usr/lib/libpcreposix.so.0...(no debugging symbols found)...done.
[...]
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
0x00c577a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0 0x00c577a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x00b0a473 in __read_nocancel () from /lib/tls/libpthread.so.0
#2 0x0103475d in _nss_ldap_readconfigfromdns () from /lib/libnss_ldap.so.2
#3 0x010350c0 in _nss_ldap_readconfigfromdns () from /lib/libnss_ldap.so.2
#4 0x01034251 in _nss_ldap_readconfigfromdns () from /lib/libnss_ldap.so.2
#5 0x01031f66 in _nss_ldap_readconfigfromdns () from /lib/libnss_ldap.so.2
#6 0x01010462 in _nss_ldap_readconfigfromdns () from /lib/libnss_ldap.so.2
#7 0x0101192f in _nss_ldap_readconfigfromdns () from /lib/libnss_ldap.so.2
#8 0x010064b7 in _nss_ldap_init () from /lib/libnss_ldap.so.2
#9 0x0100716c in _nss_ldap_getent_ex () from /lib/libnss_ldap.so.2
#10 0x01009690 in _nss_ldap_initgroups_dyn () from /lib/libnss_ldap.so.2
#11 0x0043ac1c in internal_getgrouplist () from /lib/tls/libc.so.6
#12 0x0043aece in initgroups () from /lib/tls/libc.so.6
#13 0x003a4523 in unixd_setup_child () from /usr/sbin/httpd
#14 0x0038551f in ap_graceful_stop_signalled () from /usr/sbin/httpd
#15 0x00385b5c in ap_graceful_stop_signalled () from /usr/sbin/httpd
#16 0x003867f0 in ap_mpm_run () from /usr/sbin/httpd
#17 0x0038d36a in main () from /usr/sbin/httpd
(gdb)

sysreport follows. How do I post it (does it contain sensitive information)?
The FastCGI server is a business critical application. Need a downtime window to check without it :-(

Created attachment 153928 [details]
coredump of a httpd child (UID 0) produced with kill -ILL
The backtrace of this one is a bit different:
# gdb /usr/sbin/httpd /tmp/core.21653
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
found)
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Core was generated by `/usr/sbin/httpd'.
Program terminated with signal 4, Illegal instruction.
(no debugging symbols found)
Loaded symbols for /usr/sbin/httpd
[...]
Loaded symbols for /lib/libnsl.so.1
#0 0x00c577a2 in _dl_sysinfo_int80 ()
from /lib/ld-linux.so.2
(gdb) bt
#0 0x00c577a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0047119d in poll () from /lib/tls/libc.so.6
#2 0x00e17f24 in apr_poll () from /usr/lib/libapr-0.so.0
#3 0x00385734 in ap_graceful_stop_signalled () from /usr/sbin/httpd
#4 0x00385b5c in ap_graceful_stop_signalled () from /usr/sbin/httpd
#5 0x003867f0 in ap_mpm_run () from /usr/sbin/httpd
#6 0x0038d36a in main () from /usr/sbin/httpd
(gdb) q
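Because neither backtrace has debugging symbols, gdb is most likely labelling frames with the nearest exported symbol, so the shared object a frame belongs to is probably a more reliable signal than the function name. A minimal Python sketch that reduces a backtrace to the sequence of libraries it passes through is shown here; the name `libraries_in_order` and the frame regex are assumptions fitted to the frame format in this report, not a general gdb parser.

```python
import re

# Matches frames such as:
#   #1 0x0047119d in poll () from /lib/tls/libc.so.6
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+ in (\S+) \(\) from (\S+)$")


def libraries_in_order(backtrace_text):
    """Return the objects named in a gdb backtrace, innermost first,
    collapsing consecutive frames that come from the same object."""
    libs = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and (not libs or libs[-1] != match.group(3)):
            libs.append(match.group(3))
    return libs


# Frames copied from the core-file backtrace in this report.
core_bt = """\
#0 0x00c577a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0047119d in poll () from /lib/tls/libc.so.6
#2 0x00e17f24 in apr_poll () from /usr/lib/libapr-0.so.0
#3 0x00385734 in ap_graceful_stop_signalled () from /usr/sbin/httpd
#4 0x00385b5c in ap_graceful_stop_signalled () from /usr/sbin/httpd
#5 0x003867f0 in ap_mpm_run () from /usr/sbin/httpd
#6 0x0038d36a in main () from /usr/sbin/httpd
"""
core_path = libraries_in_order(core_bt)
print(core_path)
```

For the core file this gives a path of ld-linux, libc, libapr, httpd, with no nss_ldap frames at all. Fed the live-process backtrace from the earlier comment, the same function should list /lib/tls/libpthread.so.0 and /lib/libnss_ldap.so.2 ahead of libc, which is the difference between the two traces.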
If you check the "private" box when attaching the sysreport, it can only be viewed within Red Hat.

The first backtrace looks relevant - this is a hang somewhere doing the group lookup via nss_ldap:
- from the function names, it appears to be reading LDAP configuration from DNS; could this be a generic problem with the NSS configuration on the box; does running "id apache" hang similarly?
- does your /etc/nsswitch.conf have "files" before "ldap" in the "group" line?
- is the "apache" user in any groups which require LDAP lookup?

I'll upload the sysreport, then. As for your questions - no, yes, and no - neither UID apache nor GID apache is dependent on LDAP:
citylap0002 [root] 02:45 PM /root # id apache
uid=48(apache) gid=48(apache) groups=48(apache)
citylap0002 [root] 02:46 PM /root # groups apache
apache : apache
citylap0002 [root] 02:46 PM /root # perl -ne 'print unless /^\s*(#|$)/' /etc/nsswitch.conf
passwd: files ldap
shadow: files ldap
group: files ldap
hosts: files dns
bootparams: nisplus [NOTFOUND=return] files
ethers: files
netmasks: files
networks: files
protocols: files
rpc: files
services: files
netgroup: files
publickey: nisplus
automount: files
aliases: files nisplus

Hm. In /etc/ldap/ldap.conf, a Windows Active Directory Server is named in the URI. Apache loads mod_ldap - does that module perform any LDAP lookup on startup? But that shouldn't be an issue - LDAP lookups with GSSAPI and Kerberos tickets work like a charm; apache has a valid keytab with keys for both the services HTTP and LDAP.

Darn. I just straced -f -ff httpd, and it reads /etc/ldap.conf - *not* /etc/openldap/ldap.conf. I found the children writing "dc=example,dc=com" to fd 6, then reading from it...

Adding a correct ldap base to /etc/ldap.conf seems to solve the problem. But then, that problem never occurs when starting httpd, only when doing a 'graceful' or with normal child replacement (after MaxRequestsPerChild).
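The finding above is that /etc/ldap.conf carried the placeholder base "dc=example,dc=com" and that setting a correct base made the hang go away. A minimal Python sketch of a check for that condition follows. The function name `find_base_problem` is hypothetical, and the parsing rules (one `base` line, case-insensitive keyword, `#` comments) are assumptions about the file format rather than a copy of how nss_ldap reads it.

```python
# Placeholder value observed on the wire in this report.
PLACEHOLDER_BASES = {"dc=example,dc=com"}


def find_base_problem(conf_text):
    """Return a short description of a suspicious 'base' setting, or None if it looks set.

    Assumption: the file has at most one relevant 'base' line; the first one wins.
    """
    base = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split(None, 1)
        if parts[0].lower() == "base" and len(parts) == 2:
            base = parts[1].strip()
            break
    if base is None:
        return "no base directive"
    if base.lower().replace(" ", "") in PLACEHOLDER_BASES:
        return "placeholder base: " + base
    return None


# Illustrative input only; the hostname is made up and this is not the reporter's file.
sample = "# sample\nuri ldap://ad.invalid/\nbase dc=example,dc=com\n"
print(find_base_problem(sample))
```

Passing the contents of /etc/ldap.conf (the file strace showed httpd reading here, not /etc/openldap/ldap.conf) to this function would flag the state described in this report. It says nothing about whether a non-placeholder base is actually valid for the directory.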
A bogus /etc/ldap.conf doesn't affect initial startup, but after a 'graceful', an invalid ldap base causes the child processes not to change UID.

I would think it should time out rather than hang indefinitely, in any case. Re-assigning to nss_ldap maintainer to see if further analysis is needed.

Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.