Red Hat Bugzilla – Bug 162910
sendmail and nss_ldap hangs during getpwent
Last modified: 2008-02-17 21:33:39 EST
Description of problem:
On my setup (sendmail + nss_ldap + cyrus-imapd), I am experiencing hangs during sendmail delivery
to the cyrus agent.
After tracing, I found that sendmail hangs in getpwent during MatchGECOS processing. More precisely,
it hangs just after reading the local /etc/passwd entries and just before getting the
first LDAP entry from nss_ldap (it waits forever for results from the LDAP server).
Killing the sendmail process and restarting it correctly delivers the email from the mail queue to cyrus-imapd.
There are two LDAP servers, both running on other hosts.
nscd is disabled because running it seems to hang the saslauthd process that cyrus-imapd needs.
The same problem occurs when nscd is running.
The gdb stack trace shows the process hung in nss_ldap's do_result, waiting with no
timeout for results from the server.
A simple workaround for me is to add "timelimit 30" to the /etc/ldap.conf file.
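To illustrate why a time limit helps, here is a minimal sketch of a bounded wait using plain sockets. It is not nss_ldap code: the function name wait_for_reply is hypothetical, and a local socket pair stands in for the LDAP connection. With timeout=None the call blocks until data arrives, which is the behaviour seen in the backtrace below; with a finite timeout the caller gets control back and can report an error or retry.

```python
import select
import socket


def wait_for_reply(sock, timeout=None):
    """Return the next chunk of data from sock, or None if `timeout` seconds pass.

    timeout=None means "wait forever", like the hung process in this report.
    """
    ready, _, _ = select.select([sock], [], [], timeout)
    if not ready:
        return None  # nothing arrived in time; the caller decides what to do
    return sock.recv(4096)
```

Called on a connection whose peer never answers, wait_for_reply(sock, 30) returns None after 30 seconds instead of blocking indefinitely.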
Version-Release number of selected component (if applicable):
sendmail-8.13.4-2 nss_ldap-234-4 glibc-2.3.5-10
Steps to Reproduce:
1. mail -s test email@example.com
Actual Results: sendmail puts the mail in the mail queue and starts a new sendmail
process to deliver it to cyrus-imapd. This new sendmail process hangs waiting on nss_ldap
for results and never delivers the email.
Expected Results: the email should be delivered to cyrus-imapd.
Backtrace of the hung process:
#0 0x00111402 in ?? ()
#1 0x0034f47d in ?? () from /lib/libc.so.6
#2 0x00ef1014 in ldap_int_select () from /lib/libnss_ldap.so.2
#3 0x00ee33ff in ldap_result () from /lib/libnss_ldap.so.2
#4 0x00ed76a6 in do_result (ctx=0xa007940, all=0) at ldap-nss.c:2154
#5 0x00ed78e0 in _nss_ldap_ent_context_init_locked (pctx=0x1142920) at ldap-nss.c:1836
#6 0x00ed9176 in _nss_ldap_ent_context_init (pctx=0xfffffdfe) at ldap-nss.c:1792
#7 0x00ed9632 in _nss_ldap_setpwent () at ldap-pwd.c:244
#8 0x0036802a in __nss_getent_r (getent_func_name=0x3a5c2a "getpwent_r",
lookup_fct=0x368c04 <*__GI___nss_passwd_lookup>, nip=0x3b40c8, startp=0x3b40c0,
last_nip=0x3b40c4, stayopen_tmp=0x0, res=0, resbuf=0x3b4050,
buffer=0xa0069c8 "nut", buflen=1024, result=0xbf84d0cc, h_errnop=0x0) at getnssent_r.c:203
#9 0x00317c77 in __getpwent_r (resbuf=0xfffffdfe, buffer=0xfffffdfe <Address 0xfffffdfe out of
bounds>, buflen=4294966782, result=0xfffffdfe)
#10 0x00367cea in __nss_getent (func=0x317bd0 <__getpwent_r>, resbuf=0x3b4050,
buffer=0x3b30a8, buflen=1024, buffer_size=0x3b406c, h_errnop=0x0)
#11 0x003177af in getpwent () at ../nss/getXXent.c:84
#12 0x00500b6f in finduser (name=0xbf84f4c7 "alain richard", fuzzyp=0xbf84f5c8,
user=0xbf84d2bc) at recipient.c:1212
#13 0x00502375 in recipient (new=0x9ff2954, sendq=0xbf851020, aliaslevel=0, e=0xbf850fa0) at
#14 0x004f49b8 in readqf (e=0xbf850fa0, openonly=0) at queue.c:4387
#15 0x004f7009 in doworklist (el=0x554cc0, forkflag=1, requeueflag=1) at queue.c:3834
#16 0x00510638 in smtp (nullserver=0x0, d_flags=0x55c2b0, e=0x554cc0) at srvrsmtp.c:3510
#17 0x004a6fff in main (argc=3, argv=0xbf854c4c, envp=0xfffffdfe) at main.c:2587
The ldap client in nss_ldap seems to share the socket across forks. This can
cause one client process to steal the answer from another, thus leaving it
waiting forever for an answer that will never be received.
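The suspected failure mode can be reproduced in miniature with a socket pair and a fork. This is only a sketch under simplifying assumptions: the function name demo_stolen_reply is hypothetical, the parent plays both the client and the "server" end, and a pipe is used just to report what the child read. The child inherits the client socket and consumes the one reply, so the parent finds nothing to read afterwards.

```python
import os
import select
import socket


def demo_stolen_reply(timeout=0.5):
    """Return (data_read_by_child, data_read_by_parent) for one server reply."""
    client, server = socket.socketpair()  # `client` stands in for the LDAP socket
    report_r, report_w = os.pipe()         # child reports what it read through this

    pid = os.fork()
    if pid == 0:
        # Child: still holds the inherited client socket and reads from it.
        try:
            os.write(report_w, client.recv(1024))
        finally:
            os._exit(0)

    # Parent: the "server" sends the single reply meant for the parent.
    server.sendall(b"reply-for-parent")
    os.waitpid(pid, 0)
    stolen = os.read(report_r, 1024)

    # The reply is gone; without a timeout this wait would never return.
    ready, _, _ = select.select([client], [], [], timeout)
    parent_got = client.recv(1024) if ready else None

    for fd in (report_r, report_w):
        os.close(fd)
    client.close()
    server.close()
    return stolen, parent_got
```

In this sketch the parent only escapes because select is given a timeout; a wait with no timeout would block just like the sendmail process above.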
This bug is probably related to one of the bugs concerning nss_ldap across forks.
The ldap socket is supposed to be closed on fork. It is evident from the output
of lsof that this is not always done.
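One common way to avoid reusing a connection inherited across fork is to remember which process opened it and reconnect when the pid changes. The sketch below shows that idea only; the class ForkAwareConnection and its connect callable are hypothetical, and this is not a description of how nss_ldap actually implements it.

```python
import os


class ForkAwareConnection:
    """Hypothetical wrapper that reopens its socket when used from a new process."""

    def __init__(self, connect):
        self._connect = connect   # callable returning a fresh socket-like object
        self._sock = None
        self._owner_pid = None    # pid of the process that opened self._sock

    def get(self):
        pid = os.getpid()
        if self._sock is not None and self._owner_pid != pid:
            # The socket was inherited from a parent process: close our copy
            # of the descriptor and forget it rather than share the stream.
            self._sock.close()
            self._sock = None
        if self._sock is None:
            self._sock = self._connect()
            self._owner_pid = pid
        return self._sock
```

Closing the descriptor in the child does not affect the parent's copy, so each process ends up with its own connection and its own replies.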
This report targets the FC3 or FC4 products, which have now been EOL'd.
Could you please check that it still applies to a current Fedora release, and
either update the target product or close it?
Fedora Core 4 is no longer maintained.
Setting status to "INSUFFICIENT_DATA". If you can reproduce this bug in the
current Fedora release, please reopen this bug and assign it to the
corresponding Fedora version.