Bug 502133

Summary: httpd crashes when mod_nss is enabled
Product: [Fedora] Fedora
Reporter: RJB <rogelio.javier>
Component: nss
Assignee: Elio Maldonado Batiz <emaldona>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: low
Version: 11
CC: bernie+fedora, fabian.lema, hyc, kengert, liedekef, mnk, rcritten, rrelyea, sgallagh, tscherf
Hardware: i586   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-08-13 15:03:53 UTC
Attachments:
disable unloading the nspr libraries (flags: none)

Description RJB 2009-05-22 03:28:24 UTC
Description of problem:

Apache web server crashes when mod_nss is enabled. 
New Fedora 11 (Testing) installation updated as of May-21-2009. Using default (not modified) httpd.conf and nss.conf configuration files. All certificates in database are verified using certutil and user running the web server (apache) has access to the database. Log file (error_log) shows the following message for all httpd child processes:

[Thu May 21 23:03:38 2009] [notice] child pid 12595 exit signal Segmentation fault (11)

Version-Release number of selected component (if applicable):

nss-3.12.3-4.fc11.i586
mod_nss-1.0.8-1.fc11.i586
httpd-2.2.11-8.i586

How reproducible:

Always

Steps to Reproduce:
1. yum install httpd mod_nss
2. verify mod_nss is enabled in /etc/httpd/conf.d (nss.conf)
3. service httpd start
4. tail -f /var/log/httpd/error_log
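
A minimal sketch for checking step 4 without watching the log by hand: the helper below counts the segfault notices in an error log. The function name count_segfaults is hypothetical, and the match string assumes the notice format quoted earlier in this report.

```shell
#!/bin/sh
# Hypothetical helper: count child-process segfault notices in an Apache error log.
# Assumes the notice text matches the format quoted in this report.
count_segfaults() {
    # grep -c prints the number of matching lines; -F treats the pattern literally.
    # "|| true" keeps a count of zero from being reported as a failure.
    grep -F -c 'exit signal Segmentation fault (11)' "$1" || true
}

# Example (path taken from step 4):
#   count_segfaults /var/log/httpd/error_log
```

A count that keeps growing while httpd is running would suggest that each respawned child is crashing again.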

Actual results:

Every httpd child process exits with a segmentation fault, as in the log line above. Disabling mod_nss fixes the problem (mv /etc/httpd/conf.d/nss.conf /etc/httpd/conf.d/nss.conf.no)

Comment 1 Rob Crittenden 2009-05-22 15:57:42 UTC
It is failing in different places each time I get a stack so it must be some sort of memory corruption. The common theme is that it always crashes right after RNG_RNGInit().

I've seen crashes in mod_perl, libdbus, nss and nspr.

#0  PKIX_DoAddError (stdVars=0x0, error=0xb7a08f70, plContext=0xb69e065c)
    at pkix_errpaths.c:104
#1  0xb7a08f8d in PKIX_ValidateChain_NB (valParams=0x2e7ac8, 
    pCertIndex=0x2b1750, pAnchorIndex=0xbfffdf38, pCheckerIndex=0xb69d85df, 
    pRevChecking=0x2b0639, pCheckers=0xb69e065c, pNBIOContext=0xbfffde08, 
    pResult=0xb69d4402, pVerifyTree=0xb69d43e9, plContext=0xb69e065c)
    at pkix_validate.c:1398
#2  0x002b065a in RNG_RNGInit () at drbg.c:462
#3  0xb69d4402 in RNG_RNGInit () at loader.c:834
#4  0xb69b6a52 in nsc_CommonInitialize (pReserved=0xbfffdef4, isFIPS=0)
    at pkcs11.c:2580
#5  0xb69b6d1d in NSC_Initialize (pReserved=0xbfffdef4) at pkcs11.c:2710
#6  0xb79a1011 in secmod_ModuleInit (mod=0xb562ba80, alreadyLoaded=0xbfffdf58)
    at pk11load.c:146
#7  0xb79a1616 in SECMOD_LoadPKCS11Module (mod=0xb562ba80) at pk11load.c:378
#8  0xb79b49eb in SECMOD_LoadModule (
    modulespec=0xb562c778 " name=\"NSS Internal PKCS #11 Module\" parameters=\"configdir='/etc/httpd/alias' certPrefix='' keyPrefix='' secmod='secmod.db' flags=readOnly,passwordRequired updatedir='' updateCertPrefix='' updateKeyPr"..., 
    parent=0xb5619320, recurse=1) at pk11pars.c:323
#9  0xb79b4b73 in SECMOD_LoadModule (
    modulespec=0xb56186d8 "name=\"NSS Internal Module\" parameters=\"configdir='/etc/httpd/alias' certPrefix='' keyPrefix='' secmod='secmod.db' flags=readOnly,passwordRequired updatedir='' updateCertPrefix='' updateKeyPrefix='' up"..., 
    parent=0x0, recurse=1) at pk11pars.c:338
#10 0xb7981bd5 in nss_Init (configdir=0xb76ba690 "/etc/httpd/alias", 
    certPrefix=<value optimized out>, keyPrefix=0x0, 
    secmodName=0x9f86f2 "secmod.db", updateDir=0xb7a668cf "", 
    updCertPrefix=0xb7a668cf "", updKeyPrefix=0xb7a668cf "", 
    updateID=0xb7a668cf "", updateName=0xb7a668cf "", readOnly=1, noCertDB=0, 
    noModDB=0, forceOpen=0, noRootInit=0, optimizeSpace=0, 
    noSingleThreadedModules=0, allowAlreadyInitializedModules=0, 
    dontFinalizeModules=0) at nssinit.c:536
#11 0xb79821fd in NSS_Initialize (configdir=0xb76ba690 "/etc/httpd/alias", 
    certPrefix=0x0, keyPrefix=0x0, secmodName=0x9f86f2 "secmod.db", flags=1)
    at nssinit.c:653
#12 0x009eb693 in nss_init_SSLLibrary (base_server=<value optimized out>)
    at nss_engine_init.c:198
#13 0x009ed747 in nss_init_Child (p=0xb561cb20, base_server=0x389598)
    at nss_engine_init.c:1131
#14 0x00353ef2 in ap_run_child_init (pchild=0xb561cb20, s=0x389598)
    at /usr/src/debug/httpd-2.2.11/server/config.c:155
#15 0x00369b28 in child_main (child_num_arg=<value optimized out>)
    at /usr/src/debug/httpd-2.2.11/server/mpm/prefork/prefork.c:513
#16 0x0036a0a9 in make_child (s=<value optimized out>, slot=0)
    at /usr/src/debug/httpd-2.2.11/server/mpm/prefork/prefork.c:690
#17 0x0036a82f in ap_mpm_run (_pconf=0x3876a0, plog=0x3b5758, s=0x389598)
    at /usr/src/debug/httpd-2.2.11/server/mpm/prefork/prefork.c:966
#18 0x0033e92a in main (argc=2, argv=0xbffff6a4)
    at /usr/src/debug/httpd-2.2.11/server/main.c:740

Comment 2 RJB 2009-06-03 02:57:16 UTC
Any news on this bug?

This bug makes the ipa-server and fedora-ds-admin programs useless in Fedora 11, in addition to breaking web sites that use HTTPS with this module. You cannot enable the Apache mod_ssl instead because it conflicts with the two packages mentioned above. I have looked on the internet for possible solutions and have not found any. I have not found anyone else reporting similar crashes either, so it is possible that this is something specific to Fedora 11.

Comment 3 Rob Crittenden 2009-06-03 03:36:38 UTC
No progress yet. The stack is getting smashed at some point during initialization but I haven't been able to determine why yet. It always fails at the same point, in the call to RNG_RNGInit(), which is calling a static function. At some point the pointer to this static function is getting set to some arbitrary point in memory.

Comment 4 Rob Crittenden 2009-06-03 20:06:50 UTC
It seems that this may be related to the glibc use of NSS without NSPR. I was able to get my own build of Apache and NSS working when I didn't define FREEBL_NO_DEPEND.

Comment 5 Rob Crittenden 2009-06-04 20:23:09 UTC
I wonder if this is related to the changes made for glibc where it can use NSS hashing without requiring NSPR. It could be that some memory value is getting scrambled by the convoluted Apache load/unload module process and that is causing the crash.

Comment 6 Rob Crittenden 2009-06-05 02:01:55 UTC
Bob Relyea suggested commenting out the two calls to freebl_releaseLibrary() in mozilla/security/nss/lib/freebl/stubs.c (in the NSS source tree).

This lets the server start up again and I was able to serve up one simple client request. Looks like the problem is related to initializing the stub library used by glibc/libcrypto.

Not sure what the long-term fix for this is yet.

Comment 7 Bob Relyea 2009-06-05 18:55:32 UTC
This might be the long term fix.

The problem is that Apache loads, calls, and unloads modules. libfreebl is now used by libgcrypt, so it needs to run without NSPR. When mod_nss is loaded, libfreebl gains access to NSPR and nssutil (which it needs to run services other than just hashing). When mod_nss gets unloaded, so do NSPR and nssutil, but libfreebl still holds references to them. Commenting out the unload causes those libraries to stay in memory (since libfreebl now depends on them).

The upshot is that those references will appear as leaks in any leak detection.
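
One way to observe the effect described above is to check whether a library is still mapped into a running process. The sketch below reads /proc/PID/maps, so it is Linux-specific. The function name lib_is_mapped is hypothetical, and matching on a substring such as libnspr4 is an assumption about how the library path appears on a given system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any memory mapping of process $1 mentions the string $2.
# Relies on the Linux /proc filesystem; inspecting another user's process may need root.
lib_is_mapped() {
    # -F: literal match, -q: no output, only the exit status matters.
    grep -F -q -- "$2" "/proc/$1/maps" 2>/dev/null
}

# Example (assumes an httpd child PID in $pid):
#   if lib_is_mapped "$pid" libnspr4; then echo "NSPR still mapped"; fi
```

With the unload calls commented out, one would expect NSPR to remain mapped after mod_nss is unloaded. This is an expectation drawn from the explanation above, not a verified result.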

bob

Comment 8 Rob Crittenden 2009-06-05 19:07:41 UTC
Created attachment 346704 [details]
disable unloading the nspr libraries

Kai, this is what I used in my testing.

Comment 9 Rob Crittenden 2009-06-08 20:49:16 UTC
Re-assigning to Elio

Comment 10 Elio Maldonado Batiz 2009-06-09 14:34:27 UTC
Applied the upstream patch.
nss-3.12.3.99.3-2.11.2.fc11 tagged to dist-f11-updates-candidate. See http://koji.fedoraproject.org/koji/buildinfo?buildID=105342

Comment 11 Bug Zapper 2009-06-09 16:18:45 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 12 Rob Crittenden 2009-06-24 00:10:07 UTC
*** Bug 507425 has been marked as a duplicate of this bug. ***

Comment 13 Rob Crittenden 2009-06-29 15:09:41 UTC
Can this build be pushed out to the testing repo so people can more easily try it, or to stable? It's been a month and a half since this problem was first reported.

Comment 14 Thorsten Scherf 2009-07-04 19:42:41 UTC
*** Bug 509090 has been marked as a duplicate of this bug. ***

Comment 15 Mathias Nicolajsen Kjærgaard 2009-07-04 21:17:05 UTC
I just updated to nss-3.12.3.99.3-2.11.3.fc11.i586 from updates-testing, and I can confirm that the problem is solved on my system.

Comment 16 Rob Crittenden 2009-07-16 14:39:18 UTC
*** Bug 512148 has been marked as a duplicate of this bug. ***

Comment 17 Rob Crittenden 2009-08-13 14:38:14 UTC
What is the hold-up in getting this into updates? The problem has been fixed for 3 months now!?

Comment 18 Rob Crittenden 2009-08-13 15:03:53 UTC
NM, the bug was simply not closed yet.

Comment 19 Bernie Innocenti 2009-08-20 01:43:52 UTC
This patch appears to be the cause of a leak of file descriptors in slapd:

trinity:~# ll /proc/`pidof slapd`/fd  | grep libnspr4 | wc -l
21
...
trinity:~# ll /proc/`pidof slapd`/fd  | grep libnspr4 | wc -l
24
...
trinity:~# ll /proc/`pidof slapd`/fd  | grep libnspr4 | wc -l
29
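
The measurement above can be repeated without depending on the ll shell alias being defined. The sketch below counts the descriptors of a process whose link target contains a given string. The function name count_fds_matching is hypothetical, and the approach assumes a Linux /proc filesystem and permission to read the target process's fd directory.

```shell
#!/bin/sh
# Hypothetical helper: count open file descriptors of process $1
# whose target path contains the string $2.
count_fds_matching() {
    pid=$1
    pattern=$2
    count=0
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file; skip entries that cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *"$pattern"*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Example (mirrors the commands above):
#   count_fds_matching "$(pidof slapd)" libnspr4
```

Running it at intervals and comparing the numbers would show whether the count keeps growing, as in the three samples above.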

Comment 20 Franky Van Liedekerke 2009-10-15 21:45:49 UTC
I can confirm the leak in file descriptors in slapd. This only occurs after it runs for a while, but for me this results in needing to restart slapd at least twice a day ...

Franky

Comment 21 Bernie Innocenti 2009-10-16 05:30:00 UTC
Funny, I'm not seeing any reference to libnspr4 in /proc/nnn/fd any more.

I'm now running openldap-servers-2.4.15-6.fc11 (as of Oct 10) and nss-softokn-freebl-3.12.4-3.fc11 (Sep 28).

Nothing in the changelogs seems to indicate a specific fix.

Comment 22 Franky Van Liedekerke 2009-10-16 07:41:49 UTC
Well, I restarted slapd yesterday evening and there is no libnspr file descriptor to be found anywhere, but just before I restarted, it had that file open 800 times. I didn't do any update at all, so I think the lib is opened and closed all the time and the bug only gets triggered after some time. I'll check again this evening.
Also, the libnspr.so file is in the nspr rpm package. I don't know nss-softokn-freebl; is this an alternative?

Comment 23 Franky Van Liedekerke 2009-10-16 17:56:35 UTC
I can confirm the behaviour: after 14 hours, the slapd process has lots (344) of open file descriptors to /lib/libnspr4.so. It's stable for now, but I don't have a clue what causes the slapd to behave like this.

Comment 24 Bernie Innocenti 2009-10-16 19:29:50 UTC
Same here.

Is there a specific bug report on openldap? This bug was about httpd and is now closed. It's unlikely that someone is going to follow up.

Comment 25 Franky Van Liedekerke 2009-10-16 19:44:02 UTC
Not yet, I tried to debug some things in openldap but haven't gotten anything useful yet ... feel free to open one at openldap and point to this one :-)

Comment 26 Franky Van Liedekerke 2009-10-18 14:50:57 UTC
I opened openldap bug report 6336 for this

Franky

Comment 27 Bernie Innocenti 2009-10-19 07:34:25 UTC
(In reply to comment #26)
> I opened openldap bug report 6336 for this

Here's a link:

http://www.openldap.org/its/index.cgi/Incoming?id=6336;selectid=6336

(PS: OpenLDAP has the ugliest issue tracking system I've ever seen!!!)

Comment 28 Howard Chu 2009-11-11 00:06:06 UTC
(In reply to comment #26)
> I opened openldap bug report 6336 for this
> 
> Franky  

The OpenLDAP Project only investigates bugs in OpenLDAP code. Your bug is specific to the build provided (and patched) by Red Hat. We can't help you, sorry.