Bug 502133
Summary: | httpd crashes when mod_nss is enabled | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | RJB <rogelio.javier>
Component: | nss | Assignee: | Elio Maldonado Batiz <emaldona>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | urgent | Priority: | low
Version: | 11 | CC: | bernie+fedora, fabian.lema, hyc, kengert, liedekef, mnk, rcritten, rrelyea, sgallagh, tscherf
Hardware: | i586 | OS: | Linux
Doc Type: | Bug Fix | Last Closed: | 2009-08-13 15:03:53 UTC
Description
RJB
2009-05-22 03:28:24 UTC
It is failing in different places each time I get a stack so it must be some sort of memory corruption. The common theme is that it always crashes right after RNG_RNGInit(). I've seen crashes in mod_perl, libdbus, nss and nspr.

#0 PKIX_DoAddError (stdVars=0x0, error=0xb7a08f70, plContext=0xb69e065c) at pkix_errpaths.c:104
#1 0xb7a08f8d in PKIX_ValidateChain_NB (valParams=0x2e7ac8, pCertIndex=0x2b1750, pAnchorIndex=0xbfffdf38, pCheckerIndex=0xb69d85df, pRevChecking=0x2b0639, pCheckers=0xb69e065c, pNBIOContext=0xbfffde08, pResult=0xb69d4402, pVerifyTree=0xb69d43e9, plContext=0xb69e065c) at pkix_validate.c:1398
#2 0x002b065a in RNG_RNGInit () at drbg.c:462
#3 0xb69d4402 in RNG_RNGInit () at loader.c:834
#4 0xb69b6a52 in nsc_CommonInitialize (pReserved=0xbfffdef4, isFIPS=0) at pkcs11.c:2580
#5 0xb69b6d1d in NSC_Initialize (pReserved=0xbfffdef4) at pkcs11.c:2710
#6 0xb79a1011 in secmod_ModuleInit (mod=0xb562ba80, alreadyLoaded=0xbfffdf58) at pk11load.c:146
#7 0xb79a1616 in SECMOD_LoadPKCS11Module (mod=0xb562ba80) at pk11load.c:378
#8 0xb79b49eb in SECMOD_LoadModule (modulespec=0xb562c778 " name=\"NSS Internal PKCS #11 Module\" parameters=\"configdir='/etc/httpd/alias' certPrefix='' keyPrefix='' secmod='secmod.db' flags=readOnly,passwordRequired updatedir='' updateCertPrefix='' updateKeyPr"..., parent=0xb5619320, recurse=1) at pk11pars.c:323
#9 0xb79b4b73 in SECMOD_LoadModule (modulespec=0xb56186d8 "name=\"NSS Internal Module\" parameters=\"configdir='/etc/httpd/alias' certPrefix='' keyPrefix='' secmod='secmod.db' flags=readOnly,passwordRequired updatedir='' updateCertPrefix='' updateKeyPrefix='' up"..., parent=0x0, recurse=1) at pk11pars.c:338
#10 0xb7981bd5 in nss_Init (configdir=0xb76ba690 "/etc/httpd/alias", certPrefix=<value optimized out>, keyPrefix=0x0, secmodName=0x9f86f2 "secmod.db", updateDir=0xb7a668cf "", updCertPrefix=0xb7a668cf "", updKeyPrefix=0xb7a668cf "", updateID=0xb7a668cf "", updateName=0xb7a668cf "", readOnly=1, noCertDB=0, noModDB=0, forceOpen=0, noRootInit=0, optimizeSpace=0, noSingleThreadedModules=0, allowAlreadyInitializedModules=0, dontFinalizeModules=0) at nssinit.c:536
#11 0xb79821fd in NSS_Initialize (configdir=0xb76ba690 "/etc/httpd/alias", certPrefix=0x0, keyPrefix=0x0, secmodName=0x9f86f2 "secmod.db", flags=1) at nssinit.c:653
#12 0x009eb693 in nss_init_SSLLibrary (base_server=<value optimized out>) at nss_engine_init.c:198
#13 0x009ed747 in nss_init_Child (p=0xb561cb20, base_server=0x389598) at nss_engine_init.c:1131
#14 0x00353ef2 in ap_run_child_init (pchild=0xb561cb20, s=0x389598) at /usr/src/debug/httpd-2.2.11/server/config.c:155
#15 0x00369b28 in child_main (child_num_arg=<value optimized out>) at /usr/src/debug/httpd-2.2.11/server/mpm/prefork/prefork.c:513
#16 0x0036a0a9 in make_child (s=<value optimized out>, slot=0) at /usr/src/debug/httpd-2.2.11/server/mpm/prefork/prefork.c:690
#17 0x0036a82f in ap_mpm_run (_pconf=0x3876a0, plog=0x3b5758, s=0x389598) at /usr/src/debug/httpd-2.2.11/server/mpm/prefork/prefork.c:966
#18 0x0033e92a in main (argc=2, argv=0xbffff6a4) at /usr/src/debug/httpd-2.2.11/server/main.c:740

Any news on this bug? This bug makes the ipa-server and fedora-ds-admin programs useless in Fedora 11, in addition to web sites using HTTPS with this module. You cannot enable the Apache mod_ssl because it conflicts with the two packages mentioned above. I have looked on the internet for possible solutions and have not found any. (I have not found anyone else reporting similar crashes either, so it is possible that this is something specific to Fedora 11.)

No progress yet. The stack is getting smashed at some point during initialization but I haven't been able to determine why yet. It always fails at the same point, in the call to RNG_RNGInit(), which is calling a static function. At some point the pointer to this static function is getting set to some arbitrary point in memory. It seems that this may be related to the glibc use of NSS without NSPR.

I was able to get my own build of Apache and NSS working when I didn't define FREEBL_NO_DEPEND. I wonder if this is related to the changes made for glibc where it can use NSS hashing without requiring NSPR. It could be that some memory value is getting scrambled by the convoluted Apache load/unload module process and that is causing the crash.

Bob Relyea suggested commenting out the two calls to freebl_releaseLibrary() in mozilla/security/nss/lib/freebl/stubs.c (in the NSS source tree). This lets the server start up again and I was able to serve up one simple client request. Looks like the problem is related to initializing the stub library used by glibc/libcrypto. Not sure what the long-term fix for this is yet.

This might be the long-term fix. The problem is that Apache is loading, calling, and unloading modules. libfreebl is now used by libgcrypt, so it needs to run without NSPR. When mod_nss is loaded, libfreebl has access to NSPR and nssutil (which it needs to run services other than just hashing). When mod_nss gets unloaded, so are NSPR and nssutil, but libfreebl still has references to them. Commenting out the unload causes those libraries to stay in memory (since libfreebl now depends on them). The upshot is that those references will appear as leaks in any leak detection.

bob

Created attachment 346704
disable unloading the nspr libraries
Kai, this is what I used in my testing.
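
For readers following Bob's explanation above, here is a rough toy model of the load/unload sequence he describes. It is only a sketch: it simulates dlopen()/dlclose()-style reference counting with a dictionary, and `ToyLoader`, `simulate`, and `release_in_stub` are hypothetical names invented for illustration, not anything from NSS, NSPR, or Apache. The `release_in_stub` flag stands in for whether the stub code drops its own reference (the behavior the attached patch disables).

```python
# Toy model only: real dlopen()/dlclose() reference counting is simulated with
# a dict. None of these names come from NSS, NSPR, or Apache.


class ToyLoader:
    """Tracks a reference count per library, roughly as dlopen()/dlclose() do."""

    def __init__(self):
        self.refcounts = {}

    def open(self, name):
        self.refcounts[name] = self.refcounts.get(name, 0) + 1

    def close(self, name):
        self.refcounts[name] -= 1
        if self.refcounts[name] == 0:
            del self.refcounts[name]  # the library is unmapped at this point

    def is_mapped(self, name):
        return name in self.refcounts


def simulate(release_in_stub):
    """Return True if libnspr4 is still mapped when freebl next calls into it."""
    loader = ToyLoader()

    # mod_nss is loaded and pulls in NSPR.
    loader.open("libnspr4")

    # freebl's stub layer finds NSPR and caches pointers into it...
    loader.open("libnspr4")
    cached_pointer_target = "libnspr4"
    # ...and then either drops its own reference or keeps it.
    if release_in_stub:
        loader.close("libnspr4")

    # Apache unloads mod_nss, which drops mod_nss's reference to NSPR.
    loader.close("libnspr4")

    # freebl stays resident and later calls through its cached pointers.
    return loader.is_mapped(cached_pointer_target)
```

In this model `simulate(True)` leaves the cached pointers aimed at an unmapped library, while `simulate(False)` keeps the library mapped at the cost of a reference that is never released, which is consistent with the leak reports Bob anticipates.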
Re-assigning to Elio

Applied the upstream patch. nss-3.12.3.99.3-2.11.2.fc11 Tag to dist-f11-updates-candidate. See http://koji.fedoraproject.org/koji/buildinfo?buildID=105342

This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

*** Bug 507425 has been marked as a duplicate of this bug. ***

Can this build be pushed out to the testing repo so people can more easily try it, or to stable? It's been a month and a half since this problem was first reported.

*** Bug 509090 has been marked as a duplicate of this bug. ***

I just updated to nss-3.12.3.99.3-2.11.3.fc11.i586 from updates-testing, and I can confirm that the problem is solved on my system.

*** Bug 512148 has been marked as a duplicate of this bug. ***

What is the hold-up in getting this into updates? The problem has been fixed for 3 months now!?

NM, the bug was simply not closed yet.

This patch appears to be the cause of a leak of file descriptors in slapd:

trinity:~# ll /proc/`pidof slapd`/fd | grep libnspr4 | wc -l
21
...
trinity:~# ll /proc/`pidof slapd`/fd | grep libnspr4 | wc -l
24
...
trinity:~# ll /proc/`pidof slapd`/fd | grep libnspr4 | wc -l
29

I can confirm the leak of file descriptors in slapd. This only occurs after it runs for a while, but for me this results in needing to restart slapd at least twice a day ...

Franky

Funny, I'm not seeing any reference to libnspr4 in /proc/nnn/fd any more. I'm now running openldap-servers-2.4.15-6.fc11 (as of Oct 10) and nss-softokn-freebl-3.12.4-3.fc11 (Sep 28). Nothing in the changelogs seems to indicate a specific fix.

Well, I restarted slapd yesterday evening and there was no libnspr file descriptor to be found anywhere, but just before I restarted it had that file open 800 times. I didn't do any update at all, so I think the lib is opened and closed all the time and the bug only gets triggered after some time. I'll check again this evening. Also, the libnspr.so file is in the nspr rpm package. I don't know nss-softokn-freebl; is this an alternative?

I can confirm the behaviour: after 14 hours, the slapd process has lots (344) of open file descriptors to /lib/libnspr4.so. It's stable for now, but I don't have a clue what causes slapd to behave like this.

Same here. Is there a specific bug report on openldap? This bug was about httpd and is now closed. It's unlikely that someone is going to follow up.

Not yet. I tried to debug some things in openldap but haven't gotten anything useful yet ... feel free to open one at openldap and point to this one :-)

I opened openldap bug report 6336 for this

Franky

(In reply to comment #26)
> I opened openldap bug report 6336 for this

Here's a link: http://www.openldap.org/its/index.cgi/Incoming?id=6336;selectid=6336

(PS: OpenLDAP has the ugliest issue tracking system I've ever seen!!!)

(In reply to comment #26)
> I opened openldap bug report 6336 for this
>
> Franky

The OpenLDAP Project only investigates bugs in OpenLDAP code. Your bug is specific to the build provided (and patched) by Red Hat. We can't help you, sorry.
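
For anyone who wants to track the slapd descriptor leak discussed above over time, the `ll /proc/.../fd | grep libnspr4 | wc -l` check can be approximated in a script. This is a minimal sketch assuming a Linux /proc filesystem; `count_open_fds` is a hypothetical helper name, and obtaining the slapd PID is left to the caller.

```python
import os


def count_open_fds(pid, needle):
    """Count descriptors of `pid` whose /proc/<pid>/fd link target contains `needle`."""
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for entry in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, entry))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if needle in target:
            count += 1
    return count
```

Calling `count_open_fds(pid, "libnspr4")` at intervals and logging the result should show whether the count keeps growing. Reading another user's /proc/<pid>/fd entries typically requires root, as in the shell sessions quoted above.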