Bug 664671

Summary: Admin server segfault when full SSL access (http+ldap+console) required
Product: [Retired] 389
Reporter: Andrey Ivanov <andrey.ivanov>
Component: Admin
Assignee: Rich Megginson <rmeggins>
Status: CLOSED CURRENTRELEASE
QA Contact: Viktor Ashirov <vashirov>
Severity: medium
Priority: high
Version: 1.2.7
CC: 3+bugzilla, amsharma, jgalipea, luke+redhat, nkinder
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2015-12-07 17:11:07 UTC
Bug Blocks: 434915
Attachments:
Description | Flags
0003-Bug-664671-Admin-server-segfault-when-full-SSL-acces.patch | nhosoi: review+
adm.conf | none
console.conf | none
nss.conf | none
admin-serv directory listing | none
admin-serv certutil listing | none
admin-serv modutil output | none

Description Andrey Ivanov 2010-12-21 09:02:47 UTC

The latest version of the admin server (1.1.13) segfaults when full SSL access is configured (https + ldaps + secure console).

Reproducible: Always

Steps to Reproduce:
1. Enable ldaps, activate SSL in administration server (https), activate "Use SSL in Console". 
2. (re)start the administration server

Actual Results:  
error logs of the admin server:

[Tue Dec 21 09:49:54 2010] [notice] Access Host filter is: *.polytechnique.fr
[Tue Dec 21 09:49:54 2010] [notice] Access Address filter is: *
[Tue Dec 21 09:49:54 2010] [notice] Unable to shutdown NSS - still busy - assume mod_nss is holding references - continuing
[Tue Dec 21 09:49:55 2010] [notice] Apache/2.2 configured -- resuming normal operations
[Tue Dec 21 09:49:55 2010] [error] NSS_Initialize failed. Certificate database: /Local/dirsrv/etc/dirsrv/admin-serv.
[Tue Dec 21 09:49:55 2010] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Tue Dec 21 09:49:56 2010] [notice] child pid 4553 exit signal Segmentation fault (11)
[Tue Dec 21 09:49:57 2010] [error] NSS_Initialize failed. Certificate database: /Local/dirsrv/etc/dirsrv/admin-serv.
[Tue Dec 21 09:49:57 2010] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Tue Dec 21 09:49:58 2010] [notice] child pid 4557 exit signal Segmentation fault (11)
[Tue Dec 21 09:49:59 2010] [error] NSS_Initialize failed. Certificate database: /Local/dirsrv/etc/dirsrv/admin-serv.
[Tue Dec 21 09:49:59 2010] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Tue Dec 21 09:50:00 2010] [notice] child pid 4558 exit signal Segmentation fault (11)
[Tue Dec 21 09:50:01 2010] [error] NSS_Initialize failed. Certificate database: /Local/dirsrv/etc/dirsrv/admin-serv.
[Tue Dec 21 09:50:01 2010] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Tue Dec 21 09:50:02 2010] [notice] child pid 4559 exit signal Segmentation fault (11)

...

Expected Results:  
The admin server should not crash.
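
The crash loop in the log above (an NSS_Initialize failure followed by a child segfault, repeated every couple of seconds) can be counted mechanically. The following is a minimal Python sketch, assuming the error log uses exactly the message texts shown above; the function name `summarize_error_log` and the returned keys are illustrative, not part of any 389 tooling.

```python
import re

# Message texts are copied from the error log excerpt in this report.
SEGFAULT_RE = re.compile(r"child pid (\d+) exit signal Segmentation fault \(11\)")
NSS_FAIL_RE = re.compile(r"NSS_Initialize failed")


def summarize_error_log(text: str) -> dict:
    """Count NSS init failures and collect the PIDs of segfaulted children."""
    pids = [int(pid) for pid in SEGFAULT_RE.findall(text)]
    return {
        "nss_init_failures": len(NSS_FAIL_RE.findall(text)),
        "segfaulted_pids": pids,
    }


if __name__ == "__main__":
    import sys

    # Usage (path is whatever your installation uses):
    #   python summarize.py /path/to/admin-serv/error
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        print(summarize_error_log(fh.read()))
```

A non-empty `segfaulted_pids` list together with repeated NSS init failures matches the symptom reported here; NSS init failures without any segfaulted children would point to a different problem.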

The SSL activation procedure (certificate installation, pins, password.conf, nss.conf, etc.) is scripted, so the problem reproduces every time.

I compile the directory server, admin server, mod_nss and adminutil from source (CentOS 5.5 x86_64):
389-admin-1.1.13.tar.bz2
389-adminutil-1.1.13.tar.bz2
389-ds-base-1.2.7.5.tar.bz2
mod_nss-1.0.8.tar.gz

Previous versions did not exhibit this behavior (we use 1.1.11.rc2 on our production servers). I think it was introduced by the patch for Bug 618454 - mod_admserv should only clear NSS caches and shutdown if NSS is initialized

Comment 2 Nathan Kinder 2011-01-05 17:53:02 UTC
I do think that there is a problem related to NSS shutdown, but I believe that the repeated crashing is due to bug 638511.

Comment 3 Andrey Ivanov 2011-01-05 18:06:35 UTC
I don't think so. I always disable SELinux on our production and test servers because all the 389 files are placed in non-standard directories (something like /Local/dirsrv/etc, /Local/dirsrv/var/lib, etc.); we run configure with --prefix=/Local/dirsrv

Comment 4 Rich Megginson 2011-01-07 22:00:17 UTC
Created attachment 472303
0003-Bug-664671-Admin-server-segfault-when-full-SSL-acces.patch

Comment 5 Rich Megginson 2011-01-08 03:11:41 UTC
commit f08ab2ae5a9ce1ed7d5187f5e93a7e7854faacf3
Author: Rich Megginson <rmeggins>
Date:   Wed Jan 5 15:47:28 2011 -0700
    Fix Description: Do not call NSS_Shutdown in mod_admserv.  It should always
    be called in mod_nss, after mod_admserv_unload is called.  The only thing
    we need to do in mod_admserv_unload() is to clear the session cache to
    release any resources acquired by mod_admserv.  mod_nss unload will take
    care of the rest.
    Platforms tested: RHEL5 i386 RHEL6 x86_64
    Flag Day: no
    Doc impact: no
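
The ownership rule in the fix description can be pictured with a toy model. This is a Python sketch only, not the C code from the patch: `SharedNSS` is a hypothetical stand-in for the process-wide NSS state, and the two unload functions borrow the module names from the commit message purely for illustration.

```python
class SharedNSS:
    """Toy stand-in for process-wide NSS state (not the real library)."""

    def __init__(self):
        self.initialized = True
        self.events = []

    def clear_session_cache(self):
        # Releases session-cache resources; leaves the library initialized.
        self.events.append("clear_session_cache")

    def shutdown(self):
        # Must happen exactly once, by the single owner of the library.
        if not self.initialized:
            raise RuntimeError("shutdown called on uninitialized library")
        self.initialized = False
        self.events.append("shutdown")


def mod_admserv_unload(nss):
    # Per the fix description: only release what mod_admserv acquired.
    nss.clear_session_cache()


def mod_nss_unload(nss):
    # mod_nss owns shutdown and runs after mod_admserv_unload.
    nss.shutdown()


def unload_all(nss):
    """Unload both modules in the order the fix description requires."""
    mod_admserv_unload(nss)
    mod_nss_unload(nss)
    return nss.events
```

In this model the library is shut down exactly once and only by its owner; a second module that also called `shutdown()` would hit the `RuntimeError`, which is the kind of double-ownership the fix removes.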

Comment 6 Amita Sharma 2011-05-18 11:32:41 UTC
Tested with the steps below:
1. Enabled ldaps, activated SSL in the administration server (https), and activated "Use SSL in Console".
2. Restarted the administration server.

No crash found, hence marking the bug as verified.

Comment 7 Anthony Messina 2011-06-10 01:15:11 UTC
I continue to see this problem with: 389-admin-1.1.16-2.fc15.i686

[Thu Jun 09 20:13:18 2011] [notice] caught SIGTERM, shutting down
[Thu Jun 09 20:13:30 2011] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:httpd_t:s0
[Thu Jun 09 20:13:31 2011] [notice] Access Host filter is: *.elburn.messinet.com
[Thu Jun 09 20:13:31 2011] [notice] Access Address filter is: *
[Thu Jun 09 20:13:32 2011] [notice] Apache/2.2.17 (Unix) mod_nss/2.2.17 NSS/3.12.9.0 configured -- resuming normal operations
[Thu Jun 09 20:13:32 2011] [error] NSS_Initialize failed. Certificate database: /etc/dirsrv/admin-serv.
[Thu Jun 09 20:13:32 2011] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Thu Jun 09 20:13:32 2011] [error] Could not bind as []: ldap error -1: Can't contact LDAP server
[Thu Jun 09 20:13:32 2011] [error] Could not bind as []: ldap error -1: Can't contact LDAP server
[Thu Jun 09 20:13:32 2011] [warn] Unable to bind as LocalAdmin to populate LocalAdmin tasks into cache.
[Thu Jun 09 20:13:32 2011] [crit] sslinit: NSS is required to use LDAPS, but security initialization failed [-8128:security library: no security module can perform the requested operation.].  Cannot start server

Comment 8 Rich Megginson 2011-06-10 17:22:28 UTC
(In reply to comment #7)
> I continue to see this problem with: 389-admin-1.1.16-2.fc15.i686

This doesn't look like a crash/segfault.  Can you attach /etc/dirsrv/admin-serv/adm.conf, /etc/dirsrv/admin-serv/console.conf and /etc/dirsrv/admin-serv/nss.conf, along with the output of the following commands?

ls -al /etc/dirsrv/admin-serv

certutil -d /etc/dirsrv/admin-serv -L
modutil -dbdir /etc/dirsrv/admin-serv -list

Comment 9 Anthony Messina 2011-06-10 18:42:05 UTC
Created attachment 504184
adm.conf

Comment 10 Anthony Messina 2011-06-10 18:42:33 UTC
Created attachment 504185
console.conf

Comment 11 Anthony Messina 2011-06-10 18:42:55 UTC
Created attachment 504186
nss.conf

Comment 12 Anthony Messina 2011-06-10 18:43:20 UTC
Created attachment 504187
admin-serv directory listing

Comment 13 Anthony Messina 2011-06-10 18:43:42 UTC
Created attachment 504188
admin-serv certutil listing

Comment 14 Anthony Messina 2011-06-10 18:44:04 UTC
Created attachment 504189
admin-serv modutil output

Comment 15 Anthony Messina 2011-06-10 18:44:24 UTC
(In reply to comment #8)
> 
> This doesn't look like a crash/segfault.  Can you attach
> /etc/dirsrv/admin-serv/adm.conf /etc/dirsrv/admin-serv/console.conf
> /etc/dirsrv/admin-serv/nss.conf
> 
> ls -al /etc/dirsrv/admin-serv
> 
> certutil -d /etc/dirsrv/admin-serv -L
> modutil -dbdir /etc/dirsrv/admin-serv -list

Of course, you are right, Rich.  There is no segfault, but admin-serv will not connect to dirsrv when SSL is enabled.

I have since worked around the problem using ldapmodify:

dn: cn=slapd-elburn,cn=389 Directory Server,cn=Server Group,cn=elburn.messinet
 .com,ou=elburn.messinet.com,o=NetscapeRoot
changetype: modify
replace: nsServerSecurity
nsServerSecurity: off

and by reverting adm.conf from:
ldapurl: ldaps://elburn.messinet.com:636/o=NetscapeRoot

to
ldapurl: ldap://elburn.messinet.com:389/o=NetscapeRoot


In this way, I am able to connect to this remote admin-serv via https on the console, but keep the link between admin-serv and dirsrv without SSL (they are on the same machine).
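
The adm.conf half of this workaround can be sketched as a small text rewrite. This is a minimal Python sketch under stated assumptions: the function name `downgrade_ldapurl` and the default port 389 are illustrative, it only touches a line of the form `ldapurl: ldaps://host[:port]/suffix`, and it does not perform the nsServerSecurity change, which still needs the ldapmodify step shown above.

```python
import re

# Matches an adm.conf line such as:
#   ldapurl: ldaps://elburn.messinet.com:636/o=NetscapeRoot
LDAPS_URL_RE = re.compile(
    r"^(ldapurl:[ \t]*)ldaps://([^:/\s]+)(?::\d+)?(/.*)?$",
    re.MULTILINE,
)


def downgrade_ldapurl(conf_text: str, plain_port: int = 389) -> str:
    """Rewrite an ldaps:// ldapurl line to plain ldap:// on plain_port.

    Lines that do not match are returned unchanged.
    """

    def repl(match):
        prefix, host, suffix = match.group(1), match.group(2), match.group(3) or ""
        return f"{prefix}ldap://{host}:{plain_port}{suffix}"

    return LDAPS_URL_RE.sub(repl, conf_text)
```

Working on the file contents as a string keeps the sketch easy to check before anything is written back; back up adm.conf before replacing it, and restart the admin server afterwards.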

Comment 16 Rich Megginson 2011-06-10 19:03:24 UTC
Created bug https://bugzilla.redhat.com/show_bug.cgi?id=712491 to track the NSS issue.

Comment 17 John Paul Tate 2014-04-17 20:14:49 UTC
I am experiencing the symptoms of the original post, on RHEL 6, RHDS 9.1:

389-admin-1.1.34-1.el6.x86_64
389-admin-console-1.1.8-1.el6.noarch
389-admin-console-doc-1.1.8-1.el6.noarch
389-adminutil-1.1.17-1.el6.x86_64
389-console-1.1.7-1.el6.noarch
389-ds-base-1.2.11.15-31.el6_5.x86_64
389-ds-base-libs-1.2.11.15-31.el6_5.x86_64
389-ds-console-1.2.7-1.el6.noarch
389-ds-console-doc-1.2.7-1.el6.noarch
389-dsgw-1.1.11-1.el6.x86_64
redhat-ds-9.1.0-1.el6.x86_64
redhat-ds-admin-9.1.0-1.el6.x86_64
redhat-ds-base-9.1.0-1.el6dsrv.x86_64
redhat-ds-console-9.1.0-1.el6.noarch
redhat-ds-console-doc-9.1.0-1.el6.noarch

redhat-release-server-6Server-6.5.0.1.el6.x86_64

/var/log/dirsrv/admin-serv/error:

[Thu Apr 17 14:59:42 2014] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:httpd_t:s0
[Thu Apr 17 14:59:43 2014] [notice] Access Host filter is: <removed>
[Thu Apr 17 14:59:43 2014] [notice] Access Address filter is: *
[Thu Apr 17 14:59:44 2014] [notice] Apache/2.2.15 (Unix) mod_nss/2.2.15 NSS/3.15.1 Basic ECC configured -- resuming normal operations
[Thu Apr 17 14:59:44 2014] [error] NSS_Initialize failed. Certificate database: /etc/dirsrv/admin-serv.
[Thu Apr 17 14:59:44 2014] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Thu Apr 17 14:59:45 2014] [notice] child pid 16441 exit signal Segmentation fault (11)
[Thu Apr 17 14:59:46 2014] [error] NSS_Initialize failed. Certificate database: /etc/dirsrv/admin-serv.
[Thu Apr 17 14:59:46 2014] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Thu Apr 17 14:59:47 2014] [notice] child pid 16464 exit signal Segmentation fault (11)


When the server was initially set up with full SSL, the issue did not appear. It began when the server was shut down and then restarted a few days later:

[Sat Apr 12 02:48:48 2014] [notice] caught SIGTERM, shutting down
[Mon Apr 14 17:22:30 2014] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:httpd_t:s0
[Mon Apr 14 17:22:31 2014] [notice] Access Host filter is: <removed>
[Mon Apr 14 17:22:31 2014] [notice] Access Address filter is: *
[Mon Apr 14 17:22:32 2014] [notice] Apache/2.2.15 (Unix) mod_nss/2.2.15 NSS/3.15.1 Basic ECC configured -- resuming normal operations
[Mon Apr 14 17:22:32 2014] [error] NSS_Initialize failed. Certificate database: /etc/dirsrv/admin-serv.
[Mon Apr 14 17:22:32 2014] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED
[Mon Apr 14 17:22:33 2014] [notice] child pid 2628 exit signal Segmentation fault (11)
[Mon Apr 14 17:22:34 2014] [error] NSS_Initialize failed. Certificate database: /etc/dirsrv/admin-serv.
[Mon Apr 14 17:22:34 2014] [error] SSL Library Error: -8038 SEC_ERROR_NOT_INITIALIZED

Is there a workaround for this?

Comment 18 Nathan Kinder 2014-04-17 20:35:44 UTC
(In reply to John Paul Tate from comment #17)
> 
> Is there a workaround for this?

This problem is caused by a bug in NSS.  What version of the nss-softokn package do you have installed?  You should make sure that you are using the package versions mentioned in this errata:

https://rhn.redhat.com/errata/RHBA-2014-0398.html

Comment 19 John Paul Tate 2014-04-17 20:45:02 UTC
I just learned that the servers were recently updated (on 2014-04-12), and nss-softokn went from 3.14.3-3.el6_4.x86_64 to 3.14.3-9.el6.x86_64. Apparently that is causing the problem, and I need to get them updated to 3.14.3-10.el6_5.x86_64.
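
A quick way to check a host against the two nss-softokn builds named in this thread is to compare the output of `rpm -q nss-softokn` with them. The sketch below is Python and deliberately does exact matching only, not real RPM version ordering; the function names and the `"unknown"` fallback are assumptions for illustration, and the two version strings are the ones quoted above.

```python
# Version-release strings taken from this thread.
KNOWN_BAD = "3.14.3-9.el6"
KNOWN_GOOD = "3.14.3-10.el6_5"
ARCHES = ("x86_64", "i686", "noarch")


def version_release(rpm_q_line: str, name: str = "nss-softokn") -> str:
    """Strip the package name and trailing architecture from `rpm -q` output."""
    line = rpm_q_line.strip()
    if line.startswith(name + "-"):
        line = line[len(name) + 1:]
    base, _, arch = line.rpartition(".")
    return base if arch in ARCHES else line


def softokn_status(rpm_q_line: str) -> str:
    """Classify a build as 'affected', 'fixed', or 'unknown' by exact match."""
    vr = version_release(rpm_q_line)
    if vr == KNOWN_BAD:
        return "affected"
    if vr == KNOWN_GOOD:
        return "fixed"
    return "unknown"
```

Anything reported as `"unknown"` should be checked against the package versions listed in the errata rather than guessed at.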