Description of problem:
The Admin Server dies after you try to bind it to the Directory Server via a secure connection.

Version-Release number of selected component (if applicable):
Fedora Admin Server 1.0 Build 2005.333.229

How reproducible:
Both the Fedora Admin Server and the Directory Server are SSL enabled, i.e. for the Directory Server both LDAP and LDAPS are enabled:

[root@naruto ds]# tail errors
[root@naruto tmp]# tail /tmp/bugzilla/working/ds/errors
Fedora-Directory/1.0 B2005.333.229
naruto.csse.uwa.edu.au:636 (/opt/fedora-ds/slapd-naruto)
[06/Dec/2005:17:15:08 +0800] - Fedora-Directory/1.0 B2005.333.229 starting up
[06/Dec/2005:17:15:10 +0800] - slapd started. Listening on All Interfaces port 389 for LDAP requests
[06/Dec/2005:17:15:10 +0800] - Listening on All Interfaces port 636 for LDAPS requests

I can connect via the console (i.e. startconsole) to the Admin Server securely (specifying the Administration URL as https://naruto.csse.uwa.edu.au:59612). I can query and insert into the Fedora Directory Server (slapd) with the typical LDAP commands, i.e. ldapsearch, ldapmodify, etc.

Currently the Admin Server is running securely, and the Directory Server is running both securely (port 636) and on the normal port (389). But the binding from the Admin Server to the Directory Server is on the non-secure port, i.e. 389.

Configuration->Configuration DS
And enable the secure connection, defaulting to 636.

Configuration DS->User DS
And rebind it using the secure connection, i.e. by rebinding it to machinename:636, enabling the secure connection, and specifying the Directory Subtree, Bind DN and Bind Password.
And click OK. The Directory Server has been modified, and it says "You must shutdown and restart the administration Server and all the servers in the ServerGroup for directory service changes to take effect". (Usually I found that if you put in invalid entries, e.g. a wrong password or a wrong port, it tells you, e.g. invalid Bind DN/Bind password, invalid LDAP Host/LDAP port, etc., before it will let you save.) Once it validates the certs, username and password it lets me save, and then I restart the services for the Admin Server (restart-admin) and the Directory Server (restart-slapd). slapd comes back up and is still functional; I can still query it via the LDAP command-line tools. But the Fedora Admin Server doesn't come up at all. It keeps bombing; it seems like it is having problems accessing the SSL keys. Looking at the error log for the DS, it's always these two lines, specifically the last line.

Steps to Reproduce:
1. Install the RHEL4 base configuration.
2. Install IBM JDK 1.4.2 (rpm -ivh IBMJava2-142-ia32-SDK-1.4.2-3.0.i386.rpm).
3. Export the Java variables to the shell, i.e. export JAVA_HOME=/opt/IBMJava2-142, export PATH=/opt/IBMJava2-142/bin:$PATH
4. Install the RPM for the appropriate distribution (rpm -ivh fedora-ds-1.0-2.RHEL4.i386.opt.rpm).
5. Run the setup (/opt/fedora-ds/setup/setup) and set up using the defaults.
6. Make sure the server and client are working by starting them up and seeing if you can connect (start-slapd, start-admin, then run startconsole to connect to the Admin Server and from there to the Directory Server).
7. Generate the default certificate DB files for the Admin Server and the Directory Server (i.e. via Console->Manager Security Certificate).
8. Generate self-signed certificates for the Admin Server and the Directory Server, converted to p12 format so I can use /opt/fedora/shared/bin/pk12util to insert them into the default certificate DB files in the alias directory (i.e. in /opt/fedora-ds/alias there should be a pair of files for the Admin Server, admin-serv-hostname-cert8.db and admin-serv-hostname-key3.db, and similarly for the Directory Server, i.e. slapd-hostname-cert8.db and slapd-hostname-key3.db).
9. Insert the p12 certificates using pk12util for the Directory Server and the Admin Server, which should populate the files mentioned in step 8.
10. Now run startconsole and check that the certificates are valid for the Directory Server and the Admin Server. Again you click on Console->Manager Security Certificate for both the DS and the Admin Server and import the CA certs to make the corresponding server certs valid. You can check whether the certs are valid by clicking the server certs and clicking Detail, which should say something like "certificate valid for SSL server certificate and SSL client certificate" and not say broken or invalid.
11. Enable encryption for the Admin Server by clicking Configuration->Encryption->Enable SSL for this Server, enable Use Cipher Family (RSA), and pick the certificate generated for the Admin Server.
12. Enable encryption for the Directory Server by clicking Configuration->Encryption->Enable SSL for this Server, enable Use Cipher Family (RSA), pick the certificate generated for the Directory Server, and enable SSL in the Fedora console.
13. Now restart the services for the Admin Server and the Directory Server, i.e. restart-admin / restart-slapd. Because you imported the certificates and enabled SSL, the scripts should ask you for the password to access the certificates. The Admin Server now binds via https, so from startconsole you may have to specify https instead. For the Directory Server, if you look in the error log as in the message above, it says it is listening on ports 636 and 389.
14. Now I test whether I can access the Admin Server and the Directory Server, i.e. I can search the Directory Server with the current bind to port 389, e.g. locate a user. I can also use the LDAP command-line tools securely, e.g. ldapadd.
15. Now as soon as I rebind the Admin Server to the Directory Server, i.e. go into the Admin Server and do the following as I mentioned above:

Configuration->Configuration DS
And enable the secure connection, defaulting to 636.

Configuration DS->User DS
And rebind it using the secure connection, i.e. by rebinding it to machinename:636, enabling the secure connection, and specifying the Directory Subtree, Bind DN and Bind Password.

"You must shutdown and restart the administration Server and all the servers in the ServerGroup for directory service changes to take effect"

Then you restart the services for the Admin Server and the Directory Server. The Directory Server (slapd) comes back up but the Admin Server refuses to come up. It seems like it is hitting a problem accessing the certificates, but this is not the result of a wrong password for the certificates.

[Wed Dec 07 14:43:59 2005] [notice] Access Address filter is: *
[Wed Dec 07 14:44:01 2005] [error] Unable to read from pin store for slot: NSS Certificate DB APR err: 0

As soon as I switch to secure binding between the Admin Server to the Directory Server. Ie on the Administrator Server I specify in the

Actual results:
[Wed Dec 07 15:59:04 2005] [notice] Access Address filter is: *
[Wed Dec 07 15:59:07 2005] [error] Unable to read from pin store for slot: NSS Certificate DB APR err: 0

Expected results:
[Tue Dec 06 17:15:33 2005] [notice] Access Address filter is: *
[Tue Dec 06 17:15:33 2005] [notice] Apache/2.0 configured -- resuming normal operations

Additional info:
I've had it working with Fedora Directory Server 7.1 with RHEL4.
Rob, I think we ran into this one before. Do you remember what the problem was?
Oh, wait - permissions? Can you provide the output of ls -l /opt/fedora-ds/alias ? Also, what is the uid that you are using to run the admin server and the directory server?
And verify that nss_pcache got started.
Here are the permissions of /opt/fedora-ds/alias:

[root@naruto alias]# pwd
/opt/fedora-ds/alias
[root@naruto alias]# ls -al
total 404
drwxr-xr-x  2 nobody nobody   4096 Dec  6 14:52 .
drwxr-xr-x 17 root   root     4096 Dec  6 14:50 ..
-rw-------  1 nobody nobody  65536 Dec  6 14:57 admin-serv-naruto-cert8.db
-rw-------  1 nobody nobody  24576 Dec  6 14:57 admin-serv-naruto-key3.db
-rwxr-xr-x  1 root   nobody 194880 Nov 30 06:06 libnssckbi.so
-rw-------  1 nobody nobody  16384 Dec  8 09:38 secmod.db
-rw-------  1 nobody nobody  65536 Dec  6 14:59 slapd-naruto-cert8.db
-rw-------  1 nobody nobody  24576 Dec  6 14:58 slapd-naruto-key3.db

Permissions are basically set to rw for nobody. Now I've looked at the process owner for the Fedora Directory Server. In the installation script I chose the defaults, i.e.

Server User ID: nobody
Server Group ID: nobody
Administration Server: root

If I look at the processes before I switched to binding securely between the Directory Server and the Admin Server, I see:

[root@naruto slapd-naruto]# ps -aux | grep -i slap
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.3/FAQ
nobody 4240 5.8 4.6 466088 23044 ? Sl 09:51 0:02 ./ns-slapd -D /opt/fedora-ds/slapd-naruto -i /opt/fedora-ds/slapd-naruto/logs/pid -w /opt/fedora-ds/slapd-naruto/logs/startpid
root 4318 0.0 0.1 4452 664 pts/2 R+ 09:51 0:00 grep -i slap

slapd is running as nobody. Now the Fedora Admin Server, before turning on the secure bind between the Admin Server and the DS:

[root@naruto fedora-ds]# ps -aux | grep -i admin
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.3/FAQ
root 4334 0.0 0.3 4772 1688 pts/2 S 09:53 0:00 /opt/fedora-ds/bin/admin/admin/bin/nss_pcache off /opt/fedora-ds/alias admin-serv-naruto-
root 4338 0.8 2.1 36828 10800 ? Ssl 09:53 0:00 /usr/sbin//httpd.worker -k start -d /opt/fedora-ds/admin-serv -f /opt/fedora-ds/admin-serv/config/httpd.conf
root 4339 0.0 1.2 32316 6212 ? S 09:53 0:00 /usr/sbin//httpd.worker -k start -d /opt/fedora-ds/admin-serv -f /opt/fedora-ds/admin-serv/config/httpd.conf
nobody 4342 0.0 2.3 705096 11588 ? Sl 09:53 0:00 /usr/sbin//httpd.worker -k start -d /opt/fedora-ds/admin-serv -f /opt/fedora-ds/admin-serv/config/httpd.conf
root 4411 0.0 0.1 3916 688 pts/2 R+ 09:54 0:00 grep -i admin

It is running as root and nobody, which sounds about right.

I had already thought about the permissions on the alias directory. For test purposes I just ran chmod -R 777 /opt/fedora-ds/alias (anyone has read/write/execute access now). Then I rebound the Admin Server securely to the Directory Server as described in my initial report, and I get pretty much the same thing, i.e. this error:

[error] Unable to read from pin store for slot: NSS Certificate DB APR err: 0

With regard to nss_pcache, that's meant to be started automatically when you run the start-admin or restart-admin script. Without the secure binding between the Admin Server and the Directory Server, nss_pcache is running, as you can see from my process list above. But as soon as I turn on the secure binding between the two, none of the processes I should see after running the admin script (i.e. restart-admin or start-admin), including nss_pcache, show up.
When you start-admin in secure mode, does it prompt you for your pin on the command line? I think the answer is yes, inferring from your comments above. Try this: edit start-admin, and make httpd run with strace e.g. strace -o /tmp/httpd.out -f $HTTPD -k start -d $ADMSERV_ROOT -f $ADMSERV_ROOT/config/httpd.conf "$@" as the last line in start-admin. Once you have the file /tmp/httpd.out, please edit it to remove any sensitive information like passwords, then attach it to this bug. Thanks!
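For the redaction step, something along these lines might save some manual editing. It is only a sketch: it assumes the pin reaches nss_pcache in a STOR command of the form STOR, tab, token name, tab, pin, and that strace prints the tab as the two characters backslash and t (its default string escaping). The function name redact_strace is made up for this example.

```shell
#!/bin/sh
# Mask the pin in nss_pcache STOR commands captured by strace.
# Assumption: strace shows the write as "STOR\t<token>\t<pin>\n", with each
# tab printed as a literal backslash followed by t.
redact_strace() {
    # $1: strace output file; the redacted copy is written to stdout.
    # The pin is taken to run up to the next backslash or double quote.
    sed -E 's/(STOR\\t[^\\]*\\t)[^\\"]*/\1XXXXXXXX/g' "$1"
}

# Usage (paths are examples):
#   redact_strace /tmp/httpd.out > /tmp/httpd.redacted.out
```

A pin that itself contains a backslash or a double quote would only be masked up to that character, so it is still worth reading through the result before attaching it.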
Created attachment 122023 [details] strace file from start-admin
I forgot to add: yes, you are correct that it asks me for the password to access the certs for both the Admin Server and the Directory Server. This happens automatically when you run the restart/start scripts for the Admin Server / Directory Server once you have created a certificate database (key3.db/cert8.db) for the respective server, even if you have not yet enabled encryption, if I remember correctly.
Thanks. Rob, if you look at the strace output, the command to pcache is
STOR\tNSS Certificate DB\tXXXXXXXX
and the error returned is 3, PIN_NOSUCHTOKEN. Is "NSS Certificate DB" the correct token name?
I saw that too. That is the default token name for an NSS database. It should have been overridden to "internal". I'm not sure how this could have happened. It might be useful to see the file admin-serv/config/console.conf and nss.conf to see how SSL is configured.
Created attachment 122053 [details] Admin Serv Conf file after secure binding between Admin and DS
Created attachment 122054 [details] NSS conf file after DS and ADMIN are rebound securely
Thanks for those conf files. We need one more - the console.conf file.
Created attachment 122055 [details] Console Configuration File, from Admin Server
It should be attached now. The configurations are all from the DS / Admin Server after the attempt to bind securely between the DS and the Admin Server. I've not directly modified the files, so they should be the default configuration files generated.
That's really odd. In the strace output, it's using "NSS Certificate DB" as the token name, but in your console.conf, it has NSSNickname Server-Cert-ADMIN as the token name. Try this:

cd /opt/fedora-ds/alias
../shared/bin/certutil -L -d `pwd` -P admin-serv-naruto-

What do you see? You should see a certificate named "Server-Cert-ADMIN".
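The same comparison can be scripted. This is a rough sketch, not something from the admin server itself: it assumes console.conf carries a single NSSNickname line and that each certificate line of the certutil -L listing is the nickname followed by its trust flags. The two function names are invented for the example.

```shell
#!/bin/sh
# Print the first NSSNickname value found in a mod_nss style config file.
configured_nickname() {
    sed -n 's/^[[:space:]]*NSSNickname[[:space:]][[:space:]]*//p' "$1" |
        head -n 1 | tr -d '"' | sed 's/[[:space:]]*$//'
}

# Read `certutil -L` output on stdin and succeed if a certificate with
# nickname $1 is listed. Each line is taken to be "<nickname> <trust flags>",
# so the last whitespace-separated field is dropped before comparing.
nickname_listed() {
    sed 's/[[:space:]][[:space:]]*[^[:space:]]*[[:space:]]*$//' |
        grep -Fqx -- "$1"
}

# Usage (run from /opt/fedora-ds/alias; the conf path is an example):
#   nick=$(configured_nickname ../admin-serv/config/console.conf)
#   ../shared/bin/certutil -L -d "$PWD" -P admin-serv-naruto- |
#       nickname_listed "$nick" && echo "found $nick"
```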
Yes, that's correct, I do see Server-Cert-ADMIN, which is the Admin Server cert for my local machine naruto:

[root@naruto alias]# pwd
/opt/fedora-ds/alias
[root@naruto alias]# ../shared/bin/certutil -L -d `pwd` -P admin-serv-naruto-
Server-Cert-DS                 u,u,u
Server-Cert-ADMIN              u,u,u
naruto.csse.uwa.edu.au         CT,,
naruto.csse.uwa.edu.au #2      CT,,
[root@naruto alias]#
Ok, how about

../shared/bin/modutil -list -dbdir `pwd` -dbprefix admin-serv-naruto-

? I get this. Note that there is a token named "NSS Certificate DB" but the NSS code should alias this to "internal" which is what I think the code should be using.

Listing of PKCS #11 Modules
-----------------------------------------------------------
  1. NSS Internal PKCS #11 Module
         slots: 2 slots attached
        status: loaded

         slot: NSS Internal Cryptographic Services
        token: NSS Generic Crypto Services

         slot: NSS User Private Key and Certificate Services
        token: NSS Certificate DB

  2. Root Certs
        library name: /opt/fedora-ds/alias/libnssckbi.so
         slots: 1 slot attached
        status: loaded

         slot:
        token: Builtin Object Token
Yep, sure, here it is. It looks exactly the same as above.

[root@naruto alias]# date
Fri Dec 9 13:36:34 WST 2005
[root@naruto alias]# pwd
/opt/fedora-ds/alias
[root@naruto alias]# ../shared/bin/modutil -list -dbdir `pwd` -dbprefix admin-serv-naruto-
Using database directory /opt/fedora-ds/alias...

Listing of PKCS #11 Modules
-----------------------------------------------------------
  1. NSS Internal PKCS #11 Module
         slots: 2 slots attached
        status: loaded

         slot: NSS Internal Cryptographic Services
        token: NSS Generic Crypto Services

         slot: NSS User Private Key and Certificate Services
        token: NSS Certificate DB

  2. Root Certs
        library name: /opt/fedora-ds/alias/libnssckbi.so
         slots: 1 slot attached
        status: loaded

         slot:
        token: Builtin Object Token
-----------------------------------------------------------
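To pull just the token names out of a listing like this (for example to compare them with the name in the pin prompt), a small filter may be enough. It is a sketch that assumes every token is printed on its own "token: <name>" line, as in the output above; list_tokens is a made-up name.

```shell
#!/bin/sh
# Print every token name from `modutil -list` output read on stdin.
# Assumption: each token appears on its own line as "token: <name>".
list_tokens() {
    sed -n 's/^[[:space:]]*token:[[:space:]]*//p'
}

# Usage (run from /opt/fedora-ds/alias):
#   ../shared/bin/modutil -list -dbdir "$PWD" -dbprefix admin-serv-naruto- |
#       list_tokens
```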
Thanks for doing all of this. Unfortunately, we're baffled. There are two parts to this: mod_nss and nss_pcache. nss_pcache is a separate program that caches the pin that you enter on the console when admin server starts up, because due to the way Apache does module initialization, it may need this several times, and you probably don't want to enter it several times. In both places, the module is renamed from "NSS Certificate DB" to "internal" in memory for convenience. However, it appears in your case that in mod_nss, the rename doesn't happen, but in nss_pcache, it does. We can see that because when start-admin prompts you for the password, it's using "NSS Certificate DB" in the prompt rather than "internal".

There is a workaround. You can create a pin file for your password. You would need to do this anyway if you wanted remote or unattended restarts of admin server, when there is no console to type the password into.

1) Create a file admin-serv/config/password.conf. This file should contain the following line:
NSS Certificate DB:yourpassword
where yourpassword is your password.
2) Make sure the file is owned by nobody:nobody and is mode 0400.
3) Edit admin-serv/config/nss.conf. Change the line
NSSPassPhraseDialog builtin
to
NSSPassPhraseDialog file:/opt/fedora-ds/admin-serv/conf/password.conf

Then restart-admin.
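The three steps could be scripted roughly as follows. This is a sketch under assumptions, not part of the admin server scripts: write_pin_file and its arguments are invented for the example, the pin file is placed in the same config directory as nss.conf (as in step 1), and nss.conf is assumed to contain exactly one NSSPassPhraseDialog line.

```shell
#!/bin/sh
# Sketch of the pin-file workaround. All arguments come from the caller:
#   $1 config directory (e.g. /opt/fedora-ds/admin-serv/config)
#   $2 token name       (e.g. "NSS Certificate DB")
#   $3 pin
#   $4 owner            (e.g. nobody:nobody)
write_pin_file() {
    confdir=$1 token=$2 pin=$3 owner=$4
    pinfile=$confdir/password.conf

    # Step 1: write "<token>:<pin>"; umask keeps it private from the start.
    rm -f "$pinfile"
    ( umask 077 && printf '%s:%s\n' "$token" "$pin" > "$pinfile" ) || return 1

    # Step 2: set ownership and mode 0400.
    chown "$owner" "$pinfile" || return 1
    chmod 0400 "$pinfile" || return 1

    # Step 3: point NSSPassPhraseDialog at the pin file. The result is
    # written back with cat so nss.conf keeps its own owner and mode.
    tmp=$(mktemp) || return 1
    sed "s|^NSSPassPhraseDialog[[:space:]].*|NSSPassPhraseDialog file:$pinfile|" \
        "$confdir/nss.conf" > "$tmp" &&
        cat "$tmp" > "$confdir/nss.conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (as root, then run restart-admin):
#   write_pin_file /opt/fedora-ds/admin-serv/config \
#       "NSS Certificate DB" yourpassword nobody:nobody
```

Passing the pin as an argument makes it visible in the process list for a moment, so on a shared machine it may be better to read it from a prompt instead.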
I gave that a shot, and it fails as well with the same result. (Step 3 has a small typo: it should be /opt/fedora-ds/admin-serv/config/password.conf to be consistent with step 1, just in case anyone is following this thread.)

[Mon Dec 12 10:03:33 2005] [notice] Access Host filter is: *.csse.uwa.edu.au
[Mon Dec 12 10:03:33 2005] [notice] Access Address filter is: *
[Mon Dec 12 10:03:33 2005] [error] Unable to read from pin store for slot: NSS Certificate DB APR err: 0

To test whether it was actually reading the changes, I substituted a dummy path, and yes, it is reading the changes you suggested, since with the dummy path it fails:

[root@naruto fedora-ds]# ./restart-admin
server not running
Syntax error on line 48 of /opt/fedora-ds/admin-serv/config/nss.conf:
NSSPassPhraseDialog: file '/opt/fedora-ds/admin-serv/config/password.conf.wrongplace' does not exist

Hmmm, I thought it would be a trivial thing; I guess not. It's just extremely weird that the console can talk to the Admin Server securely and the LDAP command-line tools can query the Directory Server securely, but as soon as you switch the binding to secure it fails. It doesn't really bother me, as I'm using the command-line LDAP tools for editing and changes, but some people who are not as fluent with the LDAP tools would use the Admin Server GUI to do the simple things.

Cheers then,
Ashley
We've tested the console with ssl many times - that's why your particular case is so odd. Does it work if you use "internal" instead of "NSS Certificate DB" as the token name in password.conf?
Sorry that I took so long, I had other things to do. Anyway, I made the changes:

[root@naruto config]# pwd
/opt/fedora-ds/admin-serv/config
[root@naruto config]# cat password.conf
internal:XXXXXXXX

I made the changes without the secure binding between the Admin Server and the Directory Server, although the Admin Server and the Directory Server are both running securely. I restarted the services for slapd and the Admin Server; slapd comes back up and the Admin Server comes up. Then I repeated the process, now changing the binding between the Admin Server and the Directory Server to secure. It does the same thing:

[Thu Dec 15 09:34:01 2005] [notice] caught SIGTERM, shutting down
[Thu Dec 15 09:34:07 2005] [notice] Access Host filter is: *.csse.uwa.edu.au
[Thu Dec 15 09:34:07 2005] [notice] Access Address filter is: *
[Thu Dec 15 09:34:09 2005] [error] Unable to read from pin store for slot: NSS Certificate DB APR err: 0
We've finally been able to duplicate this problem in-house. To cause the problem, under the console open the Admin server, select the Configuration Directory tab and enable Security. When you restart the admin server the problem occurs. It seems to be related to the SSL work being done in mod_admserv. It has its own NSS_Initialize() and since it gets loaded first in the config, that explains why the token wasn't being renamed. Switching the order of the modules in httpd.conf gets the token renamed, but mod_nss can't unload because something is holding open a reference, so NSS_Shutdown() fails with SEC_ERROR_BUSY. We're investigating.
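To see which of the two modules a given httpd.conf loads first, a check along these lines might help. It is only a sketch: first_loaded is an invented helper, and the module identifiers in the usage line (admserv_module, nss_module) are assumptions, so substitute whatever the LoadModule lines in your httpd.conf actually use.

```shell
#!/bin/sh
# Print whichever of two module identifiers is named by the earliest
# LoadModule directive in an Apache config file.
#   $1 config file, $2 and $3 module identifiers to compare
first_loaded() {
    awk -v a="$2" -v b="$3" '
        $1 == "LoadModule" && ($2 == a || $2 == b) { print $2; exit }
    ' "$1"
}

# Usage (identifiers are assumed, check your own LoadModule lines):
#   first_loaded /opt/fedora-ds/admin-serv/config/httpd.conf \
#       admserv_module nss_module
```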
I've found several memory leaks and resource deallocation problems in mod_admserv that may be causing NSS_Shutdown to fail. We're testing a solution.
Created attachment 123293 [details] diffs for mod_admserv.c This is only part of the solution, which will require changes to mod_nss as well as the ldap c sdk and admin server.
Created attachment 123294 [details] diffs for mod_admserv.c Whoops, wrong diffs
Created attachment 123299 [details] added some admldapInfo memory leak fixes
Checking in mod_admserv.c;
/cvs/dirsec/mod_admserv/mod_admserv.c,v  <--  mod_admserv.c
new revision: 1.19; previous revision: 1.18
done

Reviewed by: Rob C. (Thanks!)
Files: mod_admserv.c
Branch: HEAD
Fix Description: This fix makes the assumption that mod_nss will always be used. It is possible to use mod_admserv without mod_nss - this would mean that the admin server accepts http, but uses ldaps to communicate with the DS. However, I don't foresee that happening, so in order to simplify things, this fix makes mod_nss responsible for initializing NSS and shutting it down properly. Another problem was the memory and resource leaks. psets have to be disposed of after use. This appears to have been a problem in the old NES libAdmservPlugin as well, since most of the code was just copied/pasted. There were also a couple of other memory leaks. NOTE: This is only part of the total fix, which will involve changes to mod_nss, ldap sdk, and admin server components.
Platforms tested: FC4
Flag Day: no
Doc impact: no