This service will be undergoing maintenance at 00:00 UTC, 2017-10-23 It is expected to last about 30 minutes
Bug 1005446 - ipa-ca-install fails with CA not starting after 120 seconds.
ipa-ca-install fails with CA not starting after 120 seconds.
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ipa (Show other bugs)
7.0
Unspecified Unspecified
high Severity unspecified
: rc
: ---
Assigned To: Martin Kosek
Namita Soman
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-09-06 22:37 EDT by Michael Gregg
Modified: 2014-08-05 07:17 EDT (History)
7 users (show)

See Also:
Fixed In Version: ipa-3.3.2-2.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 06:15:21 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
ipa ca install log file (117.71 KB, text/x-log)
2013-09-06 22:37 EDT, Michael Gregg
no flags Details
client install log from replica (56.97 KB, text/x-log)
2013-09-06 22:38 EDT, Michael Gregg
no flags Details

  None (edit)
Description Michael Gregg 2013-09-06 22:37:29 EDT
Created attachment 795045 [details]
ipa ca install log file

Description of problem:
install is one master, and one replica

After the master is installed, attempting to run ipa-ca-install on replica fails.

Version-Release number of selected component (if applicable):
ipa-server-3.3.1-2.el7.x86_64

How reproducible:
always

Steps to Reproduce:
# On master:
1. /usr/sbin/ipa-server-install --setup-dns --forwarder=10.11.5.19 --hostname=qe-blade-04.testrelm.com -r TESTRELM.COM -n testrelm.com -p Secret123 -P Secret123 -a Secret123 -U
2. ipa-replica-prepare --ip-address=10.16.76.36 qe-blade-05.testrelm.com
3. copy gpg file to Replica
# On Replica
4. ipa-replica-install --setup-dns /tmp/replica-info-qe-blade-05.testrelm.com.gpg
5. ipa-ca-install -p Secret123 -w Secret123 --skip-conncheck --unattended /tmp/replica-info-qe-blade-05.testrelm.com.gpg

Actual results:
[root@qe-blade-05 ~]# ipa-ca-install -p Secret123 -w Secret123 --skip-conncheck --unattended /tmp/replica-info-qe-blade-05.testrelm.com.gpg 
Configuring certificate server (pki-tomcatd): Estimated time 3 minutes 30 seconds
  [1/18]: creating certificate server user
  [2/18]: configuring certificate server instance
  [3/18]: stopping certificate server instance to update CS.cfg
  [4/18]: disabling nonces
  [5/18]: set up CRL publishing
  [6/18]: starting certificate server instance

Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

CA did not start in 120s


Expected results:

Additional info:
Comment 1 Michael Gregg 2013-09-06 22:38:01 EDT
Created attachment 795046 [details]
client install log from replica
Comment 2 Michael Gregg 2013-09-06 22:40:41 EDT
As you can see, the ca install log ends with a python exception that looks like it was caused when the CA does in fact not return:

2013-09-07T02:29:06Z DEBUG request headers {'date': 'Sat, 07 Sep 2013 02:29:06 GMT', 'content-length': '219', 'content-type': 'text/html; charset=iso-8859-1', 'server': 'Apache/2.4.6 (Red Hat) mod_auth_kerb/5.4 mod_nss/2.4.4 NSS/3.15 mod_wsgi/3.4 Python/2.7.5'}
2013-09-07T02:29:06Z DEBUG request body '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /ca/admin/ca/getStatus was not found on this server.</p>\n</body></html>\n'
2013-09-07T02:29:06Z DEBUG The CA status is: check interrupted
2013-09-07T02:29:06Z DEBUG Waiting for CA to start...
2013-09-07T02:29:07Z DEBUG   File "/usr/lib/python2.7/site-packages/ipaserver/install/installutils.py", line 622, in run_script
    return_value = main_function()

  File "/usr/sbin/ipa-ca-install", line 186, in main
    config, dogtag_master_ds_port, postinstall=True)

  File "/usr/lib/python2.7/site-packages/ipaserver/install/cainstance.py", line 1680, in install_replica_ca
    subject_base=config.subject_base)

  File "/usr/lib/python2.7/site-packages/ipaserver/install/cainstance.py", line 478, in configure_instance
    self.start_creation(runtime=210)

  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 364, in start_creation
    method()

  File "/usr/lib/python2.7/site-packages/ipaserver/install/cainstance.py", line 484, in __start
    self.start()

  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 268, in start
    self.service.start(instance_name, capture_output=capture_output, wait=wait)

  File "/usr/lib/python2.7/site-packages/ipapython/platform/fedora16/service.py", line 170, in start
    self.__wait_until_running()

  File "/usr/lib/python2.7/site-packages/ipapython/platform/fedora16/service.py", line 164, in __wait_until_running
    raise RuntimeError('CA did not start in %ss' % timeout)

2013-09-07T02:29:07Z DEBUG The ipa-ca-install command failed, exception: RuntimeError: CA did not start in 120s
Comment 5 Martin Kosek 2013-09-09 07:39:47 EDT
What looks suspicious to me is the discrepancy between checked ports & the port where the real check is happening:

2013-09-07T02:27:05Z DEBUG args=/bin/systemctl is-active pki-tomcatd.target
2013-09-07T02:27:05Z DEBUG Process finished, return code=0
2013-09-07T02:27:05Z DEBUG stdout=active

2013-09-07T02:27:05Z DEBUG stderr=
2013-09-07T02:27:05Z DEBUG wait_for_open_ports: localhost [8080, 8443] timeout 120
2013-09-07T02:27:07Z DEBUG Waiting until the CA is running
2013-09-07T02:27:07Z DEBUG request 'https://qe-blade-05.testrelm.com:443/ca/admin/ca/getStatus'
2013-09-07T02:27:07Z DEBUG request body ''
2013-09-07T02:27:07Z DEBUG request status 404
2013-09-07T02:27:07Z DEBUG request reason_phrase u'Not Found'
2013-09-07T02:27:07Z DEBUG request headers {'date': 'Sat, 07 Sep 2013 02:27:07 GMT', 'content-length': '219', 'content-type': 'text/html; charset=iso-8859-1', 'server': 'Apache/2.4.6 (Red Hat) mod_auth_kerb/5.4 mod_nss/2.4.4 NSS/3.15 mod_wsgi/3.4 Python/2.7.5'}
2013-09-07T02:27:07Z DEBUG request body '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /ca/admin/ca/getStatus was not found on this server.</p>\n</body></html>\n'
2013-09-07T02:27:07Z DEBUG The CA status is: check interrupted

Petr, is this by design?



2013-09-09T11:20:17Z DEBUG stdout=Loading deployment configuration from /tmp/tmpDxJPWq.
Installing CA into /var/lib/pki/pki-tomcat.
Storing deployment configuration into /etc/sysconfig/pki/tomcat/pki-tomcat/ca/deployment.cfg.

    ==========================================================================
                                INSTALLATION SUMMARY
    ==========================================================================

      Administrator's username:             admin
      Administrator's PKCS #12 file:
            /root/.dogtag/pki-tomcat/ca_admin_cert.p12

      Administrator's certificate nickname: ipa-ca-agent
      Administrator's certificate database:
            /tmp/tmp-KNcbaH

      To check the status of the subsystem: 
            systemctl status pki-tomcatd\@pki-tomcat.service
      To restart the subsystem: 
            systemctl restart pki-tomcatd\@pki-tomcat.service
      The URL for the subsystem is: 
            https://vm-052.idm.lab.bos.redhat.com:8443/ca

    ==========================================================================


2013-09-09T11:20:17Z DEBUG stderr=pkispawn    : WARNING  ....... unable to validate security domain     user/password through REST interface. Interface not available

2013-09-09T11:20:17Z DEBUG completed creating ca instance
2013-09-09T11:20:17Z DEBUG   duration: 79 seconds
2013-09-09T11:20:17Z DEBUG   [3/18]: stopping certificate server instance to update CS.cfg
2013-09-09T11:20:17Z DEBUG Starting external process
2013-09-09T11:20:17Z DEBUG args=/bin/systemctl stop pki-tomcatd.target
2013-09-09T11:20:19Z DEBUG Process finished, return code=0
2013-09-09T11:20:19Z DEBUG stdout=
2013-09-09T11:20:19Z DEBUG stderr=
2013-09-09T11:20:19Z DEBUG   duration: 1 seconds
2013-09-09T11:20:19Z DEBUG   [4/18]: disabling nonces
2013-09-09T11:20:19Z DEBUG   duration: 0 seconds
2013-09-09T11:20:19Z DEBUG   [5/18]: set up CRL publishing
2013-09-09T11:20:19Z DEBUG Starting external process
2013-09-09T11:20:19Z DEBUG args=/usr/sbin/selinuxenabled
2013-09-09T11:20:19Z DEBUG Process finished, return code=0
2013-09-09T11:20:19Z DEBUG stdout=
2013-09-09T11:20:19Z DEBUG stderr=
2013-09-09T11:20:19Z DEBUG Starting external process
2013-09-09T11:20:19Z DEBUG args=/usr/sbin/restorecon /var/lib/ipa/pki-ca/publish
2013-09-09T11:20:19Z DEBUG Process finished, return code=0
2013-09-09T11:20:19Z DEBUG stdout=
2013-09-09T11:20:19Z DEBUG stderr=
2013-09-09T11:20:19Z DEBUG   duration: 0 seconds
2013-09-09T11:20:19Z DEBUG   [6/18]: starting certificate server instance
2013-09-09T11:20:19Z DEBUG Starting external process
2013-09-09T11:20:19Z DEBUG args=/bin/systemctl start pki-tomcatd.target
2013-09-09T11:20:19Z DEBUG Process finished, return code=0
2013-09-09T11:20:19Z DEBUG stdout=
2013-09-09T11:20:19Z DEBUG stderr=
2013-09-09T11:20:19Z DEBUG Starting external process
2013-09-09T11:20:19Z DEBUG args=/bin/systemctl is-active pki-tomcatd.target
2013-09-09T11:20:19Z DEBUG Process finished, return code=0
2013-09-09T11:20:19Z DEBUG stdout=active

2013-09-09T11:20:19Z DEBUG stderr=
2013-09-09T11:20:19Z DEBUG wait_for_open_ports: localhost [8080, 8443] timeout 120
2013-09-09T11:22:19Z DEBUG   File "/usr/lib/python2.7/site-packages/ipaserver/install/installutils.py", line 622, in run_script
    return_value = main_function()

  File "/usr/sbin/ipa-ca-install", line 186, in main
    config, dogtag_master_ds_port, postinstall=True)



Ade, I just tried installing a 7.0 replica for RHEL-6.5 master as above and it seems to me that CA is not running:

# ps ax | grep tomcat
16883 pts/0    S+     0:00 grep --color=auto tomcat

Even manual startup seems non-functional:


# /bin/systemctl start pki-tomcatd.target
# /bin/systemctl is-active pki-tomcatd.target
active
# /bin/systemctl status pki-tomcatd.target
pki-tomcatd.target - PKI Tomcat Server
   Loaded: loaded (/usr/lib/systemd/system/pki-tomcatd.target; disabled)
   Active: active since Mon 2013-09-09 07:20:19 EDT; 16min ago

Sep 09 07:20:19 vm-052.idm.lab.bos.redhat.com systemd[1]: Starting PKI Tomcat Server.
Sep 09 07:20:19 vm-052.idm.lab.bos.redhat.com systemd[1]: Reached target PKI Tomcat Server.
Sep 09 07:35:29 vm-052.idm.lab.bos.redhat.com systemd[1]: Reached target PKI Tomcat Server.
Sep 09 07:36:42 vm-052.idm.lab.bos.redhat.com systemd[1]: Reached target PKI Tomcat Server.

# ps ax | grep tomcat
16883 pts/0    S+     0:00 grep --color=auto tomcat

Also this line looked suspicious:
2013-09-09T11:20:17Z DEBUG stderr=pkispawn    : WARNING  ....... unable to validate security domain     user/password through REST interface. Interface not available
Can it be related?

Can you please advise what could be the reason of these replica install failures?
Comment 6 Petr Viktorin 2013-09-09 07:53:01 EDT
Martin, those are two different checks; ports [8080, 8443] are the internal Dogtag ports, on port 443 (HTTPS) we then ask for status via our Apache proxy.

The bug happens because the pki service is not properly set up to always start when the target is started. (Usually this manifests itself when the service fails, which happens often because it can return a non-zero exit code that systemd is not set up to recognize.)

This is already fixed in Dogtag upstream as https://fedorahosted.org/pki/ticket/716
Comment 7 Martin Kosek 2013-09-09 07:59:11 EDT
Thanks for info. I suspected that this is a systemd issue.

Moving to pki-ca so that they can close the bug when the upstream fix gets to 7.0. Ade, I would ask for this to be done rather sooner than later so that testing is not blocked.
Comment 8 Nathan Kinder 2013-09-09 11:06:26 EDT
Upstream ticket:
https://fedorahosted.org/pki/ticket/740
Comment 9 Michael Gregg 2013-09-13 20:23:52 EDT
Still failing despite new builds:
ipa-server-3.3.1-3.el7.x86_64
pki-ca-10.0.5-1.el7.noarch

[root@hp-sl230sgen8-01 ~]# ipa-ca-install /tmp/replica-info-hp-sl230sgen8-01.testrelm.com.gpg 
Directory Manager (existing master) password: 

Run connection check to master
Check connection from replica to remote master 'ipaqa64vmg.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos Kpasswd: TCP (464): OK
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK

The following list of ports use UDP protocol and would need to be
checked manually:
   Kerberos KDC: UDP (88): SKIPPED
   Kerberos Kpasswd: UDP (464): SKIPPED

Connection from replica to master is OK.
Start listening on required ports for remote master check
Get credentials to log in to remote master
admin@TESTRELM.COM password: 

Check SSH connection to remote master
Execute check on remote master
Check connection from master to remote replica 'hp-sl230sgen8-01.testrelm.com':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos KDC: UDP (88): WARNING
   Kerberos Kpasswd: TCP (464): OK
   Kerberos Kpasswd: UDP (464): WARNING
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK
The following UDP ports could not be verified as open: 88, 464
This can happen if they are already bound to an application
and ipa-replica-conncheck cannot attach own UDP responder.

Connection from master to replica is OK.

Connection check OK
Configuring certificate server (pki-tomcatd): Estimated time 3 minutes 30 seconds
  [1/18]: creating certificate server user
  [2/18]: configuring certificate server instance
  [3/18]: stopping certificate server instance to update CS.cfg
  [4/18]: disabling nonces
  [5/18]: set up CRL publishing
  [6/18]: starting certificate server instance

Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

CA did not start in 120s
Comment 10 Ade Lee 2013-10-03 10:55:58 EDT
mkosek, 

Your original suspicion was well founded.  I have reproduced the problem and found the following:

1. dogtag does in fact come up.  You can see this by doing ps -ef |grep java
2. You can also try to get to the getStatus servlets as follows:
  wget https://localhost:8443/ca/admin/ca/getStatus --no-check-certificate
  or use a browser.
3. Trying to get to the servlet through port 443, though, gives a 404.

The reason for this is that the file ipa-pki-proxy.conf needs to be copied to /etc/httpd/conf.d  first and httpd restarted.  And there may be other httpd config to be performed -- certs to be imported?

The point though is that you are trying to confirm dogtag status through the apache port when apache has not yet been configured to pass requests to dogtag.

Switching bug component back to IPA.
Comment 12 Martin Kosek 2013-10-07 04:51:32 EDT
Upstream ticket:
https://fedorahosted.org/freeipa/ticket/3964
Comment 15 Michael Gregg 2013-11-19 14:58:53 EST
Verified against ipa-server-3.3.3-4.el7.x86_64

On master:
Setup complete

Next steps:
	1. You must make sure these network ports are open:
		TCP Ports:
		  * 80, 443: HTTP/HTTPS
		  * 389, 636: LDAP/LDAPS
		  * 88, 464: kerberos
		  * 53: bind
		UDP Ports:
		  * 88, 464: kerberos
		  * 53: bind
		  * 123: ntp

	2. You can now obtain a kerberos ticket using the command: 'kinit admin'
	   This ticket will allow you to use the IPA tools (e.g., ipa user-add)
	   and the web user interface.

Be sure to back up the CA certificate stored in /root/cacert.p12
This file is required to create replicas. The password for this
file is the Directory Manager password
[root@qe-blade-01 ~]# ipa-replica-prepare --ip-address=10.16.4.19 mgmt9.testrelm.com
Directory Manager (existing master) password: 


On slave: 
[root@mgmt9 rhqa_ipa]# ipa-ca-install -p Secret123 -w Secret123 --skip-conncheck --unattended /tmp/replica-info-mgmt9.testrelm.com.gpg 
Configuring certificate server (pki-tomcatd): Estimated time 3 minutes 30 seconds
  [1/18]: creating certificate server user
  [2/18]: configuring certificate server instance
  [3/18]: stopping certificate server instance to update CS.cfg
  [4/18]: disabling nonces
  [5/18]: set up CRL publishing
  [6/18]: starting certificate server instance
  [7/18]: importing CA chain to RA certificate database
  [8/18]: fixing RA database permissions
  [9/18]: setting up signing cert profile
  [10/18]: set certificate subject base
  [11/18]: enabling Subject Key Identifier
  [12/18]: enabling CRL and OCSP extensions for certificates
  [13/18]: setting audit signing renewal to 2 years
  [14/18]: configuring certificate server to start on boot
  [15/18]: configure certmonger for renewals
  [16/18]: configure clone certificate renewals
  [17/18]: configure Server-Cert certificate renewal
  [18/18]: Configure HTTP to proxy connections
Done configuring certificate server (pki-tomcatd).
Restarting the directory and certificate servers
[root@mgmt9 rhqa_ipa]#
Comment 16 Ludek Smid 2014-06-13 06:15:21 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

Note You need to log in before you can comment on or make changes to this bug.