Bug 1415852 - Upgrading to latest IPA and pki(dogtag) breaks IPA system
Summary: Upgrading to latest IPA and pki(dogtag) breaks IPA system
Keywords:
Status: CLOSED DUPLICATE of bug 1224623
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pki-core
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: RHCS Maintainers
QA Contact: Asha Akkiangady
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-01-23 23:50 UTC by bugsredhat
Modified: 2020-10-04 21:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-21 14:54:32 UTC
Target Upstream Version:


Attachments
pki-tomcat-catalina-logfile (32.18 KB, text/plain)
2017-01-24 00:00 UTC, bugsredhat


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | dogtagpki pki issues 2744 | 0 | None | closed | Upgrading to latest IPA and pki(dogtag) breaks IPA system | 2021-02-09 15:42:21 UTC

Description bugsredhat 2017-01-23 23:50:47 UTC
Description of problem:

After upgrading to the latest ipa-server and pki-core/dogtag packages, the IPA server no longer starts; startup hangs at pki-tomcatd.

The system was working before, and I install updates as soon as they are available. The problem has existed since the latest update on 01/19/2017.

After some time it times out:
[root@ipaserver ~]# ipactl start
Existing service file detected!
Assuming stale, cleaning and proceeding
Starting Directory Service
Starting krb5kdc Service
Starting kadmin Service
Starting named Service
Starting ipa_memcached Service
Starting httpd Service
Starting ipa-custodia Service
Starting ntpd Service
Starting pki-tomcatd Service
Failed to start pki-tomcatd Service
Shutting down
Hint: You can use --ignore-service-failure option for forced start in case that a non-critical service failed
Aborting ipactl


The Log /var/log/pki/pki-tomcat/ca/selftest.log:

0.localhost-startStop-1 - [19/Jan/2017:10:17:21 MEZ] [20] [1] CAPresence:  CA is present
0.localhost-startStop-1 - [19/Jan/2017:10:17:21 MEZ] [20] [1] SystemCertsVerification: system certs verification failure: Certificate auditSigningCert cert-pki-ca is invalid: Invalid certificate: (-8101) Certificate type not approved for application.
0.localhost-startStop-1 - [19/Jan/2017:10:17:21 MEZ] [20] [1] SelfTestSubsystem: The CRITICAL self test plugin called selftests.container.instance.SystemCertsVerification running at startup FAILED!


This is the ONLY entry written to selftest.log directly after the upgrade of the server. There are no further entries in selftest.log (?!). It seems that startup doesn't reach this point anymore.
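To pull only the failing entries out of selftest.log, something like the following sketch may help. The function name selftest_failures is hypothetical, and the patterns simply match the wording of the two failure lines quoted above; other Dogtag versions may phrase them differently.

```shell
#!/bin/sh
# Hypothetical helper: read a Dogtag selftest.log on stdin and print only
# the lines that report a failure. The patterns are taken from the log
# excerpt in this report and are an assumption for other versions.
selftest_failures() {
    grep -E 'FAILED!|verification failure'
}

# Usage (log path as given in the report):
#   selftest_failures < /var/log/pki/pki-tomcat/ca/selftest.log
```

On the excerpt above this would print the SystemCertsVerification and SelfTestSubsystem lines and skip the CAPresence line.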


Here's the output for the certificates:
[root@ipaserver pki-tomcat]# certutil -L -d /var/lib/pki/pki-tomcat/ca/alias/
Certificate Nickname                                         Trust Attributes
                                                            SSL,S/MIME,JAR/XPI
subsystemCert cert-pki-ca                                    u,u,u
auditSigningCert cert-pki-ca                                 u,u,u
caSigningCert cert-pki-ca                                    CTu,cu,u
Server-Cert cert-pki-ca                                      u,u,u
ocspSigningCert cert-pki-ca                                  u,u,u

If I try to change the settings for auditSigningCert with:
certutil -M -d /var/lib/pki/pki-tomcat/ca/alias/ -n "auditSigningCert cert-pki-ca" -t "u,u,Pu"

The new permissions are set but "ipactl start" breaks with the same error.
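A quick way to confirm which trust attributes are actually stored for a nickname is to parse the "certutil -L" listing shown above. This is only a sketch: the helper names trust_flags and has_object_signing_P are hypothetical, and it assumes the listing format shown in this report (nickname, whitespace, then the three comma-separated trust fields in the order SSL, S/MIME, JAR/XPI).

```shell
#!/bin/sh
# Hypothetical helper: print the trust attributes of one nickname, given the
# output of `certutil -L -d DIR` on stdin. Assumes the trust attributes are
# the last whitespace-separated column, as in the listing above.
trust_flags() {
    awk -v nick="$1" '
        {
            flags = $NF                             # last column, e.g. "u,u,Pu"
            name = $0
            sub(/[ \t]+[^ \t]+[ \t]*$/, "", name)   # strip the flags column
            if (name == nick) print flags
        }'
}

# Hypothetical helper: succeed if the third field (JAR/XPI, i.e. object
# signing) of a trust string such as "u,u,Pu" contains the P flag.
has_object_signing_P() {
    case "$(printf '%s' "$1" | cut -d, -f3)" in
        *P*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage:
#   flags=$(certutil -L -d /var/lib/pki/pki-tomcat/ca/alias/ \
#           | trust_flags "auditSigningCert cert-pki-ca")
#   has_object_signing_P "$flags" || echo "P flag missing: $flags"
```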


Version-Release number of selected component (if applicable):
[root@ipaserver ~]# rpm -qa|grep ipa-
ipa-client-4.4.0-14.el7.centos.4.x86_64
ipa-server-4.4.0-14.el7.centos.4.x86_64
ipa-server-common-4.4.0-14.el7.centos.4.noarch
ipa-server-dns-4.4.0-14.el7.centos.4.noarch
sssd-ipa-1.14.0-43.el7_3.11.x86_64
ipa-client-common-4.4.0-14.el7.centos.4.noarch
ipa-admintools-4.4.0-14.el7.centos.4.noarch
ipa-python-compat-4.4.0-14.el7.centos.4.noarch
ipa-common-4.4.0-14.el7.centos.4.noarch


[root@ipaserver ~]# rpm -qa|grep pki
pki-ca-10.3.3-16.el7_3.noarch
pki-base-java-10.3.3-16.el7_3.noarch
pki-server-10.3.3-16.el7_3.noarch
pki-base-10.3.3-16.el7_3.noarch
pki-kra-10.3.3-16.el7_3.noarch
pki-tools-10.3.3-16.el7_3.x86_64
pki-symkey-10.3.3-16.el7_3.x86_64


How reproducible: 
I'm not sure. Maybe upgrading to the latest ipaserver or pki/dogtag triggers it, or maybe there was a broken ipa-cacert-manage renew?



Additional info:
Maybe this bug is also related to https://bugzilla.redhat.com/show_bug.cgi?id=1390319

Comment 1 bugsredhat 2017-01-23 23:52:10 UTC
The debug output of
"ipactl start -d":

ipa: DEBUG: stderr=
Starting pki-tomcatd Service
ipa: DEBUG: Starting external process
ipa: DEBUG: args=/bin/systemctl start pki-tomcatd.target
ipa: DEBUG: Process finished, return code=0
ipa: DEBUG: request POST http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus
ipa: DEBUG: request body ''
ipa: DEBUG: Failed to check CA status: cannot connect to 'http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus': [Errno 111] Connection refused
ipa: DEBUG: Waiting until the CA is running
ipa: DEBUG: request POST http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus
ipa: DEBUG: request body ''
ipa: DEBUG: The CA status is: check interrupted due to error: cannot connect to 'http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus': [Errno 111] Connection refused
ipa: DEBUG: Waiting for CA to start...
ipa: DEBUG: request POST http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus
ipa: DEBUG: request body ''
ipa: DEBUG: The CA status is: check interrupted due to error: cannot connect to 'http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus': [Errno 111] Connection refused
ipa: DEBUG: Waiting for CA to start...
ipa: DEBUG: request POST http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus
ipa: DEBUG: request body ''
ipa: DEBUG: The CA status is: check interrupted due to error: cannot connect to 'http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus': [Errno 111] Connection refused
ipa: DEBUG: Waiting for CA to start...
ipa: DEBUG: request POST http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus
ipa: DEBUG: request body ''
ipa: DEBUG: response status 404
ipa: DEBUG: response headers {'date': 'Thu, 19 Jan 2017 11:21:11 GMT', 'content-length': '993', 'content-type': 'text/html;charset=utf-8', 'content-language': 'en', 'server': 'Apache-Coyote/1.1'}
ipa: DEBUG: response body '<html><head><title>Apache Tomcat/7.0.69 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 404 - /ca/admin/ca/getStatus</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>/ca/admin/ca/getStatus</u></p><p><b>description</b> <u>The requested resource is not available.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.69</h3></body></html>'
ipa: DEBUG: The CA status is: check interrupted due to error: Retrieving CA status failed with status 404
ipa: DEBUG: Waiting for CA to start...
ipa: DEBUG: request POST http://ipaserver.ind.rwth-aachen.de:8080/ca/admin/ca/getStatus
ipa: DEBUG: request body ''
ipa: DEBUG: response status 404
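The trace shows two different failure modes: first "Connection refused", later HTTP 404 from Tomcat. The sketch below maps a probe result to a likely state. The function name ca_probe_state is hypothetical, and reading 404 as "CA web application not deployed" is an interpretation based on Tomcat's "The requested resource is not available" message, not something the log states outright.

```shell
#!/bin/sh
# Hypothetical helper: map the HTTP status of a getStatus probe to a likely
# state. The mapping is an assumption, not an official Dogtag diagnosis.
ca_probe_state() {
    case "$1" in
        000) echo "tomcat-not-listening" ;;    # curl prints 000 when it got no HTTP response
        404) echo "ca-webapp-not-deployed" ;;  # Tomcat answers, but /ca is not served
        200) echo "ca-responding" ;;
        *)   echo "unknown" ;;
    esac
}

# Usage (replace HOST with the server name; URL path as in the trace above):
#   code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#          http://HOST:8080/ca/admin/ca/getStatus)
#   ca_probe_state "$code"
```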

Comment 2 bugsredhat 2017-01-24 00:00:48 UTC
Created attachment 1243781
pki-tomcat-catalina-logfile

pki-tomcat log /var/log/pki/pki-tomcat/catalina

Comment 3 bugsredhat 2017-01-24 00:09:23 UTC
Also very strange is that some certificates seem to be renewed during the server upgrade but not all certificates.

These were renewed:
auditSigningCert cert-pki-ca
Server-Cert cert-pki-ca
ocspSigningCert cert-pki-ca


But NOT:
caSigningCert cert-pki-ca 
subsystemCert cert-pki-ca

!!AND!! only the certificates on the broken master server were renewed, not the ones on the two working replica servers.


Here is the output for auditSigningCert, for example:
[root@ipaserver pki-tomcat]# certutil -L -d /var/lib/pki/pki-tomcat/ca/alias -n "auditSigningCert cert-pki-ca"
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 56 (0x38)
        Signature Algorithm: PKCS #1 SHA-256 With RSA Encryption
        Issuer: "CN=Certificate Authority,O=IND.RWTH-AACHEN.DE"
        Validity:
            Not Before: Thu Jan 19 09:16:44 2017
            Not After : Wed Jan 09 09:16:44 2019
        Subject: "CN=CA Audit,O=IND.RWTH-AACHEN.DE"

Comment 5 Petr Vobornik 2017-01-31 13:47:40 UTC
Are all certs valid?

If you renewed/added/changed the CA cert in the past, have you run ipa-certupdate on all other systems to distribute the new cert(s)?

Comment 6 Jan Cholasta 2017-02-07 09:30:10 UTC
Petr, ipa-certupdate is needed only after the CA certificate ("caSigningCert cert-pki-ca") was renewed, which I think is not the case here.

bugsredhat@iks.rwth-aachen.de, could you please attach /var/log/pki/pki-tomcat/ca/debug?

Comment 7 Petr Vobornik 2017-02-10 17:11:07 UTC
Per triage, this looks like a duplicate of bug 1417766.

I.e. the issue might be that Dogtag does not respect the original subject name encoding when renewing a certificate.

Comment 8 Petr Vobornik 2017-02-17 16:55:55 UTC
No answer for a week. I'm assuming that the theory in comment 7 is correct.

*** This bug has been marked as a duplicate of bug 1417766 ***

Comment 9 bugsredhat 2017-03-05 19:38:44 UTC
Sorry for the late reply. I solved the problem myself weeks ago and was then on vacation.

Here is my solution:
1.) During the update of the packages and the renewal of the certificate, something went wrong with the trust flags of auditSigningCert. I had to add the missing flag "P" like this:
certutil -M -d /var/lib/pki/pki-tomcat/ca/alias -n "auditSigningCert cert-pki-ca" -t "u,u,Pu"

2.) Another problem during the upgrade, maybe caused by (1.), was that the file "/etc/pki/pki-tomcat/Catalina/localhost/ca.xml" was deleted!
Maybe it was open during the upgrade, the upgrade broke off, and the file was never rewritten?!

Solution for this: copy ca.xml from a replica or from the backup made during an earlier upgrade.
I.e.:
cp /var/log/pki/server/upgrade/10.2.2/1/oldfiles/var/lib/pki/pki-tomcat/conf/Catalina/localhost/ca.xml /etc/pki/pki-tomcat/Catalina/localhost/ca.xml

After this all was working again. I only had to enforce a resync of the replicas.

I think it would be good to add a check that the file "ca.xml" exists to the post-upgrade routine.
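The kind of check meant here could be as small as the following sketch. The function name check_ca_descriptor is hypothetical; the default path is the one named in point 2 above. This is not code from the PKI upgrader.

```shell
#!/bin/sh
# Hypothetical post-upgrade check: report whether the CA deployment
# descriptor exists. Returns non-zero when the file is missing.
check_ca_descriptor() {
    descriptor="${1:-/etc/pki/pki-tomcat/Catalina/localhost/ca.xml}"
    if [ -f "$descriptor" ]; then
        echo "present: $descriptor"
    else
        echo "missing: $descriptor"
        return 1
    fi
}

# Usage:
#   check_ca_descriptor || echo "restore ca.xml before starting pki-tomcatd"
```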

Comment 10 Petr Vobornik 2017-03-07 09:37:21 UTC
Thanks for the update.

Moving to pki-core to check if PKI upgrader can do some additional checks based on comment 9. 

And reopening so that it is not overlooked.

Comment 11 Matthew Harmsen 2017-03-08 16:57:21 UTC
(In reply to Petr Vobornik from comment #10)
> Thanks for the update.
> 
> Moving to pki-core to check if PKI upgrader can do some additional checks
> based on comment 9. 
> 
> And reopening so that it is not overlooked.

As there is a workaround to this issue, and the focus of RHEL 7.4 has to be Common Criteria, I added the team to the email cc list, and am moving this bug to 7.5.

If the bug appears more often, we can always move it back to consideration for RHEL 7.4 in a late snapshot and/or batch update.

Comment 12 Matthew Harmsen 2017-03-30 20:58:31 UTC
Upstream ticket:
https://pagure.io/dogtagpki/issue/2624

Comment 15 Ade Lee 2017-08-31 18:07:07 UTC
Responding to the comments in comment 9:

1.  When the CA starts up, there is a self test that checks the permissions/expiration of the system certs.  It does in fact check the P permission for the audit signing cert.

2.  When that test fails, failure messages are logged and the CA does not start up.

3.  Furthermore, the CA is disabled.  This means that the CA application is undeployed - which means that the deployment descriptor (ca.xml) is deleted.  This file is automatically recreated when the CA subsystem is re-enabled.

There is a CLI to re-enable the CA subsystem.
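The comment does not name the CLI. The pki-server tool has a subsystem-enable command that is probably what is meant, so treat the following as a sketch under that assumption. The wrapper name reenable_ca and the DRY_RUN variable are hypothetical; by default the function only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical wrapper around `pki-server subsystem-enable`. Assumes the
# default IPA instance name "pki-tomcat" and the subsystem ID "ca".
# With DRY_RUN=1 (the default) the command is printed, not executed.
reenable_ca() {
    instance="${1:-pki-tomcat}"
    cmd="pki-server subsystem-enable -i $instance ca"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Usage:
#   reenable_ca              # print the command only
#   DRY_RUN=0 reenable_ca    # run it (as root on the affected server)
```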

Comment 16 Ade Lee 2017-08-31 18:16:12 UTC
Trying to figure out what to take from this.  All in all, we need a better health check tool - which is something that was proposed in https://bugzilla.redhat.com/show_bug.cgi?id=1303254.

As I understand it, at least some of this check has been implemented on the IPA side (including the permissions on the system certs).

Also, with recent 10.4 code, when you restart the CA, the subsystem is automatically redeployed and the ca.xml is recreated.  (https://pagure.io/dogtagpki/issue/2699)

Propose to close this bug.

Comment 17 Petr Vobornik 2017-09-20 08:42:42 UTC
I think this bug can be closed. 

The ideal solution would be for Dogtag to verify the input without performing an actual installation, so that IPA can run this validation in the validation step of the IPA installer and not in the CA installation step, where part of the server is already configured. There is a Dogtag RFE to implement this; I believe it is called --dry-run.

Until Dogtag has the 'dry run' option, such checks need to be done on the IPA side, e.g. as suggested in bug 1489962.

Comment 18 Matthew Harmsen 2017-09-21 14:54:32 UTC
(In reply to Petr Vobornik from comment #17)
> I think this bug can be closed. 
> 
> The ideal solution would be for Dogtag to verify the input without
> performing an actual installation, so that IPA can run this validation in
> the validation step of the IPA installer and not in the CA installation
> step, where part of the server is already configured. There is a Dogtag RFE
> to implement this; I believe it is called --dry-run.
> 
> Until Dogtag has the 'dry run' option, such checks need to be done on the
> IPA side, e.g. as suggested in bug 1489962.

CLOSING as DUPLICATE of "Bug 1224623 - Validate input in pkispawn, add dry run option"

*** This bug has been marked as a duplicate of bug 1224623 ***

