Bug 1486225

Summary: ipa-replica-install --setup-kra broken on DL0
Product: Red Hat Enterprise Linux 7 Reporter: Petr Vobornik <pvoborni>
Component: pki-core Assignee: Fraser Tweedale <ftweedal>
Status: CLOSED ERRATA QA Contact: Asha Akkiangady <aakkiang>
Severity: urgent Docs Contact: Marc Muehlfeld <mmuehlfe>
Priority: urgent    
Version: 7.4 CC: ksiddiqu, mharmsen, mhonek, ndehadra, nkinder, nsoman, pbokoc, pkis, pvoborni, rcritten, sumenon, tscherf
Target Milestone: rc Keywords: Regression, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Certificate System no longer fails to import PKCS #12 files
An earlier change to PKCS #12 password encoding in the Network Security Services (NSS) caused Certificate System to fail to import PKCS #12 files. As a consequence, the Certificate Authority (CA) clone installation could not be completed. With this update, PKI will retry a failed PKCS #12 decryption with a different password encoding, which allows it to import PKCS #12 files produced by both old and new versions of NSS, and CA clone installation succeeds.
Story Points: ---
Clone Of:
: 1492560 (view as bug list) Environment:
Last Closed: 2018-04-10 17:01:25 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1492560    

Description Petr Vobornik 2017-08-29 09:12:54 UTC
Cloned from upstream: https://pagure.io/freeipa/issue/7087

Thanks to a Dogtag update to its export of the admin certificate to a PKCS12 file (`/root/cacert.p12` in our case), `ipa-replica-install --setup-kra` is broken on domain level 0.

More information about the Dogtag change is to be found at https://bugzilla.redhat.com/show_bug.cgi?id=1426754

Comment 2 Petr Vobornik 2017-08-29 09:13:06 UTC
Upstream ticket:
https://pagure.io/freeipa/issue/7087

Comment 6 Nathan Kinder 2017-09-13 20:28:17 UTC
A bug has been filed against NSS to allow it to read PKCS#12 files from older NSS versions:

  https://bugzilla.redhat.com/show_bug.cgi?id=1491444

In addition to fixing NSS, we should fix Dogtag (pki-core) to generate PKCS#12 files in the correct format for newer NSS versions (and for openssl).  Moving this bug to the pki-core component for this part of the problem.

Comment 8 Fraser Tweedale 2017-09-15 11:09:43 UTC
Upstream pki ticket: https://pagure.io/dogtagpki/issue/2809
Gerrit review: https://review.gerrithub.io/#/c/378753/1

Comment 9 Matthew Harmsen 2017-09-15 17:37:45 UTC
edewata reviewed and pushed ftweedal's changes:

Author: Fraser Tweedale <ftweedal>
Date:   Thu Sep 14 12:22:47 2017 +1000

    Make PKCS #12 files compatible with OpenSSL, NSS >= 3.31
    
    For compatibility with OpenSSL and NSS >= 3.31, the passphrase must
    not be BMPString-encoded when non-PKCS #12 PBE schemes such as
    PBES2 are used.
    
    Fixes: https://pagure.io/dogtagpki/issue/2809
    
    Change-Id: Ic78ad337ac0b9b2f5d2e75581cc0ee55e6d82782

NOTE: edewata believes that we still need to update the pki-core.spec
      file to require the appropriate version of NSS (e. g. - >= 3.31 on
      Fedora, and hopefully nss >= 3.28-? on RHEL once the proper changes
      have been cherry-picked to RHEL)

Comment 10 Fraser Tweedale 2017-09-18 01:07:14 UTC
We don't need to update the spec file in relation to this change.  This change
makes Dogtag compatible with older and newer NSS.

But I think we do need to update spec file in relation to a different
JSS change, once the relevant builds/releases of JSS are available.
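
For reference, the two passphrase encodings at issue, and the retry behaviour described in the Doc Text, can be sketched in Python as follows. This is an illustration only, not the Dogtag or JSS code: the `decrypt` callable and all function names are hypothetical, and a real implementation would signal a failed decryption in its own way rather than with `ValueError`.

```python
from typing import Callable, Sequence


def bmp_string_password(password: str) -> bytes:
    """PKCS #12 style: UTF-16BE code units followed by a two-byte NUL terminator."""
    return password.encode("utf-16-be") + b"\x00\x00"


def utf8_password(password: str) -> bytes:
    """Encoding expected with PBES2 by OpenSSL and NSS >= 3.31: plain UTF-8 bytes."""
    return password.encode("utf-8")


def decrypt_with_retry(
    decrypt: Callable[[bytes], object],
    password: str,
    encoders: Sequence[Callable[[str], bytes]] = (utf8_password, bmp_string_password),
) -> object:
    """Try each password encoding in turn until the (hypothetical) decrypt call succeeds."""
    last_error = None
    for encode in encoders:
        try:
            return decrypt(encode(password))
        except ValueError as exc:  # assumption: decrypt raises ValueError on a wrong key
            last_error = exc
    raise ValueError("decryption failed with every password encoding") from last_error
```

Trying the newer encoding first and falling back to the BMPString form is one possible ordering; the order is an assumption of this sketch.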

Comment 13 Matthew Harmsen 2017-10-26 22:47:06 UTC
Ade Lee 2017-10-26 13:26:50 EDT

Commits (in order):

386357c347f8433e14ccd8637576f4c4a4e42492
bc329a0162ae9af382c81e75742b282ea8c5df0d
9eb354883c9d965bb271223bf870839bb756db26
fa2d731b6ce51c5db9fb0b004d586b8f3e1decd3
8c0a7eee3bbfe01b2d965dbe09e95221c5031c8b

Comment 14 Nikhil Dehadrai 2018-01-09 12:06:45 UTC
IPA-server version: ipa-server-4.5.4-7.el7.x86_64

Tested the bug with following observations:
1. Setup IPA master with KRA at domain Level 0.
2. Prepare Replica file for Replica.
3. Setup Replica with kra against ipa-master at domain level0.

Observations:
1. After step1, ipa-master is installed successfully
ipa-server-install --setup-dns --forwarder <forwarder_ip> --domain testrelm.test --realm TESTRELM.TEST --admin-password <admin-password> --ds-password <ds-password> -U --reverse-zone <reverse_zone_details> --allow-zone-overlap --setup-kra --domain-level=0

2. After step3, the replica installation fails:
#  ipa-replica-install -U --setup-dns --forwarder <forwarder_ip> --setup-ca --setup-kra --admin-password <admin-password> --password <password> /var/lib/ipa/replica-info-replica.testrelm.test.gpg

Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes
  [1/28]: configuring certificate server instance
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL Failed to configure CA instance: Command '/usr/sbin/pkispawn -s CA -f /tmp/tmp7HAvcR' returned non-zero exit status 1
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL See the installation logs and the following files/directories for more information:
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL   /var/log/pki/pki-tomcat
  [error] RuntimeError: CA configuration failed.
Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall): ERROR    CA configuration failed.
ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall): ERROR    The ipa-replica-install command failed. See /var/log/ipareplica-install.log for more information

Thus on the basis of above observations, changing status of bug to "ASSIGNED"

Comment 17 Fraser Tweedale 2018-01-09 14:08:24 UTC
Nikhil, what were the exact package versions of ipa and pki-core used for
the test?

Comment 18 Nikhil Dehadrai 2018-01-10 12:31:44 UTC
(In reply to Fraser Tweedale from comment #17)
> Nikhil, what were the exact package versions of ipa and pki-core used for
> the test?

# rpm -q ipa-server pki-core
ipa-server-4.5.4-7.el7.x86_64
package pki-core is not installed

# rpm -qa | grep pki
pki-server-10.5.1-5.el7.noarch
pki-base-java-10.5.1-5.el7.noarch
pki-ca-10.5.1-5.el7.noarch
krb5-pkinit-1.15.1-18.el7.x86_64
pki-base-10.5.1-5.el7.noarch
pki-tools-10.5.1-5.el7.x86_64
pki-kra-10.5.1-5.el7.noarch

Let me know if you need any more information.

Comment 19 Fraser Tweedale 2018-01-10 14:05:57 UTC
Confirmed that the nss version in use (nss-3.34.0-3.el7.x86_64)
contains the password encoding fix.

The fix for https://pagure.io/dogtagpki/issue/2557 is also included
in pki 10.5.1-5.el7.

The debug log shows errors in setting up replication:

[09/Jan/2018:06:47:46][http-bio-8443-exec-3]: setupReplication: consumer initialization failed. -11  - LDAP error: Connect error
[09/Jan/2018:06:47:46][http-bio-8443-exec-3]: setupReplication: java.io.IOException: consumer initialization failed. -11  - LDAP error: Connect error
java.io.IOException: Failed to setup the replication for cloning.
	at com.netscape.cms.servlet.csadmin.ConfigurationUtils.setupReplication(ConfigurationUtils.java:2070)
	at org.dogtagpki.server.rest.SystemConfigService.initializeDatabase(SystemConfigService.java:713)
	at org.dogtagpki.server.ca.rest.CAInstallerService.initializeDatabase(CAInstallerService.java:116)

I haven't encountered this kind of error before.  It is not related to the fixes previously
associated with this ticket and may be a regression elsewhere.

Is there anything of note in the DS error or access log?

Will continue investigations tomorrow.

Comment 20 Nikhil Dehadrai 2018-01-11 07:54:09 UTC
(In reply to Fraser Tweedale from comment #19)
> Confirmed that the nss version in use (nss-3.34.0-3.el7.x86_64)
> contains the password encoding fix.
> 
> The fix for https://pagure.io/dogtagpki/issue/2557 is also included
> in pki 10.5.1-5.el7.
> 
> The debug log shows errors in setting up replication:
> 
> [09/Jan/2018:06:47:46][http-bio-8443-exec-3]: setupReplication: consumer
> initialization failed. -11  - LDAP error: Connect error
> [09/Jan/2018:06:47:46][http-bio-8443-exec-3]: setupReplication:
> java.io.IOException: consumer initialization failed. -11  - LDAP error:
> Connect error
> java.io.IOException: Failed to setup the replication for cloning.
> 	at
> com.netscape.cms.servlet.csadmin.ConfigurationUtils.
> setupReplication(ConfigurationUtils.java:2070)
> 	at
> org.dogtagpki.server.rest.SystemConfigService.
> initializeDatabase(SystemConfigService.java:713)
> 	at
> org.dogtagpki.server.ca.rest.CAInstallerService.
> initializeDatabase(CAInstallerService.java:116)
> 
> I haven't encountered this kind of error before.  It is not related to the
> fixes previously
> associated with this ticket and may be a regression elsewhere.
> 
> Is there anything of note in the DS error or access log?
> 
> Will continue investigations tomorrow.


Hi Fraser,

This is what I could get from DB error logs:

[09/Jan/2018:06:44:09.658701995 -0500] - INFO - Security Initialization - SSL info:     TLS_AES_256_GCM_SHA384: enabled
[09/Jan/2018:06:44:09.698924923 -0500] - INFO - Security Initialization - slapd_ssl_init2 - Configured SSL version range: min: TLS1.0, max: TLS1.2
[09/Jan/2018:06:44:09.708657456 -0500] - INFO - main - 389-Directory/1.3.7.5 B2017.342.2028 starting up
[09/Jan/2018:06:44:17.192386262 -0500] - INFO - ldbm_instance_config_cachememsize_set - force a minimal value 512000
[09/Jan/2018:06:44:17.231422793 -0500] - WARN - default_mr_indexer_create - Plugin [caseIgnoreIA5Match] does not handle caseExactIA5Match
[09/Jan/2018:06:44:17.242546217 -0500] - NOTICE - ldbm_back_start - found 1833416k physical memory
[09/Jan/2018:06:44:17.244622311 -0500] - NOTICE - ldbm_back_start - found 1265760k available
[09/Jan/2018:06:44:17.246620969 -0500] - NOTICE - ldbm_back_start - cache autosizing: db cache: 45835k
[09/Jan/2018:06:44:17.249095195 -0500] - NOTICE - ldbm_back_start - cache autosizing: userRoot entry cache (1 total): 131072k
[09/Jan/2018:06:44:17.253804567 -0500] - NOTICE - ldbm_back_start - cache autosizing: userRoot dn cache (1 total): 65536k
[09/Jan/2018:06:44:17.257579538 -0500] - NOTICE - ldbm_back_start - total cache size: 238874951 B; 
[09/Jan/2018:06:44:17.484824656 -0500] - ERR - attrcrypt_cipher_init - No symmetric key found for cipher AES in backend userRoot, attempting to create one...
[09/Jan/2018:06:44:17.502005411 -0500] - INFO - attrcrypt_cipher_init - Key for cipher AES successfully generated and stored
[09/Jan/2018:06:44:17.504378128 -0500] - ERR - attrcrypt_cipher_init - No symmetric key found for cipher 3DES in backend userRoot, attempting to create one...
[09/Jan/2018:06:44:17.519884184 -0500] - INFO - attrcrypt_cipher_init - Key for cipher 3DES successfully generated and stored
[09/Jan/2018:06:44:17.575003965 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target cn=groups,cn=compat,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.578635413 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target cn=computers,cn=compat,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.580847379 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target cn=ng,cn=compat,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.583613869 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target ou=sudoers,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.587267755 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target cn=users,cn=compat,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.595150396 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target cn=ad,cn=etc,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.620153340 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target cn=casigningcert cert-pki-ca,cn=ca_renewal,cn=ipa,cn=etc,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.622421384 -0500] - ERR - NSACLPlugin - acl_parse - The ACL target cn=casigningcert cert-pki-ca,cn=ca_renewal,cn=ipa,cn=etc,dc=testrelm,dc=test does not exist
[09/Jan/2018:06:44:17.829732545 -0500] - ERR - cos-plugin - cos_dn_defs_cb - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=testrelm,dc=test--no CoS Templates found, which should be added before the CoS Definition.
[09/Jan/2018:06:44:17.908908578 -0500] - ERR - set_krb5_creds - Could not get initial credentials for principal [ldap/ipaqavmb.testrelm.test] in keytab [FILE:/etc/dirsrv/ds.keytab]: -1765328324 (Generic error (see e-text))
[09/Jan/2018:06:44:17.931425822 -0500] - INFO - slapd_daemon - slapd started.  Listening on All Interfaces port 389 for LDAP requests
[09/Jan/2018:06:44:17.933635954 -0500] - INFO - slapd_daemon - Listening on All Interfaces port 636 for LDAPS requests
[09/Jan/2018:06:44:17.935733007 -0500] - INFO - slapd_daemon - Listening on /var/run/slapd-TESTRELM-TEST.socket for LDAPI requests
[09/Jan/2018:06:45:54.435254800 -0500] - INFO - ldbm_instance_config_cachememsize_set - force a minimal value 512000
[09/Jan/2018:06:45:54.519814883 -0500] - ERR - attrcrypt_cipher_init - No symmetric key found for cipher AES in backend ipaca, attempting to create one...
[09/Jan/2018:06:45:54.545681336 -0500] - INFO - attrcrypt_cipher_init - Key for cipher AES successfully generated and stored
[09/Jan/2018:06:45:54.555292711 -0500] - ERR - attrcrypt_cipher_init - No symmetric key found for cipher 3DES in backend ipaca, attempting to create one...
[09/Jan/2018:06:45:54.587747960 -0500] - INFO - attrcrypt_cipher_init - Key for cipher 3DES successfully generated and stored
[09/Jan/2018:06:45:54.628389194 -0500] - ERR - ipa-topology-plugin - ipa_topo_be_state_change - backend ipaca is coming online; checking domain level and init shared topology
[09/Jan/2018:06:45:54.678544386 -0500] - ERR - cos-plugin - cos_dn_defs_cb - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=testrelm,dc=test--no CoS Templates found, which should be added before the CoS Definition.
[09/Jan/2018:06:47:30.774289178 -0500] - ERR - slapi_ldap_bind - Error: could not send startTLS request: error -11 (Connect error)
[09/Jan/2018:06:47:30.778253927 -0500] - ERR - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=cloneAgreement1-ipaqavmb.testrelm.test-pki-tomcat" (auto-hv-01-guest10:389) - Replication bind with SIMPLE auth failed: LDAP error -11 (Connect error) (error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed (self signed certificate in certificate chain))
[09/Jan/2018:06:47:33.804309057 -0500] - ERR - slapi_ldap_bind - Error: could not send startTLS request: error -11 (Connect error)
[09/Jan/2018:06:47:39.826847902 -0500] - ERR - slapi_ldap_bind - Error: could not send startTLS request: error -11 (Connect error)
[09/Jan/2018:06:47:51.968539276 -0500] - ERR - slapi_ldap_bind - Error: could not send startTLS request: error -11 (Connect error)
[09/Jan/2018:06:48:15.022542284 -0500] - ERR - slapi_ldap_bind - Error: could not send startTLS request: error -11 (Connect error)


Nothing in access log during that instance.
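
To pick the relevant entries out of a log like the one above, a small filter along these lines may help. It is a sketch that assumes the line layout seen in this excerpt (`[timestamp] - LEVEL - source - message`); the function name and the regular expression are illustrative, not part of 389-ds.

```python
import re

# Layout assumed from the excerpt above: [timestamp] - LEVEL - source - message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] - (?P<level>[A-Z]+) - (?P<source>.+?) - (?P<message>.*)$"
)


def find_errors(lines, needle):
    """Return (timestamp, source, message) for ERR lines whose message contains needle."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERR" and needle in match.group("message"):
            hits.append(
                (match.group("timestamp"), match.group("source"), match.group("message"))
            )
    return hits
```

For example, `find_errors(open(path), "Connect error")`, where `path` is the instance's error log, would return only the startTLS and replication bind failures.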

Comment 22 Fraser Tweedale 2018-01-30 12:55:55 UTC
Same or similar errors - a failure to establish replication - occur when
performing DL0 CA installation either as part of ipa-replica-install (with or without --setup-kra) or in ipa-ca-install after performing replica installation without `--setup-ca'.  So the problem does not seem to involve the KRA specifically.

Does not occur in DL1.  One difference between replica installation on DL0 vs DL1 is that in DL0 Dogtag itself sets up the replication as part of pkispawn, whereas in DL1 the database is created and replication is configured before invoking pkispawn.

Specifically, in the ipa-server-install --setup-ca case, the DS error log shows TLS handshake failures ("self signed certificate in chain").  In the ipa-ca-install case, it shows "Unable to get certificate CRL".  AFAICT both master and replica DS NSSDBs contain the CA certificate with correct trust flags.

Feeling quite out of my depth here, and I would appreciate if someone who understands how DS replication is established could take a look at it.

I'll continue investigation tomorrow.

Comment 23 Fraser Tweedale 2018-02-01 08:24:27 UTC
Hoo boy, this is a curly one.

So, I think it *may* be related to a change in openldap-2.4.44-9:

  * Fri Nov 03 2017 Matus Honek <mhonek> - 2.4.44-9 - Build with 
    OpenSSL and MozNSS compatibility layer instead of MozNSS (#1400578)

This occurs between RHEL 7.4 and RHEL 7.5.  Notably, this change does not seem to have occurred in Fedora.

Now, when 389ds detects that the OpenLDAP client is using the OpenSSL backend,
it sets LDAP_CRLCHECK option to `all'.  In setup_ol_tls_conn(LDAP *ld, int clientauth):

        if (slapi_client_uses_openssl(ld)) {
            const int crlcheck = LDAP_OPT_X_TLS_CRL_ALL;
            /* Sets the CRL evaluation strategy. */
            rc = ldap_set_option(ld, LDAP_OPT_X_TLS_CRLCHECK, &crlcheck);
            if (rc) {
                slapi_log_err(SLAPI_LOG_ERR, "setup_ol_tls_conn",
                              "Could not set CRLCHECK [%d]: %d:%s\n",
                              crlcheck, rc, ldap_err2string(rc));
            }
        }

Replication setup uses these methods to communicate between replicas.  I don't fully understand how replication gets set up but will discuss with wibrown tomorrow.  Anyhow, it seems that this causes replication setup to fail; the TLS
doesn't get established because the CRL is not available to check the cert.
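
As an analogy only (this is Python's OpenSSL-backed `ssl` module, not the 389ds or OpenLDAP code), the same three CRL strategies can be expressed as verification flags. The mapping of the names 'none', 'peer' and 'all' onto these flags, and the `client_context` helper itself, are assumptions of this sketch.

```python
import ssl

# Assumed mapping of the CRLCHECK strategies onto OpenSSL verify flags.
CRL_MODES = {
    "none": ssl.VERIFY_DEFAULT,          # no CRL checking
    "peer": ssl.VERIFY_CRL_CHECK_LEAF,   # check only the peer certificate
    "all": ssl.VERIFY_CRL_CHECK_CHAIN,   # check every certificate in the chain
}


def client_context(crl_check="none", ca_file=None, crl_file=None):
    """Build a client-side TLS context with the requested CRL strategy."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if crl_file:
        # load_verify_locations also accepts CRLs in PEM or DER format
        ctx.load_verify_locations(cafile=crl_file)
    ctx.verify_flags |= CRL_MODES[crl_check]
    return ctx
```

With a CRL check enabled and no CRL loaded for the issuer, OpenSSL rejects the peer certificate during the handshake, which is consistent with the "Unable to get certificate CRL" message reported in comment 22.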

This does not occur in domainlevel=1 because in DL1, ipa-(replica|ca)-install sets up replication agreements itself, and configures a GSSAPI replica bind method.  In ipaserver/install/replication.py (note: isgssapi = True):

       if isgssapi:                     
           entry['nsds5replicatransportinfo'] = ['LDAP']       
           entry['nsds5replicabindmethod'] = ['SASL/GSSAPI']   
       else:                                                   
           entry['nsds5replicabinddn'] = [repl_man_dn]         
           entry['nsds5replicacredentials'] = [repl_man_passwd]
           entry['nsds5replicatransportinfo'] = ['TLS']        
           entry['nsds5replicabindmethod'] = ['simple']        

whereas in pkispawn (actually, ConfigurationUtils.java):

      if (replicationSecurity.equals("SSL")) {                             
          attrs.add(new LDAPAttribute("nsDS5ReplicaTransportInfo", "SSL"));
      } else if (replicationSecurity.equals("TLS")) {                      
          attrs.add(new LDAPAttribute("nsDS5ReplicaTransportInfo", "TLS"));
      }                                                                    

(so there is no option to set nsds5replicabindmethod=SASL/GSSAPI).
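
To make the contrast concrete, the IPA branch quoted above can be wrapped in a self-contained function. This is a sketch: `agreement_transport_attrs` and its parameter names are hypothetical, and only the attribute names and values are taken from the excerpt.

```python
def agreement_transport_attrs(use_gssapi, bind_dn=None, bind_password=None):
    """Return the bind/transport attributes of a replication agreement entry."""
    if use_gssapi:
        # DL1 path: SASL/GSSAPI over plain LDAP, so no TLS handshake is involved
        return {
            "nsds5replicatransportinfo": ["LDAP"],
            "nsds5replicabindmethod": ["SASL/GSSAPI"],
        }
    # Simple bind over TLS: the connection is subject to certificate (and CRL) checks
    return {
        "nsds5replicabinddn": [bind_dn],
        "nsds5replicacredentials": [bind_password],
        "nsds5replicatransportinfo": ["TLS"],
        "nsds5replicabindmethod": ["simple"],
    }
```

Only the second form goes through TLS, which is why an agreement of that kind would be the one exposed to the CRL check.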

If my theories are correct (I'm yet to hack up the changes to test them), I
see three possible approaches to resolving (not mutually exclusive):

1. rollback the openldap change from RHEL-7.5.  not a permanent solution but it buys
   time to work out how to fix it properly.

2. update 389ds to make the CRLCHECK configurable.  If the default remains 'all'
   (or 'peer') then we'll also have to update IPA to tweak that setting to 'none'.

3. update Dogtag to allow setting GSS-API based replication agreements.  I'm not
   100% sure if this is feasible in DL0 or whether associated changes in IPA would
   also be required.

Comment 24 Nathan Kinder 2018-02-01 17:41:25 UTC
(In reply to Fraser Tweedale from comment #23)
> 2. update 389ds to make the CRLCHECK configurable.  If the default remains
> 'all'
>    (or 'peer') then we'll also have to update IPA to tweak that setting to
> 'none'.

I believe that this change makes sense (though I still would like to understand more about why the CRL is not available).  I have filed a bug against 389-ds-base to allow this to be configurable:

  https://bugzilla.redhat.com/show_bug.cgi?id=1541108

Comment 26 Fraser Tweedale 2018-02-02 03:56:10 UTC
Nathan, yes I agree, we want to understand why the CRL processing isn't working.
Ideally we could make using CRLs or OCSP work, and even be the default for IPA.
But making 389 configurable in this regard means we can fix the regression quickly and then have time to work out the rest of the puzzle.

Comment 28 Fraser Tweedale 2018-02-07 07:09:56 UTC
Yes, it's caused by a change in openldap (building against OpenSSL instead of NSS now).  Openldap BZ is https://bugzilla.redhat.com/show_bug.cgi?id=1400578.

Comment 29 Fraser Tweedale 2018-02-08 10:26:01 UTC
Kicking it back to ON_QA because there's a new 389 build (I think?) that should
resolve (part of?) the issues...

Comment 30 Nikhil Dehadrai 2018-02-08 11:23:59 UTC
IPA-server: ipa-server-4.5.4-10.el7.x86_64

389-ds package version:
389-ds-base-libs-1.3.7.5-16.el7.x86_64
389-ds-base-1.3.7.5-16.el7.x86_64

Tested the bug with following observations:
1. Setup IPA master with KRA at domain Level 0.

RUNCMD: /usr/sbin/ipa-server-install --setup-dns --forwarder <forwarder_ip> --domain testrelm.test --realm TESTRELM.TEST --admin-password Secret123 --ds-password Secret123 -U --reverse-zone <reverse_zone_details> --allow-zone-overlap --setup-kra --domain-level=0

2. Prepare Replica file for Replica.
3. Setup Replica with kra against ipa-master at domain level0.
RUNCMD: /usr/sbin/ipa-replica-install -U --setup-dns --forwarder <forwarder_ip> --setup-ca --setup-kra --admin-password Secret123 --password Secret123 /var/lib/ipa/replica-info-replica.testrelm.test.gpg


Observations:
1. After step1, ipa-master is installed successfully
2. After step3, the replica installation is successful.

Thus on the basis of the above observations, marking the status of the bug as 'Verified'

Comment 32 Sudhir Menon 2018-02-09 06:54:54 UTC
*** Bug 1542839 has been marked as a duplicate of this bug. ***

Comment 34 errata-xmlrpc 2018-04-10 17:01:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0925