Bug 1649267
| Summary: | [downstream clone - 4.2.8] [RHEL76] libvirt is unable to start after upgrade due to malformed UTCTIME values in cacert.pem, because properly renewed CA certificate was not passed to hosts by executing "Enroll certificate" or "Reinstall" | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | RHV bug bot <rhv-bugzilla-bot> |
| Component: | ovirt-engine | Assignee: | Simone Tiraboschi <stirabos> |
| Status: | CLOSED ERRATA | QA Contact: | Petr Matyáš <pmatyas> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.2.7 | CC: | aperotti, chyan, cshao, didi, gveitmic, huzhao, lsvaty, mperina, mtessun, nashok, qiyuan, ratamir, Rhev-m-bugs, sbonazzo, stirabos, weiwang, yaniwang, yaoxu, ycui, yturgema |
| Target Milestone: | ovirt-4.2.8 | Keywords: | Upgrades, ZStream |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | ovirt-engine-4.2.8.1 | Doc Type: | Enhancement |
| Doc Text: | Internal CAs generated in the past (<= 3.5) can contain UTCTIME values without a timezone indication, which up-to-date openssl and gnutls libraries no longer accept. engine-setup already checked for this and proposed a remediation, but the user could postpone it. The prompt is now more evident, because postponing can cause serious issues. | Story Points: | --- |
| Clone Of: | 1648190 | Environment: | |
| Last Closed: | 2019-01-22 12:44:51 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1648190 | ||
| Bug Blocks: | |||
|
Description
RHV bug bot
2018-11-13 09:44:54 UTC
Forgot this, also, pointing to the same error:

# GNUTLS_DEBUG_LEVEL=9 libvirtd --listen
<...>
gnutls[3]: ASSERT: x509.c:311 <-- ASN1_DER_ERROR
gnutls[3]: ASSERT: x509.c:3496

(Originally by Germano Veit Michel)

More info:

In function gnutls_x509_crt_import, gnutls uses:

RHEL 7.5 : v3.3.26
asn1_der_decoding(&cert->cert, cert->der.data, cert->der.size, NULL);

RHEL 7.6 : v3.3.29
_asn1_strict_der_decode(&cert->cert, cert->der.data, cert->der.size, NULL);

Here: https://gitlab.com/gnutls/gnutls/commit/5691dc3e4331000765c694bc101b445a80c0a9e2

So in RHEL 7.6, it is using the libtasn1 "asn1_der_decoding2" function, with some sort of STRICT option, ASN1_DECODE_FLAG_STRICT_DER.

Not sure if related, but:
* If %ASN1_DECODE_FLAG_STRICT_DER flag is set then the function will
* not decode any BER-encoded elements.

Tried to 'watch' the result return value in gdb to find where it's returning from, but it's optimized out.

So gnutls is using a different function, now from libtasn1, which for some reason does not like that certificate.

(Originally by Germano Veit Michel)

Ran out of time today... here is some more info I got.

Not sure if related. Breaking down the certs, found 2 other differences.

1) from date (UTCTIME type) is slightly different:
110:d=3 hl=2 l= 17 prim: UTCTIME :150517165500+0000
94:d=3 hl=2 l= 13 prim: UTCTIME :181022044108Z

2) The working cert uses UTF8_STRING instead of PRINTABLE_STRING.

Maybe I need to recompile libtasn1 to track down more precisely where it returns ASN1_DER_ERROR; with optimizations it is a bit hard.

(Originally by Germano Veit Michel)

(In reply to Germano Veit Michel from comment #6)
> Ran out of time today... here is some more info I got.
>
> Not sure if related. Breaking down the certs, found 2 other differences.
>
> 1) from date (UTCTIME type) is slightly different:
> 110:d=3 hl=2 l= 17 prim: UTCTIME :150517165500+0000
> 94:d=3 hl=2 l= 13 prim: UTCTIME :181022044108Z

According to https://bugzilla.redhat.com/show_bug.cgi?id=1636023#c15 and https://github.com/openssl/openssl/pull/2668 , the issue is due to the malformed value for UTCTIME: it was acceptable in the past but now openssl and gnutls refuse it.

(Originally by Simone Tiraboschi)

On the valid cert example we see:
Validity
Not Before: Oct 22 04:41:08 2018 GMT
Not After : Oct 20 04:41:08 2028 GMT
while on the bad one:
Validity
Not Before: May 17 16:55:00 2015
Not After : May 15 16:55:00 2025 GMT
so the issue is specific to 'Not Before:' value.
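
The two symptoms noted above (a raw UTCTIME of 150517165500+0000 rather than a value ending in Z, and a 'Not Before:' line printed without GMT) can be checked mechanically. What follows is a minimal Python sketch, not the check used by engine-setup. It assumes the RFC 5280 rule that certificate UTCTIME values take the form YYMMDDHHMMSSZ, the function names are hypothetical, and it only inspects the shape of the strings, not whether the digits form a real date.

```python
import re
from typing import List

# Assumption: RFC 5280 requires certificate UTCTime to be exactly YYMMDDHHMMSSZ.
_STRICT_UTCTIME = re.compile(r"\d{12}Z")


def is_strict_utctime(value: str) -> bool:
    """Return True if value has the YYMMDDHHMMSSZ shape (12 digits plus 'Z')."""
    return _STRICT_UTCTIME.fullmatch(value) is not None


def validity_lines_missing_gmt(openssl_text: str) -> List[str]:
    """Return the Not Before / Not After lines that do not end in 'GMT'.

    openssl_text is expected to be the human-readable dump of a certificate,
    such as the Validity excerpts quoted in this bug.
    """
    flagged = []
    for line in openssl_text.splitlines():
        stripped = line.strip()
        is_validity_line = stripped.startswith(("Not Before:", "Not After :"))
        if is_validity_line and not stripped.endswith("GMT"):
            flagged.append(stripped)
    return flagged
```

Applied to the values in this report, `is_strict_utctime` accepts 181022044108Z and rejects 150517165500+0000, and `validity_lines_missing_gmt` flags only the 'Not Before:' line of the bad certificate.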
But in engine-setup we already have code to handle that case:
https://github.com/oVirt/ovirt-engine/blob/master/packaging/setup/plugins/ovirt-engine-setup/ovirt-engine/pki/ca.py#L275
Now we have to understand if it never triggered or if it got correctly triggered on engine side but the CA cert was never redistributed to all the hosts.
(Originally by Simone Tiraboschi)
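
One way to tell those two cases apart is to compare the CA certificate held by the engine with the copy on each host. The sketch below is only an illustration under stated assumptions: it takes the PEM text of both files as strings, hashes the decoded DER bytes of the first certificate block, and reports whether they match. The function names are hypothetical and nothing here is part of ovirt-engine.

```python
import base64
import hashlib
import re

# Matches the body of the first PEM certificate block in a string.
_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def pem_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 hex digest of the first certificate's DER bytes."""
    match = _PEM_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no PEM certificate block found")
    # Remove line breaks and other whitespace before base64-decoding.
    der = base64.b64decode("".join(match.group(1).split()))
    return hashlib.sha256(der).hexdigest()


def same_certificate(engine_pem: str, host_pem: str) -> bool:
    """Return True if both PEM strings carry the same certificate."""
    return pem_fingerprint(engine_pem) == pem_fingerprint(host_pem)
```

If `same_certificate` returns False for the engine's CA file and a host's copy, the renewed CA was not redistributed to that host.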
According to the engine-setup logs, it seems that engine-setup correctly detected the issue and offered to start the PKI renewal process as per https://access.redhat.com/solutions/1572983

2018-05-02 10:51:51 DEBUG otopi.plugins.otopi.dialog.human human.queryString:145 query OVESETUP_RENEW_PKI
2018-05-02 10:51:51 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND One or more of the certificates should be renewed, because they expire soon, or include an invalid expiry date, or do not include the subjectAltName extension, which can cause them to be rejected by recent browsers.
2018-05-02 10:51:51 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND If you choose "No", you will be asked again the next time you run Setup.
2018-05-02 10:51:51 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND See https://access.redhat.com/solutions/1572983 for more details.
2018-05-02 10:51:51 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Renew certificates? (Yes, No) [No]:
2018-05-02 10:51:51 DEBUG otopi.context context.dumpEnvironment:760 ENVIRONMENT DUMP - BEGIN
2018-05-02 10:51:51 DEBUG otopi.context context.dumpEnvironment:770 ENV OVESETUP_PKI/renew=bool:'False'

The user rejected it at least 5 times:

[root@t470s setup]# grep -R "Renew PKI"
ovirt-engine-setup-20171213144824-oppz1d.log:2017-12-13 14:49:16 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Renew PKI : False
ovirt-engine-setup-20171213144427-y922jr.log:2017-12-13 14:45:36 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Renew PKI : False
ovirt-engine-setup-20180102112424-dmayx8.log:2018-01-02 11:26:04 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Renew PKI : False
ovirt-engine-setup-20180102182847-cf5zz9.log:2018-01-02 18:29:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Renew PKI : False
ovirt-engine-setup-20180502105103-z9uh1h.log:2018-05-02 10:52:04 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Renew PKI : False

I have to add that the default response is No, while Yes probably makes more sense now, also because of this issue.

(Originally by Simone Tiraboschi)

My customer already has the ca.pem in the engine renewed and corrected long back.

openssl x509 -in ca.pem -noout -dates
notBefore=Mar 24 06:33:24 2017 GMT
notAfter=Mar 22 06:33:24 2027 GMT

But the host still has the old CA.

openssl x509 -in certs/cacert.pem -noout -dates
notBefore=Jun 14 13:41:58 2015
notAfter=Jun 12 13:41:58 2025 GMT

I think engine-setup should automatically push the CA certificate to the hosts instead of asking the user to manually enroll the certificate, as only the CA has changed and the vdsm certificates are still valid. Can we do this automatically during engine-setup?

(Originally by Nijin Ashok)

(In reply to nijin ashok from comment #15)
> My customer already has the ca.pem in the engine renewed and corrected long back.
...
> I think the engine-setup should automatically push the CA certificate
> automatically to the hosts instead of asking the user to manually enroll the
> certificate as it's only CA has been changed and the vdsm certificates are
> still valid. Can we do this automatically during engine-setup?

For sure not during engine-setup: we have to restart libvirt and vdsm to make it effective, so the host has to be in maintenance.
I think we should instead re-enroll host certs (just if needed?) during host upgrades, at least if the upgrade is triggered from the engine.

Martin, do you know if we are already re-enrolling certs in host upgrades led by the engine?
Nijin, do you know if the customer triggered the host upgrade from the engine as recommended?

(Originally by Simone Tiraboschi)

What are the reproduction steps or how do I reach the change that is linked to this bug? Should the bad certificate linked to this bug work correctly?

Verified on ovirt-engine-4.2.8.1-0.0.master.20181127093701.gite34295b.el7.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0121 |