Bug 1321381

Summary: hosted-engine-setup trusts also the system defined CA certs while the oVirt python SDK ignores them
Product: Red Hat Enterprise Virtualization Manager
Reporter: nijin ashok <nashok>
Component: ovirt-hosted-engine-setup
Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED ERRATA
QA Contact: Artyom <alukiano>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.6.3
CC: gklein, juan.hernandez, lsurette, melewis, nashok, sbonazzo, ykaul
Target Milestone: ovirt-4.0.0-beta
Keywords: Triaged, ZStream
Target Release: 4.0.0
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when a user replaced the Apache certificate, hosted-engine-setup trusted the system defined CA certificates and skipped the certificate question when deploying a second host. As a result, the deployment of the second host failed because the certificate was not actually trusted by the oVirt Python SDK. Now, hosted-engine-setup no longer skips the certificate question: it prompts the user for the correct CA certificate or asks whether to proceed in insecure mode.
Story Points: ---
Clone Of:
: 1334702
Environment:
Last Closed: 2016-08-23 21:00:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1326386
Bug Blocks: 1146710, 1334702

Description nijin ashok 2016-03-26 00:00:14 UTC
Description of problem:

As per the document below, it's possible to change the SSL certificate of the web portal.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.6/html/Administration_Guide/appe-Red_Hat_Enterprise_Virtualization_and_SSL.html#Replacing_the_SSL_certificate_used_by_Red_Hat_Enterprise_Virtualization_Manager_to_identify_itself_to_users_connecting_over_https

However, when we add a second host to hosted-engine, it still downloads the internal default CA certificate located at /etc/pki/ovirt-engine/ca.pem and tries to add the host to RHEV-M using this certificate. This fails with

ConnectionError: [ERROR]::RHEV API connection failure, (60, "Peer's certificate issuer has been marked as not trusted by the user.")


Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.3.3.4-1.el7ev.noarch
Red Hat Enterprise Virtualization 3.6

How reproducible:
100%

Steps to Reproduce:

1. Change the apache certificate of RHEV-M with a custom SSL certificate.

2. Try to add a new host to the hosted-engine setup.

3. During the process of adding this host to RHEV-M, it fails with error 

"Peer's certificate issuer has been marked as not trusted by the user"

Actual results:

An additional HE host cannot be added when a custom Apache SSL certificate is in use

Expected results:

It should be possible to add an additional HE host when a custom Apache SSL certificate is in use

Additional info:

Comment 3 Sandro Bonazzola 2016-04-11 10:00:32 UTC
Targeting to 4.1, since there is a workaround for this.

Comment 4 Simone Tiraboschi 2016-04-11 12:54:46 UTC
Nijin, did the custom Apache cert validate against the system's default CA certificates?

Comment 5 Simone Tiraboschi 2016-04-11 14:43:20 UTC
This is quite sneaky:
hosted-engine-setup already tries to validate the CA cert it's going to use and queries the user if it is not valid; the flow is:
1. download the initial engine CA cert from https://{fqdn}/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA in insecure mode
2. download the initial engine CA cert again from https://{fqdn}/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA in secure mode, using the CA cert downloaded at step 1 to validate it (* the issue is here!!!)
3. if 2 is OK, go to 4, otherwise loop on 2 till the user provides a valid custom CA cert or chooses insecure mode
4. download the engine SSH pub key from https://{fqdn}/ovirt-engine/services/pki-resource?resource=engine-certificate&format=OPENSSH-PUBKEY in secure mode with the internal/custom CA or in insecure mode, according to the user's choice
5. use the oVirt Python SDK with the internal/custom CA or in insecure mode, according to the user's choice
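
Steps 2 and 3 of the flow above can be sketched as a small loop. This is only an illustration in Python 3, not the actual hosted-engine-setup code: `choose_trust`, `validates` and `ask_user` are hypothetical names, and the secure download is abstracted behind the `validates` callable.

```python
def choose_trust(validates, internal_ca, ask_user):
    """Return (ca_file, insecure) following steps 2 and 3 of the flow.

    validates(ca_file) -> bool: True if the secure download succeeds
        when the server is verified against ca_file (hypothetical).
    ask_user() -> str or None: path of a custom CA file, or None if
        the user chooses insecure mode (hypothetical).
    """
    ca_file = internal_ca
    # Loop on step 2 until a CA validates or the user gives up.
    while not validates(ca_file):
        ca_file = ask_user()
        if ca_file is None:
            return None, True
    return ca_file, False
```

The bug discussed here corresponds to `validates` returning True for the internal CA even though that CA did not sign the Apache cert, because the check also consulted the system CA store.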

In your logs, all the steps from 1 to 4 were OK, so it proceeded to step 5 assuming that the CA cert was OK, but it failed there.
The issue is at point 2!
If available, we call the ssl.create_default_context() method, which was introduced in Python 3.4, backported to Python 2.7.9, and further backported to Python 2.7.5 by Red Hat, so it is available on RHEL 7 (see rhbz#1259421).

The issue is that ssl.create_default_context() also loads the system defined CA certs by default, while the older ssl.wrap_socket() loaded just the ones in the provided file.

On the other side, the oVirt engine SDK uses pycurl, which probably loads just the CA cert in the passed file. So the hosted-engine-setup validation doesn't complain, since the custom Apache cert validates against the system wide CAs, but the SDK then uses just the provided CA cert and so it fails.
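
A minimal sketch of the difference, assuming Python 3.6 or later (the reproducer in this bug targets Python 2). `strict_context` and `loaded_ca_count` are hypothetical helpers, not part of hosted-engine-setup: the first builds a context that trusts only the CA file passed in, whereas ssl.create_default_context() also loads the platform's CA store.

```python
import ssl


def strict_context(cafile=None):
    """Build a client context that trusts only the CAs in `cafile`."""
    # PROTOCOL_TLS_CLIENT turns on CERT_REQUIRED and hostname checking,
    # but, unlike create_default_context(), loads no CA certs on its own.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cafile is not None:
        ctx.load_verify_locations(cafile=cafile)
    return ctx


def loaded_ca_count(ctx):
    """Number of CA certs currently loaded into the context."""
    return ctx.cert_store_stats()['x509_ca']
```

Comparing `loaded_ca_count(strict_context())` with `loaded_ca_count(ssl.create_default_context())` on a given host shows whether the system store was pulled in. The second figure depends on the platform, and directory-based stores may be loaded lazily and so not be counted.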

Two options here:
1. fix hosted-engine-setup to use just the provided CA skipping the system defined CA certs
2. fix oVirt SDK to also use the system wide CA certs

Personally I think that trusting by default also the system defined CA certs is a good idea so I'm for option 2

Comment 6 Simone Tiraboschi 2016-04-11 14:45:59 UTC
Reproducible with:

from ovirt_hosted_engine_setup import ohttpshandler
import os
import ssl
import tempfile

fqdn = 'enginevm.localdomain'
url = (
    'https://{fqdn}/ovirt-engine/services/pki-resource'
    '?resource=ca-certificate&format=X509-PEM-CA'
).format(
    fqdn=fqdn,
)

# Step 1 of the flow: fetch the engine CA cert without verification.
he_https_handler = ohttpshandler.OVHTTPSHandler()
code, info, content = he_https_handler.fetchUrl(
    url=url,
    ca_certs=None,
)
if code == 200:
    fd, cert = tempfile.mkstemp(
        prefix='engine-ca',
        suffix='.crt',
    )
    # Reuse the descriptor returned by mkstemp() instead of leaking it.
    with os.fdopen(fd, 'w') as fileobj:
        fileobj.write(content)
    # create_default_context() already loads the system CA certs...
    context = ssl.create_default_context()
    # ...so this only adds the engine CA on top of them.
    context.load_verify_locations(cafile=cert)
    context.verify_mode = ssl.CERT_REQUIRED
    # check_hostname expects a bool, not the ssl.match_hostname function.
    context.check_hostname = True
    print context.get_ca_certs()

That prints:
[{'notBefore': u'Apr 16 07:09:14 2007 GMT', 'serialNumber': u'49330001', 'notAfter': 'Apr 16 07:09:14 2027 GMT', 'version': 3L, 'subject': ((('countryName', u'CN'),), (('organizationName', u'CNNIC'),), (('commonName', u'CNNIC ROOT'),)), 'issuer': ((('countryName', u'CN'),), (('organizationName', u'CNNIC'),), (('commonName', u'CNNIC ROOT'),))}, ....
{'notBefore': u'Apr  7 15:56:37 2016 GMT', 'serialNumber': u'1000', 'notAfter': 'Apr  6 15:56:37 2026 GMT', 'version': 3L, 'subject': ((('countryName', u'US'),), (('organizationName', u'localdomain'),), (('commonName', u'enginevm.localdomain.31641'),)), 'issuer': ((('countryName', u'US'),), (('organizationName', u'localdomain'),), (('commonName', u'enginevm.localdomain.31641'),))}]

Comment 7 Simone Tiraboschi 2016-04-11 14:59:14 UTC
The explanation is here:

https://docs.python.org/3/library/ssl.html#ssl-certificates

18.2.4.2. CA certificates
If you are going to require validation of the other side of the connection’s certificate, you need to provide a “CA certs” file, filled with the certificate chains for each issuer you are willing to trust. Again, this file just contains these chains concatenated together. For validation, Python will use the first chain it finds in the file which matches. The platform’s certificates file can be used by calling SSLContext.load_default_certs(), *** this is done automatically with create_default_context(). ***

Comment 8 Juan Hernández 2016-04-11 15:00:46 UTC
If I understand correctly the root problem here is that the hosted engine setup is downloading a certificate that may not be the certificate actually used by the web server. I think that the solution should be to get the certificate directly from the SSL handshake, and not from any of the URLs.

The SDK can also be modified to use the system wide CA certificates, but that won't solve the problem if the certificate used by the web server isn't in that database.

Also, the SDK should give its users full control, so if we decide to modify it to trust the system wide CA certificates it should be optional, and disabled by default, to preserve backwards compatibility.

Comment 9 Simone Tiraboschi 2016-04-11 15:19:09 UTC
(In reply to Juan Hernández from comment #8)
> If I understand correctly the root problem here is that the hosted engine
> setup is downloading a certificate that may not be the certificate actually
> used by the web server. I think that the solution should be to get the
> certificate directly from the SSL handshake, and not from any of the URLs.

engine-setup creates an internal CA and signs all the certs, including the Apache one, with that CA acting as a root CA.

hosted-engine-setup downloads the oVirt internal CA from https://{fqdn}/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA

As per our instructions, the user can replace the Apache cert with a proper one signed by a third-party commercial CA.
But, AFAIK, HTTPS doesn't require the server to provide the whole cert chain; for instance, it is quite common to omit the root CA cert, so the SSL handshake is not enough by itself to also get the root CA cert.

> The SDK can also be modified to use the system wide CA certificates, but
> that won't solve the problem if the certificate used by the web server isn't
> in that database.

In that case hosted-engine-setup will not validate it either, and it already asks the user to provide the missing CA certs file or to proceed in insecure mode.
If the user provides a custom CA chain, hosted-engine-setup will also pass that to the Python SDK. If the user decides for insecure mode, hosted-engine-setup uses the Python SDK in insecure mode.
So no issue here.

> Also, the SDK should give its users full control, so if we decide to modify
> it to trust the system wide CA certificates it should be optional, and
> disabled by default, to preserve backwards compatibility.

+1

Comment 10 Juan Hernández 2016-04-11 15:37:42 UTC
The TLS handshake must provide at least the last CA certificate, the one that was used to actually sign the web server certificate; otherwise it is impossible to build the trust chain. If the hosted engine setup gets this from the handshake and passes it to the SDK (after additional validation, maybe), then things should work regardless of which certificate is used by the web server.

Please open an SDK bug for the change to support the system wide CA certificates.

Comment 11 nijin ashok 2016-04-18 16:44:49 UTC
(In reply to Simone Tiraboschi from comment #4)
> Nijin, was the custom Apache cert valid if validated with system’s default
> CA certificates?

The customer was using a certificate signed by GoDaddy, so it was trusted. However, I think the issue is clear now and you were able to replicate it.

Comment 13 Artyom 2016-05-22 15:16:11 UTC
Verified on ovirt-hosted-engine-setup-2.0.0-0.2.master.20160518095239.git7459b63.el7.centos.noarch

1) Deploy hosted-engine on the first host
2) Change the Apache certificate on the engine VM according to https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.6/html/Administration_Guide/appe-Red_Hat_Enterprise_Virtualization_and_SSL.html#Replacing_the_SSL_certificate_used_by_Red_Hat_Enterprise_Virtualization_Manager_to_identify_itself_to_users_connecting_over_https (the guide does not explain the SELinux labels); I also needed to change the CA certificate in the truststore
3) Add the custom CA that signed the new Apache certificate to the second host
   * cp my-custom-ca.pem /etc/pki/ca-trust/source/anchors/
   * update-ca-trust
4) Deploy the second host
...
[ INFO  ] The following CA certificate is going to be used, please immediately interrupt if not correct:
[ INFO  ] Issuer: C=US, O=qa.lab.tlv.redhat.com, CN=alukiano-he-1.qa.lab.tlv.redhat.com.86247, Subject: C=US, O=qa.lab.tlv.redhat.com, CN=alukiano-he-1.qa.lab.tlv.redhat.com.86247, Fingerprint (SHA-1): 85DC095268B216C14BBA5F960D8AB09F4E7BF336
          The REST API cert couldn't be trusted with the internal CA cert
          Would you like to continue in insecure mode (not recommended)?
          If not, please provide your CA cert at /etc/pki/CA/ovirtcustomcacert.pem before continuing
          (Yes, No)[No]? Yes
[ INFO  ] Connecting to the Engine
[ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO  ] The VDSM Host is now operational
...
Deployment succeeded without any problems
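
The SHA-1 fingerprint printed in the log above can be cross-checked against a CA file obtained out of band. A minimal Python 3 sketch, assuming the input holds a single PEM certificate; `sha1_fingerprint` is a hypothetical helper, not part of hosted-engine-setup:

```python
import hashlib
import ssl


def sha1_fingerprint(pem_text):
    """Return the upper-case hex SHA-1 digest of a PEM cert's DER form."""
    # The fingerprint is computed over the binary (DER) encoding,
    # not over the base64 text of the PEM file.
    der = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.sha1(der).hexdigest().upper()
```

The result uses the same format as the log line (40 hex digits without separators); `openssl x509 -noout -fingerprint -sha1 -in <file>` reports the same digest with colons between the bytes.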

Comment 15 errata-xmlrpc 2016-08-23 21:00:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1744.html