Description of problem:
As per the document below, it is possible to change the SSL certificate of the web portal.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.6/html/Administration_Guide/appe-Red_Hat_Enterprise_Virtualization_and_SSL.html#Replacing_the_SSL_certificate_used_by_Red_Hat_Enterprise_Virtualization_Manager_to_identify_itself_to_users_connecting_over_https
However, when we add a second host to hosted-engine, the setup still downloads the internal default CA certificate located at /etc/pki/ovirt-engine/ca.pem and tries to add the host to RHEV-M using this certificate. This fails with:
ConnectionError: [ERROR]::RHEV API connection failure, (60, "Peer's certificate issuer has been marked as not trusted by the user.")

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.3.3.4-1.el7ev.noarch
Red Hat Enterprise Virtualization 3.6

How reproducible:
100%

Steps to Reproduce:
1. Replace the Apache certificate of RHEV-M with a custom SSL certificate.
2. Try to add a new host to the hosted-engine setup.
3. While the host is being added to RHEV-M, the process fails with the error "Peer's certificate issuer has been marked as not trusted by the user".

Actual results:
An additional HE host cannot be added when a custom Apache SSL certificate is in use.

Expected results:
An additional HE host can be added when a custom Apache SSL certificate is in use.

Additional info:
Targeting to 4.1, since there is a workaround for this.
Nijin, was the custom Apache cert valid if validated with system’s default CA certificates?
This is quite sneaky: hosted-engine-setup already tries to validate the CA cert it's going to use and queries the user if it is not valid. The flow is:
1. download the initial engine CA cert from https://{fqdn}/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA in insecure mode
2. download the initial engine CA cert again from https://{fqdn}/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA in secure mode, using the CA cert downloaded at step 1 to validate it (* the issue is here!!!)
3. if 2 is OK, go to 4; otherwise loop on 2 until the user provides a valid custom CA cert or chooses insecure mode
4. download the engine SSH pub key from https://{fqdn}/ovirt-engine/services/pki-resource?resource=engine-certificate&format=OPENSSH-PUBKEY in secure mode with the internal/custom CA, or in insecure mode, according to the user's choice
5. use the oVirt Python SDK with the internal/custom CA, or in insecure mode, according to the user's choice

In your logs, all the steps from 1 to 4 were OK, so it proceeded to step 5 assuming that the CA cert was OK, but it failed there.

The issue is at point 2!
If available, we call the new ssl.create_default_context() method, introduced with Python 3.4, backported to Python 2.7.9, and backported to Python 2.7.5 by Red Hat, so it is available in RHEL7 (see rhbz#1259421).
The issue is that ssl.create_default_context() also loads the system-defined CA certs by default, while the older ssl.wrap_socket() loaded just the ones in the provided file.
On the other side, the oVirt engine SDK uses pycurl, which probably loads just the CA cert in the passed file. So hosted-engine-setup validation doesn't complain, since the custom Apache cert validates against the system-wide CAs, while the SDK then uses just the provided one and so it fails.

Two options here:
1. fix hosted-engine-setup to use just the provided CA, skipping the system-defined CA certs
2. fix the oVirt SDK to also use the system-wide CA certs

Personally I think that also trusting the system-defined CA certs by default is a good idea, so I'm for option 2.
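Steps 1 and 2 of the flow above can be sketched with the Python 3 standard library. This is only an illustration, not the hosted-engine-setup code: it does not use ohttpshandler, and the helper names (pki_resource_url, insecure_context, strict_context, fetch) are hypothetical. The point it tries to show is what option 1 would look like, i.e. a "secure mode" context that trusts only the provided CA file.

```python
import ssl
import urllib.request

# Path taken from the URLs quoted in the flow above.
PKI_PATH = "/ovirt-engine/services/pki-resource"


def pki_resource_url(fqdn, resource, fmt):
    """Build a pki-resource URL (hypothetical helper)."""
    return "https://{fqdn}{path}?resource={resource}&format={fmt}".format(
        fqdn=fqdn, path=PKI_PATH, resource=resource, fmt=fmt,
    )


def insecure_context():
    """Step 1: no certificate validation at all."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # must be disabled before CERT_NONE is allowed
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def strict_context(cafile):
    """Step 2 as option 1 would have it: trust ONLY the given CA file.

    A bare SSLContext starts with an empty trust store, so the system
    CA bundle is never consulted.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # CERT_REQUIRED + hostname check
    ctx.load_verify_locations(cafile=cafile)
    return ctx


def fetch(url, context, timeout=30):
    """Download a resource using the given SSL context."""
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.read()
```

With a context built this way, step 2 would presumably fail for a GoDaddy-signed Apache cert when only the internal CA is supplied, which is the same outcome the SDK produces at step 5.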
Reproducible with:

from ovirt_hosted_engine_setup import ohttpshandler
import tempfile
import ssl

fqdn = 'enginevm.localdomain'
url='https://{fqdn}/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA'.format(
    fqdn=fqdn,
)
he_https_handler = ohttpshandler.OVHTTPSHandler()
code, info, content = he_https_handler.fetchUrl(
    url=url,
    ca_certs=None,
)
if code==200:
    fd, cert = tempfile.mkstemp(
        prefix='engine-ca',
        suffix='.crt',
    )
    with open(cert, 'w') as fileobj:
        fileobj.write(content)
    context = ssl.create_default_context()
    context.load_verify_locations(cafile=cert)
    context.verify_mode = ssl.CERT_REQUIRED
    context.check_hostname = ssl.match_hostname
    print context.get_ca_certs()

That prints:

[{'notBefore': u'Apr 16 07:09:14 2007 GMT', 'serialNumber': u'49330001', 'notAfter': 'Apr 16 07:09:14 2027 GMT', 'version': 3L, 'subject': ((('countryName', u'CN'),), (('organizationName', u'CNNIC'),), (('commonName', u'CNNIC ROOT'),)), 'issuer': ((('countryName', u'CN'),), (('organizationName', u'CNNIC'),), (('commonName', u'CNNIC ROOT'),))},
 ....
 {'notBefore': u'Apr 7 15:56:37 2016 GMT', 'serialNumber': u'1000', 'notAfter': 'Apr 6 15:56:37 2026 GMT', 'version': 3L, 'subject': ((('countryName', u'US'),), (('organizationName', u'localdomain'),), (('commonName', u'enginevm.localdomain.31641'),)), 'issuer': ((('countryName', u'US'),), (('organizationName', u'localdomain'),), (('commonName', u'enginevm.localdomain.31641'),))}]
The explanation is here:
https://docs.python.org/3/library/ssl.html#ssl-certificates

> 18.2.4.2. CA certificates
> If you are going to require validation of the other side of the connection’s
> certificate, you need to provide a “CA certs” file, filled with the
> certificate chains for each issuer you are willing to trust. Again, this file
> just contains these chains concatenated together. For validation, Python will
> use the first chain it finds in the file which matches. The platform’s
> certificates file can be used by calling SSLContext.load_default_certs(),
> *** this is done automatically with create_default_context(). ***
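The difference described in the quoted documentation can be observed directly. The following is a minimal Python 3 sketch, not the hosted-engine-setup code; the helper names (ca_count, default_context, file_only_context) are made up for illustration. It compares how many CA certificates each kind of context holds before any connection is made.

```python
import ssl


def ca_count(context):
    """Number of CA certificates currently loaded into the context.

    Note: certificates coming from a capath directory are loaded lazily,
    so this can undercount what the context would actually trust.
    """
    return context.cert_store_stats()["x509_ca"]


def default_context():
    """What the reproducer uses: system CA certs are loaded automatically."""
    return ssl.create_default_context()


def file_only_context(cafile=None):
    """A context that trusts only what is explicitly loaded from cafile."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cafile is not None:
        ctx.load_verify_locations(cafile=cafile)
    return ctx


if __name__ == "__main__":
    print("create_default_context():", ca_count(default_context()))
    print("bare SSLContext:         ", ca_count(file_only_context()))
```

On a host with a populated system bundle the first number is typically in the hundreds, while the second is zero until a file is loaded. Per the Python documentation, create_default_context() only falls back to the system defaults when its cafile, capath, and cadata arguments are all None, so passing the CA file to it directly (rather than calling load_verify_locations() afterwards, as the reproducer does) is another way to avoid mixing in the system CAs.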
If I understand correctly the root problem here is that the hosted engine setup is downloading a certificate that may not be the certificate actually used by the web server. I think that the solution should be to get the certificate directly from the SSL handshake, and not from any of the URLs. The SDK can also be modified to use the system wide CA certificates, but that won't solve the problem if the certificate used by the web server isn't in that database. Also, the SDK should give its users full control, so if we decide to modify it to trust the system wide CA certificates it should be optional, and disabled by default, to preserve backwards compatibility.
(In reply to Juan Hernández from comment #8)
> If I understand correctly the root problem here is that the hosted engine
> setup is downloading a certificate that may not be the certificate actually
> used by the web server. I think that the solution should be to get the
> certificate directly from the SSL handshake, and not from any of the URLs.

engine-setup creates an internal CA and signs all the certs, including the Apache one, with that CA acting as a root CA.
hosted-engine-setup downloads the oVirt internal CA from:
https://{fqdn}/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA

As per our instructions, the user can replace the Apache cert with a proper one signed by a third-party commercial CA.
But, AFAIK, HTTPS doesn't mandate providing the whole cert chain; for instance, it is quite common to omit the root CA cert, so the SSL handshake is not enough by itself to also get the root CA cert.

> The SDK can also be modified to use the system wide CA certificates, but
> that won't solve the problem if the certificate used by the web server isn't
> in that database.

In that case hosted-engine-setup will not validate it either, and it already asks the user to provide the missing CA certs file or to proceed in insecure mode.
If the user provides a custom CA chain, hosted-engine-setup will also pass that to the Python SDK.
If the user decides for insecure mode, hosted-engine-setup uses the Python SDK in insecure mode.
So no issue here.

> Also, the SDK should give its users full control, so if we decide to modify
> it to trust the system wide CA certificates it should be optional, and
> disabled by default, to preserve backwards compatibility.

+1
The TLS handshake must provide at least the last CA certificate, the one that was used to actually sign the web server certificate; otherwise it is impossible to build the trust chain. If the hosted engine setup gets this from the handshake and passes it to the SDK (after additional validation, maybe), then things should work regardless of what certificate is used by the web server.

Please open an SDK bug for the change to support the system wide CA certificates.
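As a rough illustration of reading certificates from the handshake, here is a Python 3 standard-library sketch. It is an assumption-laden example, not what hosted-engine-setup does: ssl.get_server_certificate() returns only the server's own (leaf) certificate, not the issuing CA certificate discussed above, so obtaining the rest of the chain would need a different API. The helper names are hypothetical.

```python
import hashlib
import ssl


def fetch_leaf_certificate(host, port=443):
    """Return the server's own certificate as PEM text.

    No validation is performed here; the result must be checked by
    other means before it is trusted.
    """
    return ssl.get_server_certificate((host, port))


def sha1_fingerprint(pem_cert):
    """Uppercase hex SHA-1 digest of the DER form of a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha1(der).hexdigest().upper()
```

Comparing sha1_fingerprint(fetch_leaf_certificate(fqdn)) with the fingerprint of the certificate downloaded from the pki-resource URL would at least reveal whether the web server is presenting a certificate other than the one the setup downloaded.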
(In reply to Simone Tiraboschi from comment #4)
> Nijin, was the custom Apache cert valid if validated with system’s default
> CA certificates?

The customer was using a certificate signed by GoDaddy, so it was trusted. However, I think the issue is clear now and you were able to replicate it.
Verified on ovirt-hosted-engine-setup-2.0.0-0.2.master.20160518095239.git7459b63.el7.centos.noarch

1) Deploy hosted-engine on the first host
2) Change the Apache certificate on the engine VM according to https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.6/html/Administration_Guide/appe-Red_Hat_Enterprise_Virtualization_and_SSL.html#Replacing_the_SSL_certificate_used_by_Red_Hat_Enterprise_Virtualization_Manager_to_identify_itself_to_users_connecting_over_https (the document does not explain the SELinux labels); I also needed to change the CA certificate under the truststore
3) Add the custom CA that signed the new Apache certificate to the second host
   * cp my-custom-ca.pem /etc/pki/ca-trust/source/anchors/
   * update-ca-trust
4) Deploy the second host

...
[ INFO ] The following CA certificate is going to be used, please immediately interrupt if not correct:
[ INFO ] Issuer: C=US, O=qa.lab.tlv.redhat.com, CN=alukiano-he-1.qa.lab.tlv.redhat.com.86247, Subject: C=US, O=qa.lab.tlv.redhat.com, CN=alukiano-he-1.qa.lab.tlv.redhat.com.86247, Fingerprint (SHA-1): 85DC095268B216C14BBA5F960D8AB09F4E7BF336
The REST API cert couldn't be trusted with the internal CA cert
Would you like to continue in insecure mode (not recommended)?
If not, please provide your CA cert at /etc/pki/CA/ovirtcustomcacert.pem before continuing (Yes, No)[No]? Yes
[ INFO ] Connecting to the Engine
[ INFO ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO ] The VDSM Host is now operational
...

Deployment succeeded without any problems.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-1744.html