Bug 1953795

Summary: Ironic can't virtual media attach ISOs sourced from ingress routes
Product: OpenShift Container Platform
Reporter: Antoni Segura Puimedon <asegurap>
Component: Bare Metal Hardware Provisioning
Sub component: ironic
Assignee: Bob Fournier <bfournie>
QA Contact: Lubov <lshilin>
Status: CLOSED ERRATA
Docs Contact: jfrye
Severity: high
Priority: high
CC: akrzos, bfournie, ccrum, janders, jfrye, keyoung, lshilin, mhrivnak, ncarboni, rbartal, stbenjam, trwest, tsedovic
Version: 4.8
Keywords: Triaged
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Release Note text: Previously, Ironic failed to download an image for installation because it uses HTTPS by default and did not have the correct certificate bundle available. This issue is fixed by marking the image download as insecure, so the transfer proceeds without certificate verification. (BZ#1953795)
-------
Cause: By default, Ironic uses HTTPS when downloading the image to install.
Consequence: Because Ironic doesn't have the correct certificate bundle available, the image download fails.
Fix: Set the image download as insecure so that the transfer is requested without certificate verification.
Result: The image download succeeds.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 23:04:01 UTC
Type: Bug

Description Antoni Segura Puimedon 2021-04-27 00:22:39 UTC
Description of problem: When using a BMH to boot a live ISO from an HTTPS source served by an OpenShift route, the image fails to attach and boot.


Version-Release number of selected component (if applicable): 4.8


How reproducible: 100%


Steps to Reproduce:
1. Create a Deployment with a webserver serving a bootable ISO
2. Create a Service for it
3. Expose the Service with a Route
4. Put the Route's HTTPS URL in the image section of a BMH object as a live ISO
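
Step 4 can be sketched as a manifest built in Python and printed as JSON for `oc apply -f -`. This is a minimal sketch and not taken from the report: the host name, namespace, BMC address, credentials secret and route URL are hypothetical placeholders, and a real host may need further fields. The `image.url` and `image.format: live-iso` fields are the part relevant to this bug.

```python
import json


def live_iso_bmh(name, iso_url, bmc_address, bmc_secret,
                 namespace="openshift-machine-api"):
    """Return a BareMetalHost manifest that boots iso_url as a live ISO."""
    return {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "BareMetalHost",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "online": True,
            "bmc": {"address": bmc_address, "credentialsName": bmc_secret},
            # The HTTPS URL served by the Route goes here.
            "image": {"url": iso_url, "format": "live-iso"},
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    manifest = live_iso_bmh(
        name="worker-0",
        iso_url="https://iso-server.apps.example.com/live.iso",
        bmc_address="redfish-virtualmedia://192.0.2.10/redfish/v1/Systems/1",
        bmc_secret="worker-0-bmc-secret",
    )
    print(json.dumps(manifest, indent=2))
```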

Actual results:
Ironic fails to download the image for caching because it cannot verify the OpenShift default ingress certificate.


Expected results:
The image is attached to the machine represented by the BMH object on which the ISO was set.


Additional info:

Comment 1 Antoni Segura Puimedon 2021-04-27 00:42:41 UTC
From earlier discussion, there seems to be a way (passthrough) to prevent Ironic from downloading the image, though it may not be available at the moment in the "ramdisk" deploy that the Bare Metal Operator live-iso functionality uses.

Ironic currently defaults to caching for two reasons:
* Ensuring that the image is reachable
* Ensuring that the image stays available until the machine is undeployed, since some BMCs access the image in chunks as needed. In the Central Hub Management the need is probably to ensure that the ISO remains available in the assisted service endpoint until the reboot.

Another angle discussed at the end of last week was for the baremetal operator to change how it deploys Ironic so that it mounts the configmap/secret containing the default Ingress certificate. Ironic could then fetch the image via HTTPS and continue to serve it via HTTP to the BMC (since Ironic does not currently provide certificates to the BMC for HTTPS fetching).

Comment 2 Dmitry Tantsur 2021-04-27 12:58:48 UTC
The naive fix is https://github.com/metal3-io/ironic-image/pull/255, which just makes the image downloading code respect IRONIC_INSECURE. I haven't looked into accepting the actual certificate (no time until Friday).

Comment 3 Dmitry Tantsur 2021-04-27 16:34:34 UTC
We've discussed this on the bug review. Apparently CBO already mounts certificates to each container, so we only need to tell Ironic where they are. The PR above should be reworked to accept the path (or False) via a variable; then we can set it in CBO.

[1] https://github.com/openshift/cluster-baremetal-operator/pull/109

Comment 4 Bob Fournier 2021-04-28 13:48:42 UTC
Added https://github.com/metal3-io/ironic-image/pull/258 to allow setting webserver_verify_ca to a path.

Comment 6 Bob Fournier 2021-04-30 00:18:26 UTC
We still need a fix in cluster-baremetal-operator to set WEBSERVER_CACERT_FILE. Moving this back to POST.
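
The behaviour discussed in comments 2 to 6 can be summarised in a short sketch. It is not the ironic-image code: the function name, the precedence between the two variables and the string form of the result are assumptions made for illustration. Only the names IRONIC_INSECURE, WEBSERVER_CACERT_FILE and webserver_verify_ca come from this bug.

```python
import os


def resolve_webserver_verify_ca(env):
    """Pick a value for webserver_verify_ca from environment variables.

    Assumed precedence (illustrative only):
      1. WEBSERVER_CACERT_FILE set -> verify against that CA bundle path
      2. IRONIC_INSECURE is "true" -> skip verification ("False")
      3. otherwise                 -> verify with the default trust store
    """
    cacert_file = env.get("WEBSERVER_CACERT_FILE", "").strip()
    if cacert_file:
        return cacert_file
    insecure = env.get("IRONIC_INSECURE", "false").strip().lower() == "true"
    return "False" if insecure else "True"


if __name__ == "__main__":
    print("webserver_verify_ca =", resolve_webserver_verify_ca(os.environ))
```

With only the ironic-image change in place and no CA path set, a deployment that sets IRONIC_INSECURE to true would end up in the second case.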

Comment 9 Nick Carboni 2021-05-04 17:52:32 UTC
Our testing for this issue failed for a few reasons.

1. The file configured in https://github.com/openshift/cluster-baremetal-operator/pull/139 doesn't exist.
  - The mounted configmap (cbo-trusted-ca) contains a key named "ca-bundle.crt", not "trusted-ca".
2. The CA bundle in that configmap doesn't establish trust with the default ingress cert for the cluster.
  - This can be tested by using the contents of the configmap to curl an HTTPS route.
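
The check in item 2 can also be done without curl. The following is a rough Python equivalent, assuming the configmap contents have been saved to a local file; the function names and the three result labels are illustrative and not part of any tool mentioned here.

```python
import ssl
import sys
import urllib.error
import urllib.request


def classify_failure(exc):
    """Tell a certificate trust failure apart from other URL errors."""
    if isinstance(exc.reason, ssl.SSLCertVerificationError):
        return "untrusted"
    return "unreachable"


def check_route(url, ca_bundle_path, timeout=10.0):
    """Return "trusted", "untrusted" or "unreachable" for url."""
    # Passing cafile means only the CAs in this bundle are trusted,
    # not the system trust store.
    context = ssl.create_default_context(cafile=ca_bundle_path)
    try:
        with urllib.request.urlopen(url, context=context, timeout=timeout):
            return "trusted"
    except urllib.error.HTTPError:
        # The TLS handshake succeeded; the server returned an error status.
        return "trusted"
    except urllib.error.URLError as exc:
        return classify_failure(exc)
    except OSError:
        # For example a timeout while waiting for the response.
        return "unreachable"


if __name__ == "__main__":
    # Usage: python check_route.py <https-url> <path-to-ca-bundle>
    print(check_route(sys.argv[1], sys.argv[2]))
```

A result of "untrusted" for the route URL would match the failure described in item 2.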

During the debugging session we determined that it might be easier to configure the image URL to point to the cluster-internal service. In this case, the assisted service is using a service signing cert, and the CA bundle for that cert can be retrieved using the steps in https://docs.openshift.com/container-platform/4.7/security/certificate_types_descriptions/service-ca-certificates.html

Alternatively, to fix this bug as reported (accessing the route), the default ingress certificate would need to be fetched from the openshift-config-managed namespace as described in https://docs.openshift.com/container-platform/4.7/security/certificate_types_descriptions/ingress-certificates.html#workflow
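
Whichever configmap ends up supplying the bundle, the key mismatch from item 1 above can be caught before a file path is handed to Ironic. A minimal sketch, assuming the configmap has been exported as JSON (for example with `oc get configmap <name> -n <namespace> -o json`); the function name and error message are illustrative, and the configmap name and key should be taken from the linked documentation rather than from this sketch.

```python
import json
import sys


def extract_bundle(configmap_json, key):
    """Return the text stored under key in a ConfigMap JSON document.

    Raises KeyError listing the keys that do exist, so that a mismatch
    such as "trusted-ca" versus "ca-bundle.crt" is easy to spot.
    """
    data = json.loads(configmap_json).get("data") or {}
    if key not in data:
        available = ", ".join(sorted(data)) or "none"
        raise KeyError(f"key {key!r} not found; available keys: {available}")
    return data[key]


if __name__ == "__main__":
    # Usage: oc get configmap <name> -n <namespace> -o json | python extract.py <key>
    sys.stdout.write(extract_bundle(sys.stdin.read(), sys.argv[1]))
```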

@Bob let me know which direction you want to go and we can adjust accordingly.

Comment 13 Bob Fournier 2021-05-09 13:19:00 UTC
To close on the implementation decision: the patch to set the CA was removed from CBO, so only the ironic-image patch remains, which disables certificate verification.

This is the note from Stephen in the CBO revert:
This is the wrong cert to use for the assisted images; it uses the service CA. This needs more investigation in another release to make the service, ingress, and trusted CA bundles available to our containers, as well as investigating using TLS for the httpd hosted by Ironic.

Comment 14 Chad Crum 2021-05-11 14:06:24 UTC
I tested with 4.8.0-0.nightly-2021-05-10-151546 and Ironic was able to pull from an insecure HTTPS registry (in this case, the assisted service pod's HTTPS API).

@Lubov - Is there any other validation you want to do or can we consider this verified? (Just asking because you are assigned QA on this one)

Comment 15 Lubov 2021-05-11 14:31:47 UTC
good for me

Comment 19 errata-xmlrpc 2021-07-27 23:04:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438