Bug 1985483
| Summary: | Cleaning a BMH deployed using live ISO results in a TLS failure | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ian Main <imain> |
| Component: | Bare Metal Hardware Provisioning | Assignee: | Dmitry Tantsur <dtantsur> |
| Bare Metal Hardware Provisioning sub component: | ironic | QA Contact: | Lubov <lshilin> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | medium | CC: | lshilin, rbartal, rpittau, zbitter |
| Version: | 4.9 | Keywords: | Triaged |
| Target Milestone: | --- | | |
| Target Release: | 4.9.0 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:40:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

> However when performing deprovisioning we are seeing an SSL error.
On all hosts, both internal and external?
Could you please attach all logs from the Metal3 pod?
Additionally, would it be possible to log into the ramdisk and fetch the complete logs? BMO should have a parameter to set an SSH key. If not, maybe make a video of booting? I lack some information that precedes the screenshot.

*** Bug 1986118 has been marked as a duplicate of this bug. ***

(In reply to Dmitry Tantsur from comment #1)
> > However when performing deprovisioning we are seeing an SSL error.
>
> On all hosts, both internal and external?

Yes. Once you set the external IP option it fails since this changes the external_callback_url and callback_endpoint_override in the ironic configuration. Yes it fails for both locally provisioned and externally provisioned hosts as they both use the same new callbacks. The certs generated by CBO are for the provisioning IP and not the external IP so SSL validation fails.

> Could you please attach all logs from the Metal3 pod?

I don't think there's anything useful in there.

> Additionally, would it be possible to log into the ramdisk and fetch the
> complete logs? BMO should have a parameter to set an SSH key. If not, maybe
> make a video of booting? I lack some information that preceds the screenshot.

None of the extra kernel params in the ironic.conf seem to be taking effect for the cleaning operation. I edited the grub command line to add a console and managed to get a log. The boot params are:

[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz root=/dev/ram0 text ipa-api-url=https://192.168.111.21:6385 ipa-agent-token=igSKNii0jy2mqgcNe17_pzYvrWUpq5FQpMmdLgyho3o ipa-debug=1 boot_method=vmedia console=ttyS0 vga=normal nomodeset --

(I added console/vga/nomodeset by hand)

If I add ipa-insecure=1 to the boot via grub it can talk to ironic API just fine. So somehow there are no extra kernel params being set for cleaning.

I tracked this down to a regression in ironic caused by:
https://review.opendev.org/q/I25c28df048c706f0c5b013b4d252f09d5a7e57bd

The BMO sets the deploy_interface to "ramdisk" whenever the image format is "live-iso". After that patch, ironic returns "ramdisk" from get_boot_option(). And when that happens, ironic ignores the kernel arguments in the config file and substitutes hard-coded ones consistent with what we are seeing ("root=/dev/ram0 text"):
https://opendev.org/openstack/ironic/src/branch/master/ironic/drivers/modules/image_utils.py#L466-L469

So this issue affects all deployments where the image format is "live-iso", which in practice I think means ZTP.

The patch in Ironic was backported to stable/wallaby and bugfix/18.0 branches, so some previous releases might be affected.

The live ISO workflow does not use cleaning (or IPA at all), and this bug is not about the live ISO (or so I was told). I will double-check how cleaning works with the ramdisk deploy.

The regression is hopefully fixed by https://review.opendev.org/c/openstack/ironic/+/802437, however I need to understand what you're trying to do. The live ISO workflow is reserved for assisted installer, which doesn't use cleaning or inspection. If you do not use the live ISO workflow, I'll still need the ironic-conductor logs for investigation.

It may be that there are no customer scenarios affected. Ian found the issue by hand-testing with a live-iso image (which is a thing that ought to work upstream at least) for convenience, on the (evidently mistaken) assumption that it would work the same for those purposes.

Okay, I will fix it, but with a lower priority.

The fixed package is available in 4.9.

verified on 4.9.0-0.nightly-2021-09-29-172320

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:3759

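
For anyone reproducing this, the kernel command line quoted in the comments can be checked mechanically for the missing parameter. The following is only a rough sketch: the helper name `parse_kernel_cmdline` is made up for illustration, the agent token and the hand-added console options are left out of the sample string, and nothing here is part of Ironic or IPA.

```python
from typing import Dict, Optional


def parse_kernel_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into key/value pairs.

    Bare flags such as "text" map to None. This is a hypothetical helper
    for illustration, not code taken from Ironic or IPA.
    """
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        if token == "--":
            continue  # separator between kernel and init arguments
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Sample based on the boot log in this report (agent token omitted).
CMDLINE = (
    "BOOT_IMAGE=/vmlinuz root=/dev/ram0 text "
    "ipa-api-url=https://192.168.111.21:6385 "
    "ipa-debug=1 boot_method=vmedia"
)

params = parse_kernel_cmdline(CMDLINE)

# Per the report, ipa-insecure=1 had to be added by hand in grub before
# the agent could talk to the ironic API.
tls_verification_disabled = params.get("ipa-insecure") == "1"
print("ipa-insecure set:", tls_verification_disabled)
```

Running the same check against `/proc/cmdline` inside the ramdisk should show whether the extra kernel parameters from ironic.conf were applied to the cleaning boot.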
Created attachment 1804957
Screenshot of SSL error.

Description of problem:
We have been working on MPINSTALL-1, the ability to perform redfish+virtualmedia deployments outside of the provisioning network. We've successfully provisioned hosts both on the provisioning network and outside of it. However when performing deprovisioning we are seeing an SSL error. See attachment for a screenshot of the error.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Set virtualMediaViaExternalNetwork = true in provisioning CR
2. Provision a new baremetal host.
3. Delete the new baremetal host after it has successfully started.

Actual results:
Watch deprovisioning stall forever. Check host for SSL error during cleaning phase.

Expected results:
Deprovisioning works.

Additional info:
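
The comments on this bug attribute the SSL error to a certificate issued for the provisioning IP while the agent calls back on the external IP. A minimal sketch of that mismatch follows. The function `cert_covers_ip` is a hypothetical helper and `172.22.0.3` is an assumed provisioning address chosen for illustration; only `192.168.111.21` comes from the boot log in this report.

```python
import ipaddress
from typing import Iterable


def cert_covers_ip(san_ips: Iterable[str], target_ip: str) -> bool:
    """Return True if target_ip matches one of the certificate's IP SANs.

    Hypothetical helper: a real TLS client does this check as part of
    certificate verification.
    """
    target = ipaddress.ip_address(target_ip)
    return any(ipaddress.ip_address(san) == target for san in san_ips)


# Assumption: the certificate lists only the provisioning IP.
CERT_SAN_IPS = ["172.22.0.3"]

print(cert_covers_ip(CERT_SAN_IPS, "172.22.0.3"))      # provisioning IP
print(cert_covers_ip(CERT_SAN_IPS, "192.168.111.21"))  # external callback IP
```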