Description of problem:
To ensure that an ISO is available to the BMC (both reachable in the first place and reachable throughout the provisioning process), Ironic downloads each image that it is going to attach via virtual media and keeps it until the machine is undeployed. With usage patterns that need to start many virtual ramdisks, Ironic can fill the ephemeral storage it uses, which causes the ironic pod to be evicted by OCP resource usage handling.

Version-Release number of selected component (if applicable):
4.8

How reproducible:

Steps to Reproduce:
1. Prepare a CD-ROM-sized ISO and make it available via HTTP
2. Create a large pool of bare metal systems and the BMH for them
3. Set the ISO from step (1) in the image section of every BMH spec for live-ISO booting

Actual results:
Ironic batches the attachment of the ISOs to the nodes without any deduplication and eventually runs out of ephemeral storage.

Expected results:
All the systems boot the ISO from step (1)

Additional info:
If there were a way to disable caching, the ISO URL provider would be responsible for keeping the ISO available and reachable during the lifetime of the virtual ramdisk OS.
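To make the storage pressure concrete, here is a rough back-of-the-envelope sketch. It assumes one cached copy of the ISO per node (no deduplication, as described above). The ISO size, node count, and storage limit are made-up illustrative numbers, not values measured on a real cluster, and the function names are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def cached_iso_bytes(node_count: int, iso_size_bytes: int,
                     deduplicated: bool = False) -> int:
    """Estimate bytes of ephemeral storage held by cached ISOs.

    Without deduplication every node gets its own copy; with
    deduplication a single copy is shared by all nodes.
    """
    if node_count < 0 or iso_size_bytes < 0:
        raise ValueError("node_count and iso_size_bytes must be non-negative")
    if node_count == 0:
        return 0
    copies = 1 if deduplicated else node_count
    return copies * iso_size_bytes


def exceeds_limit(used_bytes: int, limit_bytes: int) -> bool:
    """Return True when the estimated usage is above the storage limit."""
    return used_bytes > limit_bytes


# Illustrative numbers only (assumptions, not measurements).
ISO_SIZE = 700 * MIB   # a CD-ROM-sized ISO
NODES = 100            # a "large pool" of bare metal hosts
LIMIT = 50 * GIB       # hypothetical ephemeral storage limit for the pod

per_node = cached_iso_bytes(NODES, ISO_SIZE)
shared = cached_iso_bytes(NODES, ISO_SIZE, deduplicated=True)
print(f"per-node copies: {per_node / GIB:.1f} GiB, "
      f"over limit: {exceeds_limit(per_node, LIMIT)}")
print(f"single shared copy: {shared / GIB:.1f} GiB, "
      f"over limit: {exceeds_limit(shared, LIMIT)}")
```

Under these assumed numbers the per-node copies add up to more than the limit, while a single shared copy (or no local copy at all) stays far below it.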
I would say it also affects: 1. Consistency? 2. Load on the cluster, which is doubled because Ironic has to fetch the ISO from the AI operator and then serve it.
Dev notes: we already have image_download_source for the direct deploy, but it is not respected in the ramdisk deploy (and the default is different from what we want). The current behavior corresponds to image_download_source=local; we need an option image_download_source=http (with semantics similar to the direct deploy). Then we need to update BMO to use the new option when the live ISO workflow is requested.
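The difference between the two modes could be sketched roughly as follows. This is not Ironic's code: IsoCache, boot_iso_url, and the conductor.example URL are hypothetical names invented for illustration, and the "download" is only recorded rather than performed. The sketch just shows that with "local" the conductor keeps and serves its own copy, while with "http" the original URL is handed to the BMC untouched.

```python
from dataclasses import dataclass, field


@dataclass
class IsoCache:
    """Hypothetical stand-in for the conductor's local ISO storage."""
    served_base: str = "http://conductor.example/cache"
    stored: dict = field(default_factory=dict)

    def fetch_and_store(self, node_id: str, image_url: str) -> str:
        # A real implementation would download the ISO here; this sketch
        # only records that a per-node copy would now occupy storage.
        self.stored[node_id] = image_url
        return f"{self.served_base}/{node_id}.iso"


def boot_iso_url(node_id: str, image_url: str,
                 download_source: str, cache: IsoCache) -> str:
    """Pick the URL to attach as virtual media for one node."""
    if download_source == "local":
        # Current behavior: cache the ISO and serve the cached copy.
        return cache.fetch_and_store(node_id, image_url)
    if download_source == "http":
        # Requested behavior: pass the original URL through, no caching.
        return image_url
    raise ValueError(f"unsupported download source: {download_source!r}")


cache = IsoCache()
url = "http://images.example/live.iso"  # hypothetical ISO location
print(boot_iso_url("node-0", url, "http", cache), len(cache.stored))
print(boot_iso_url("node-0", url, "local", cache), len(cache.stored))
```

With the pass-through mode nothing is stored locally, so keeping the ISO reachable for the lifetime of the ramdisk becomes the URL provider's responsibility.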
I can run through testing today...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438