Bug 1953979 - Ironic caching virtualmedia images results in disk space limitations
Summary: Ironic caching virtualmedia images results in disk space limitations
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Riccardo Pittau
QA Contact: Lubov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-27 10:53 UTC by Antoni Segura Puimedon
Modified: 2021-07-27 23:04 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 23:04:05 UTC
Target Upstream Version:
Embargoed:
asegurap: needinfo-



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ironic-image pull 164 | 0 | None | open | Bug 1953979: Add parameter to set boot iso source | 2021-05-06 12:32:04 UTC
Github | openshift ironic-image pull 169 | 0 | None | open | Bug 1953979: Disable caching live boot iso by default | 2021-05-11 09:53:33 UTC
OpenStack gerrit | 788734 | 0 | None | NEW | Provide an option to not cache bootable iso ramdisks | 2021-04-29 13:25:42 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 23:04:22 UTC

Description Antoni Segura Puimedon 2021-04-27 10:53:27 UTC
Description of problem:
To ensure that the ISO is available to the BMC (both that it is reachable and that it stays reachable throughout the provisioning process), Ironic downloads the images that it attaches via virtual media and keeps them until the machine is un-deployed.

With usage patterns that need to start many virtual ram-disks, Ironic can fill the ephemeral storage it uses, which causes the Ironic pod to be evicted by OCP resource usage handling.

Version-Release number of selected component (if applicable):
4.8

How reproducible:


Steps to Reproduce:
1. Prepare a CD-ROM-sized ISO and make it available via HTTP.
2. Create a large pool of bare metal systems and the BMHs for them.
3. Set the ISO from step (1) as the image in the spec of every BMH, for live-ISO booting.
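
For illustration only, steps 2 and 3 could be scripted roughly as follows. This is a minimal sketch, not part of the original reproduction: the helper name, host names, namespace, BMC addresses, MAC addresses, secret names and ISO URL are all placeholders, and it assumes the BareMetalHost image section takes a `url` plus `format: live-iso`.

```python
import json


def live_iso_bmh(name, bmc_address, boot_mac, iso_url,
                 namespace="openshift-machine-api"):
    """Build a BareMetalHost manifest that boots iso_url as a live ISO.

    Hypothetical helper: every value passed in here is a placeholder.
    """
    return {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "BareMetalHost",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "online": True,
            "bootMACAddress": boot_mac,
            "bmc": {
                "address": bmc_address,
                # Assumed secret naming scheme, for illustration only.
                "credentialsName": f"{name}-bmc-secret",
            },
            # The same ISO URL is set on every host (step 3).
            "image": {"url": iso_url, "format": "live-iso"},
        },
    }


ISO_URL = "http://images.example/live.iso"  # placeholder for the ISO from step 1

hosts = [
    live_iso_bmh(
        name=f"worker-{i}",
        bmc_address=f"redfish-virtualmedia://192.0.2.{10 + i}/redfish/v1/Systems/1",
        boot_mac=f"52:54:00:00:00:{i:02x}",
        iso_url=ISO_URL,
    )
    for i in range(3)  # a real reproduction would use a much larger pool
]

# JSON manifests can be fed to the usual apply tooling.
print(json.dumps(hosts[0], indent=2))
```

Every generated host points at the same ISO URL, which is the situation where the lack of deduplication matters.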

Actual results:
Ironic batches the attachment of the ISOs to the nodes without any deduplication and eventually runs out of ephemeral storage.
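
A rough back-of-the-envelope sketch of why this happens, assuming one cached copy per node. The ISO size and the ephemeral storage figure below are made-up numbers for illustration, not measurements from this cluster.

```python
def cached_bytes(num_nodes, iso_bytes, deduplicated=False):
    """Space used by cached ISOs: one copy per node unless deduplicated."""
    copies = 1 if (deduplicated and num_nodes > 0) else num_nodes
    return copies * iso_bytes


def nodes_until_full(ephemeral_bytes, iso_bytes):
    """How many per-node copies fit before ephemeral storage is exhausted."""
    return ephemeral_bytes // iso_bytes


MIB = 1024 ** 2
GIB = 1024 ** 3

ISO_SIZE = 700 * MIB   # assumed CD-ROM-sized ISO
EPHEMERAL = 10 * GIB   # assumed ephemeral storage available to the pod

print("copies that fit:", nodes_until_full(EPHEMERAL, ISO_SIZE))
print("GiB needed for 100 nodes:", cached_bytes(100, ISO_SIZE) / GIB)
print("GiB needed if deduplicated:",
      cached_bytes(100, ISO_SIZE, deduplicated=True) / GIB)
```

Under these assumed numbers only a handful of nodes fit before the limit is hit, while a single shared copy would stay well below it.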


Expected results:
All the systems boot the ISO from step (1)

Additional info:
If there were a way to disable caching, the ISO URL provider would be responsible for keeping the ISO available and reachable during the lifetime of the virtual ram-disk OS.

Comment 1 Rom Freiman 2021-04-27 10:56:38 UTC
I would say it also affects:
1. consistency?
2. It doubles the load on the cluster, because Ironic has to fetch the ISO from the AI operator and then serve it.

Comment 2 Dmitry Tantsur 2021-04-27 16:39:56 UTC
Dev notes: we already have image_download_source for the direct deploy, but the ramdisk deploy does not respect it (and the default is different from what we want). The current behavior corresponds to image_download_source=local; we need an option image_download_source=http (with semantics similar to the direct deploy).

Then we need to update BMO to use the new option when the live ISO workflow is requested.
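
To make the difference between the two values concrete, here is a small sketch of the semantics described above. It is not Ironic code: `url_for_bmc` and `fake_cache` are hypothetical helpers and the URLs are placeholders. Only the option name and the `local`/`http` values come from the comment.

```python
def url_for_bmc(iso_url, image_download_source, cache_locally):
    """Return the URL the BMC would be asked to attach (hypothetical helper)."""
    if image_download_source == "local":
        # Current behaviour: download the ISO, keep a copy, serve the copy.
        return cache_locally(iso_url)
    if image_download_source == "http":
        # Requested behaviour: no copy is kept; the original server must
        # stay reachable for as long as the ramdisk is in use.
        return iso_url
    raise ValueError(
        f"unsupported image_download_source: {image_download_source!r}")


cached = []  # stands in for the ephemeral storage holding cached ISOs


def fake_cache(iso_url):
    """Pretend to download iso_url and return a placeholder local URL."""
    cached.append(iso_url)
    return f"http://ironic.example/cache/{len(cached)}.iso"


ISO = "http://images.example/live.iso"  # placeholder URL

local_urls = [url_for_bmc(ISO, "local", fake_cache) for _ in range(3)]
http_urls = [url_for_bmc(ISO, "http", fake_cache) for _ in range(3)]

print("cached copies:", len(cached))
print("local:", local_urls[0])
print("http:", http_urls[0])
```

In this sketch only the `local` calls add entries to the cache; with `http` the BMC is pointed at the original URL and nothing is stored.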

Comment 10 Chad Crum 2021-05-13 12:21:38 UTC
I can run through testing today...

Comment 14 errata-xmlrpc 2021-07-27 23:04:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

