Bug 2211691 - [OSP17.1] Ironic fails to unprovision baremetal nodes using boot from volume due to Cinder CVE-2023-2088 fix
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ga
Target Release: 17.1
Assignee: Julia Kreger
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-01 14:06 UTC by Julia Kreger
Modified: 2023-08-16 01:15 UTC (History)

Fixed In Version: openstack-ironic-17.1.1-1.20230128052013.el9ost
Doc Type: Bug Fix
Doc Text:
Before this update, the Bare Metal Provisioning service (ironic) was unable to detach a Block Storage service (cinder) volume from a physical bare metal node. This volume detachment is required to tear down physical machines that have an instance deployed on them by using the boot from volume functionality. With this update, the Bare Metal Provisioning service (ironic) can detach a volume from a physical bare metal node to automatically tear down these physical machines.
Clone Of:
Environment:
Last Closed: 2023-08-16 01:15:29 UTC
Target Upstream Version:
Embargoed:
ifrangs: needinfo-


Links
System ID Private Priority Status Summary Last Updated
Launchpad.net 2004555 0 None None None 2023-06-01 14:09:54 UTC
Launchpad.net 2019892 0 None None None 2023-06-01 14:06:36 UTC
OpenStack gerrit 883581 0 None MERGED Fix Cinder Integration fallout from CVE-2023-2088 2023-06-28 21:12:31 UTC
Red Hat Issue Tracker OSP-25548 0 None None None 2023-06-01 14:07:28 UTC
Red Hat Product Errata RHEA-2023:4577 0 None None None 2023-08-16 01:15:52 UTC

Description Julia Kreger 2023-06-01 14:06:37 UTC
Description of problem:

The Cinder fix for CVE-2023-2088 breaks Ironic's Boot from Volume support by changing the underlying credential requirements for performing a "detach" operation on a volume outside of Nova's direct handling.

Ironic independently triggers a detach on tear down, regardless of whether the end user requested a Boot from Volume node from Cinder or from Ironic directly, in order to ensure that the storage system does not keep an attachment to a physical machine after the instance is removed. This prevents the potential data loss and security issues that could arise if Ironic did not halt its tear down process.

NOTE: This fix does not address a vulnerability, nor is this issue itself a vulnerability. It responds to the change in Cinder's behavior made to remedy the vulnerability in Cinder. While the fundamental issue *is* similar, in Ironic's case the use case and the attachments are to whole physical machines. Ironic failing as a result of the CVE-2023-2088 fixes is actually a good thing: it proves that the base Cinder fix works as expected, and that Ironic's base behavior was already properly guarding against such an issue in this case.


How reproducible:


Steps to Reproduce:

0) Deploy a cloud with cinder in the enabled_storage_interfaces configuration list for ironic.

1) Set the desired baremetal node to utilize the "cinder" storage_interface on the Ironic node. This may or may not be the install default based upon settings with which the cloud was deployed.

baremetal node set --storage-interface cinder $NODE_UUID

2) Set the baremetal node to advertise itself as having an iSCSI boot capability.

baremetal node set --property capabilities=iscsi_boot:True $NODE_UUID

3) Set an iSCSI initiator IQN for the node. This is the bare minimum, and applies to iSCSI only. Users utilizing Fibre Channel storage would use different parameters.

baremetal volume connector create \
         --node $NODE_UUID --type iqn --connector-id iqn.2017-08.org.openstack.$NODE_UUID

4) Request a boot from cinder volume instance matching the baremetal node from the Compute service *or* directly ask ironic to boot the volume (https://docs.openstack.org/ironic/latest/admin/boot-from-volume.html#use-without-cinder) and then trigger a baremetal node deployment.
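
The per-node commands from steps 1 to 3 can be grouped into one helper, shown here as a minimal sketch rather than a supported tool. The function names and the BAREMETAL override variable are hypothetical and exist only for illustration. The commands themselves are the ones listed in the steps above.

```shell
# Hypothetical helper: builds the initiator IQN used in step 3 for a node.
connector_iqn() {
    printf 'iqn.2017-08.org.openstack.%s' "$1"
}

# Hypothetical helper: runs steps 1-3 for one node and stops at the first failure.
# BAREMETAL is an assumed override for the CLI name; BAREMETAL=echo gives a dry run.
prepare_bfv_node() {
    node_uuid="$1"
    cli="${BAREMETAL:-baremetal}"
    "$cli" node set --storage-interface cinder "$node_uuid" &&
    "$cli" node set --property capabilities=iscsi_boot:True "$node_uuid" &&
    "$cli" volume connector create \
        --node "$node_uuid" --type iqn \
        --connector-id "$(connector_iqn "$node_uuid")"
}
```

Running `BAREMETAL=echo; prepare_bfv_node "$NODE_UUID"` prints the arguments of each command instead of executing it, which allows a review before touching a real node.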

Actual results:

The instance boots, but cannot be torn down automatically. When removal of the node is requested, the baremetal node tear down process fails due to the Cinder behavior change introduced for CVE-2023-2088, leaving the node in an error state. This potentially consumes and orphans the baremetal node, keeping it from being used until manual intervention is taken.

Manual recovery likely means setting the node to maintenance and removing/re-adding it to Ironic, which is a highly undesirable action for a bare metal operator.
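
Before attempting any manual recovery, an operator may want to confirm the state of the affected node. The following is a minimal read-only sketch, assuming the standard baremetal CLI subcommands `node show --fields`, `volume connector list` and `volume target list`. The function name and the BAREMETAL override variable are hypothetical. It performs no recovery and changes nothing on the node.

```shell
# Hypothetical helper: read-only inspection of a node stuck after a failed tear down.
# BAREMETAL is an assumed override for the CLI name; BAREMETAL=echo gives a dry run.
inspect_stuck_node() {
    node_uuid="$1"
    cli="${BAREMETAL:-baremetal}"
    # Provision state, last recorded error, and maintenance flag of the node.
    "$cli" node show "$node_uuid" --fields provision_state last_error maintenance
    # Volume connectors and targets still recorded against the node.
    "$cli" volume connector list --node "$node_uuid"
    "$cli" volume target list --node "$node_uuid"
}
```

Called as `inspect_stuck_node "$NODE_UUID"`, this only gathers information that can help decide whether manual intervention is needed.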

Expected results:

The instance and baremetal node are appropriately torn down and volume attachments are released.

Additional info:

Comment 24 errata-xmlrpc 2023-08-16 01:15:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577

