Description of problem:

Version-Release number of selected component (if applicable):
RHOSP 16.1

How reproducible:
Every time

Steps to Reproduce:
1. Deploy OSP 16.1.x with a Ceph backend
2. Enable Barbican with this custom policy: https://access.redhat.com/solutions/6479601
3. Add the workaround below:
   ComputeExtraConfig:
     nova::config::nova_config:
       workarounds/disable_native_luksv1:
         value: true
       workarounds/rbd_volume_local_attach:
         value: true
     nova::compute::keymgr_backend: 'barbican'
4. Create a VM with user-one
5. Create an encrypted volume with user-one
6. Attach the volume to the VM with user-one
7. At this point user-two is able to detach without issue
8. Re-attach with user-one if detached in step 7
9. Live migrate the VM to the other compute node with project-admin
10. Attempt to detach with user-two

Actual results:
Live migration failed with this error message:

/var/log/containers/nova/nova-compute.log:2022-04-28 13:18:13.360 7 ERROR oslo_messaging.rpc.server [req-c36f7540-31d3-484f-8f76-d99ef9decf31 9173bf85e45f4bfb9281f1a2a8e8f31d dfad97e476f74774af6ebc92bb7730a1 - default default] Exception during message handling: os_brick.exception.VolumeEncryptionNotSupported: Volume encryption is not supported for rbd volume 5371c26c-c3fe-433f-a77e-63a1d3e1f427.

2022-05-02 21:33:10.156 7 ERROR nova.virt.block_device [req-8bc4320d-87a1-488c-81a7-93cccfd4fc5a 9173bf85e45f4bfb9281f1a2a8e8f31d dfad97e476f74774af6ebc92bb7730a1 - default default] [instance: 4f6939c9-7ced-4aea-9db8-5ef9b245e17a] Failed to detach volume 99e7a4ee-3d90-4968-987a-7ac644360e1d from /dev/vdb: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Command: rbd unmap /dev/rbd1 --id openstack --mon_host 172.17.3.20:6789 --mon_host 172.17.3.26:6789 --mon_host 172.17.3.84:6789
Exit code: 16
Stdout: ''
Stderr: 'rbd: sysfs write failed\nrbd: unmap failed: (16) Device or resource busy\n'

Expected results:
Live migration succeeds.

Additional info:
Reproduced it in upstream CI [1] without Barbican. Figuring out what's going on should be quicker from here, since it's much easier to add logging to a Devstack-based deployment/job than to a multinode containerized environment.

[1] https://review.opendev.org/c/openstack/nova/+/843146
I have a PoC of a fix at [1]. There are still things left to figure out, such as making sure we're not accidentally breaking anything else. To set expectations correctly: we're still far from an actual releasable fix, but I did want to at least report progress.

[1] https://review.opendev.org/c/openstack/nova/+/843554
I tried to reproduce this on master without the workaround, using the cryptsetup encryptor and iSCSI volumes, which should have triggered the same issue. It did not, and I tracked that down to a fix that is already on master in the form of [1]. I've started the backport with stable/wallaby, with the intention of eventually taking it all the way back to stable/train and OSP 16.x.

[1] https://review.opendev.org/c/openstack/nova/+/804230
@alifshit when will this fix be available on the 16.2.z line?
(In reply to Eoghan Glynn from comment #10)
> @alifshit when will this fix be available on the 16.2.z line?

Targeted at 16.2.4: https://bugzilla.redhat.com/show_bug.cgi?id=2096418
Small correction:

"With this update, you can correctly persist any block device mapping updates done by the libvirt driver on the destination host."

should read:

"With this update, any block device mapping updates done by the libvirt driver on the destination host are correctly persisted by Nova."

The user is not involved in the process at all; this is all internal logic.
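To illustrate why persisting those updates matters, here is a minimal toy sketch. It is not Nova code: `FakeBdmStore`, `pre_live_migration`, `device_used_for_detach`, the `device_path` key, and the `/dev/rbd0` and `/dev/rbd1` values are all hypothetical stand-ins, and the sketch assumes the update in question is something like a host-local device path. It only shows how an in-memory update that is never written back can leave a later operation working from stale data.

```python
import copy


class FakeBdmStore:
    """Hypothetical stand-in for the persisted block device mapping records."""

    def __init__(self):
        self._rows = {}

    def put(self, volume_id, connection_info):
        # Store a copy so later in-memory edits do not leak into the "database".
        self._rows[volume_id] = copy.deepcopy(connection_info)

    def get(self, volume_id):
        # Return a copy, mimicking an object loaded fresh from the database.
        return copy.deepcopy(self._rows[volume_id])


def pre_live_migration(store, volume_id, dest_device_path, persist):
    """Simulate the destination host updating a mapping before migration.

    `persist` toggles whether the in-memory update is written back.
    """
    info = store.get(volume_id)
    info["device_path"] = dest_device_path  # update made on the destination
    if persist:
        store.put(volume_id, info)
    return info


def device_used_for_detach(store, volume_id):
    """A later detach reloads the mapping and acts on whatever was persisted."""
    return store.get(volume_id)["device_path"]


store = FakeBdmStore()
store.put("vol-1", {"device_path": "/dev/rbd0"})  # state recorded on the source

# Update is made but never saved: the later detach sees the source's value.
pre_live_migration(store, "vol-1", "/dev/rbd1", persist=False)
stale = device_used_for_detach(store, "vol-1")

# Update is saved: the later detach sees the destination's value.
pre_live_migration(store, "vol-1", "/dev/rbd1", persist=True)
fresh = device_used_for_detach(store, "vol-1")

print(stale, fresh)
```

In the toy model, the only difference between the two runs is the write-back, which is the step the corrected doc text attributes to Nova rather than to the user.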
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.9 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8795